KMEANS Cluster - Not Enough Cases

Brock-15
Hi All,

I have approx. 4600 cases and I am getting a warning when I try to cluster my
dataset into 5 groups.  The weird thing is that the same procedure worked for
approx. the same number of cases, with the same data values, for two other years.

In short, I am trying to cluster cases based on 2 years of data.  One process
worked, the exact same process for two other years did not work, with the
error that there are too few cases.

I am stumped.  Any ideas?

- Brock

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: KMEANS Cluster - Not Enough Cases

Poling, Taylor Leigh
How many variables are you clustering on? Could there be a missingness pattern such that not enough cases have complete data on all the variables involved? Do you have any filters on?
 
Taylor
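
One way to check the missingness question above is to count, for each data set, how many cases are complete on every clustering variable and how many values are missing per variable. The following is only a minimal sketch in plain Python, not SPSS syntax; the variable names (v1, v2, v3) and the toy cases are hypothetical, and None stands in for a missing value.

```python
def count_complete_cases(cases, variables):
    """Return how many cases have a non-missing value on every variable."""
    return sum(
        all(case.get(var) is not None for var in variables)
        for case in cases
    )


def missing_counts(cases, variables):
    """Return {variable: number of cases missing that variable}."""
    return {
        var: sum(case.get(var) is None for case in cases)
        for var in variables
    }


# Hypothetical toy data: one dict per case, None marks a missing value.
cases = [
    {"v1": 1.0, "v2": 2.0, "v3": None},
    {"v1": 0.5, "v2": None, "v3": 3.0},
    {"v1": 2.0, "v2": 1.5, "v3": 0.2},
]
variables = ["v1", "v2", "v3"]

n_complete = count_complete_cases(cases, variables)
per_var = missing_counts(cases, variables)
print(n_complete, per_var)
```

Running the same counts on the data set that clusters and on the one that fails would show whether the two really have comparable numbers of usable cases.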


From: SPSSX(r) Discussion on behalf of Brock
Sent: Thu 7/16/2009 5:09 PM
To: [hidden email]
Subject: KMEANS Cluster - Not Enough Cases

Re: KMEANS Cluster - Not Enough Cases

Brock-15
In reply to this post by Brock-15
Thanks for getting back to me.  Sorry for the delayed response (I am on
vacation).

I am clustering on 14 variables.  I knew that missing data could be an
issue, so I chose to exclude cases pairwise, figuring that would get
around it, and it did for my first cluster exercise.  Missing data are a
problem because some cases shouldn't have data on some variables, but I don't
want to exclude them.
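
To make the idea behind pairwise exclusion concrete: the distance from a case to a cluster center is computed from whichever variables the case actually has, so a case is lost only if it has no usable values at all. The sketch below is plain Python under that assumption; it illustrates the general idea and is not a reproduction of what SPSS does internally. The function names and the toy centers are hypothetical.

```python
def pairwise_sq_distance(case, center):
    """Squared Euclidean distance using only the variables present in the case.

    Returns None when the case has no usable values at all.
    """
    used = [(x, c) for x, c in zip(case, center) if x is not None]
    if not used:
        return None
    return sum((x - c) ** 2 for x, c in used)


def assign(case, centers):
    """Index of the nearest center, or None if the case is entirely missing."""
    dists = [pairwise_sq_distance(case, c) for c in centers]
    if dists[0] is None:  # an all-missing case has no distance to any center
        return None
    return min(range(len(centers)), key=lambda i: dists[i])


# Hypothetical centers on three variables.
centers = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]

partial = assign((9.0, None, 11.0), centers)   # second variable is missing
empty = assign((None, None, None), centers)    # nothing to compute a distance from
print(partial, empty)
```

Under this scheme a partially missing case still gets a cluster, which is why pairwise exclusion keeps more cases than dropping every incomplete case.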

Obviously it is tough to show you without supplying my dataset (which I
cannot do), but what I find odd is that my process was successful on
what is theoretically the same sample.

In short, I am creating cluster centers based on data for the previous two
years and assigning new cases to the clusters.  I want to update my cluster
centers with data from the previous two years, so for every year of new
data, I am creating new cluster centers on the past two years.  I was
successful for years (Y2 and Y3), but get the error message when I try to
create new cluster centers for (Y1 and Y2).
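
The rolling scheme described above can be sketched as follows: pair each year with the next one, pool the cases in each pair, and compare the pooled sizes before clustering. This is a minimal plain-Python illustration; the year labels 1 to 3 and the case lists are hypothetical placeholders, not my data.

```python
def two_year_windows(years):
    """Pair each year with the one after it: [1, 2, 3] -> [(1, 2), (2, 3)]."""
    ordered = sorted(years)
    return list(zip(ordered, ordered[1:]))


def cases_per_window(cases_by_year):
    """Return {(year_a, year_b): pooled list of cases} for each window."""
    return {
        (a, b): cases_by_year[a] + cases_by_year[b]
        for a, b in two_year_windows(cases_by_year)
    }


# Hypothetical data: year -> list of case identifiers.
cases_by_year = {
    1: ["a"],
    2: ["b", "c"],
    3: ["d", "e", "f"],
}

windows = two_year_windows(cases_by_year)
sizes = {w: len(pooled) for w, pooled in cases_per_window(cases_by_year).items()}
print(windows, sizes)
```

Comparing the pooled size of the (Y1, Y2) window with that of (Y2, Y3) is a quick first check on whether the two runs really start from similar samples.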

I find it hard to believe that the distributions of my data changed and that
missing data are a problem for one set of data and not the other, especially
when the data-generating process is the same. However, if this error message
is the result of missing data only, then I will have to figure out something else.

Many thanks,

Brock
