|
Hi All,
I have approx. 4600 cases and I am getting a warning when I try to cluster my dataset into 5 groups. The weird thing is that I ran a similar procedure on approximately the same number of cases, with the same data values, for two other years. In short, I am trying to cluster cases based on 2 years of data. The process worked for one pair of years, but the exact same process for two other years failed with an error saying that there are too few cases. I am stumped. Any ideas?
- Brock
|
How many variables are you clustering on? Could it be that there is a missingness pattern such that not enough cases have complete data for all the variables involved? Do you have any filters on?
Taylor
From: SPSSX(r) Discussion on behalf of Brock
Sent: Thu 7/16/2009 5:09 PM
To: [hidden email]
Subject: KMEANS Cluster - Not Enough Cases
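The missingness check suggested above can be sketched in a few lines. This is only an illustration in plain Python, not SPSS syntax: the function name `missingness_summary` and the toy data are hypothetical, and `None` stands in for a missing value. It shows how every variable can be mostly filled in while very few cases are complete on all variables at once.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing value (assumption)


def missingness_summary(rows: Sequence[Row]) -> dict:
    """Count complete cases and the number of valid values per variable."""
    n_vars = len(rows[0]) if rows else 0
    valid_per_variable = [0] * n_vars
    complete_cases = 0
    for row in rows:
        # A case is complete only if no variable is missing.
        if all(value is not None for value in row):
            complete_cases += 1
        for j, value in enumerate(row):
            if value is not None:
                valid_per_variable[j] += 1
    return {
        "n_cases": len(rows),
        "complete_cases": complete_cases,
        "valid_per_variable": valid_per_variable,
    }


# Hypothetical toy data: 4 cases, 3 variables.
toy = [
    [1.0, 2.0, None],
    [None, 1.5, 3.0],
    [2.0, None, 1.0],
    [0.5, 0.7, 0.9],
]
summary = missingness_summary(toy)
print(summary)
```

Running the same kind of count separately on each two-year window would show whether one window has far fewer usable cases than the other.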
|
Thanks for getting back to me. Sorry for the delayed response (I am on vacation). I am clustering on 14 variables. I knew that missing data could be an issue, so I chose to exclude cases pairwise, figuring that would get around it, and it did for my first clustering exercise. Missing data are a problem because some cases shouldn't have data for some variables, but I don't want to exclude them. Obviously it is tough to show you without supplying my dataset (which I cannot do), but what I find odd is that my process was successful on what is theoretically the same sample.

In short, I am creating cluster centers based on data from the previous two years and assigning new cases to the clusters. I want to update my cluster centers with data from the previous two years, so for every year of new data, I create new cluster centers from the past two years. I was successful for years Y2 and Y3, but I get the error message when I try to create new cluster centers for Y1 and Y2. I find it hard to believe that the distributions of my data changed and that missing data are a problem for one set of data and not the other, especially when the data-generating process is the same. However, if this error message is the result of missing data only, then I will have to figure out something else.

Many thanks,
Brock
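To make the idea of assigning new cases to existing cluster centers under pairwise handling concrete, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not SPSS's own implementation: the function `nearest_center`, the centers, and the cases are all hypothetical, and `None` marks a missing value. Each case is compared with the centers using only the variables it actually has.

```python
import math
from typing import List, Optional, Sequence


def nearest_center(
    case: Sequence[Optional[float]],
    centers: Sequence[Sequence[float]],
) -> Optional[int]:
    """Return the index of the closest center, using only observed variables.

    Returns None when the case has no observed variable at all,
    because no distance can be computed for it.
    """
    observed = [j for j, value in enumerate(case) if value is not None]
    if not observed:
        return None
    best_index, best_distance = None, math.inf
    for k, center in enumerate(centers):
        # Euclidean distance restricted to the observed variables.
        distance = math.sqrt(sum((case[j] - center[j]) ** 2 for j in observed))
        if distance < best_distance:
            best_index, best_distance = k, distance
    return best_index


# Hypothetical centers (2 clusters, 3 variables) and new cases.
centers = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
new_cases: List[List[Optional[float]]] = [
    [0.5, None, 1.0],    # two variables observed
    [None, 4.0, None],   # one variable observed
    [None, None, None],  # nothing observed, cannot be assigned
]
assignments = [nearest_center(case, centers) for case in new_cases]
print(assignments)
```

In this sketch a case with at least one observed variable can still be assigned, while a case that is missing on every variable cannot, so it is worth checking how many cases in the Y1 and Y2 window fall into the second group.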
