|
I have run two k-means cluster analyses on the same set of numbers using
two different SPSS spreadsheets, and I get different cluster groupings. I have specified the same number of clusters, and the values are identical in both spreadsheets. Does anyone have any insight that could help me understand this outcome?

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command.
To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Jessica,
K-means proceeds by iteration from an arbitrary initial set of cluster centers. Different initial cluster centers might lead to different final solutions, especially for "borderline" or atypical cases that do not clearly belong in one cluster or another.

One possibility in your case is that different initial cluster centers were used on each occasion. If you did not specify a matrix of initial cluster centers, SPSS chose initial centers itself in order to start the iteration process. These initial centers might be the first k cases, or some other combination of values of the variables for the k clusters. Perhaps you inadvertently altered the way initial cluster centers are selected from one run to the other.

Remember also that clustering is a heuristic procedure admitting many possible solutions. Depending on the starting point of the iteration, some borderline cases might end up in one cluster or another. Some other parameters of the procedure might have been altered too.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jessica Ashley
Sent: 18 March 2008 21:55
To: [hidden email]
Subject: Cluster Analysis Question |
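To make the point about initial centers concrete, here is a minimal Python sketch. It is not SPSS's algorithm: the `kmeans` function, the four toy cases, and the two sets of starting centers are all assumptions made up for illustration. It runs a plain k-means twice on identical data and changes only the initial centers.

```python
def kmeans(points, centers, max_iter=100):
    """Plain k-means (illustrative only, not the SPSS implementation).

    Returns the final partition as a set of frozensets of points.
    """
    centers = list(centers)
    for _ in range(max_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Recompute each center as the mean of the points assigned to it.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # leave an empty cluster's center unchanged
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return {frozenset(p for p, lab in zip(points, labels) if lab == j)
            for j in range(len(centers))}


# Hypothetical data: four cases at the corners of a 4 x 1 rectangle.
cases = [(0, 0), (4, 0), (0, 1), (4, 1)]

# Same cases, same k, different starting centers (assumed values).
run_a = kmeans(cases, centers=[(0, 0), (4, 0)])
run_b = kmeans(cases, centers=[(0, 0), (0, 1)])

print(run_a == run_b)  # False: the two runs settle on different groupings
```

With the first pair of starting centers the cases split left/right; with the second they split top/bottom. Both results are stable under further iteration, so neither run is "wrong"; they are simply different local solutions.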
|
K-means cluster analysis and TwoStep cluster analysis both depend
upon the sequence of observations in the dataset. In each method, different data orders usually yield different outcomes. Randomization of cases is recommended.

The smaller the dataset, the greater the problem of data order, even with randomization. Small datasets can yield strikingly different cluster patterns for different data sequences. A recommended strategy is to couple randomization with multiple runs as a form of stability analysis, to establish that the clusters are stable across different random orderings.

Dave
____________________________________
G. David Garson
Editor, Social Science Computer Review
NCSU Box 8102
Department of Political Science and Public Administration
North Carolina State University
Raleigh, NC 27695-8102

For UPS and Express Mail:
G. David Garson
212 Caldwell Hall / PSPA
Hillsborough Street
North Carolina State University
Raleigh, NC 27695-8102

Tel. 919-515-3067
Fax 919-515-7333
E-mail: [hidden email] |
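The stability check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `kmeans_first_k` seeds the centers with the first k cases (one possible rule, chosen here because it makes order effects visible, and not claimed to be what SPSS does), and `stability`, the toy cases, the number of runs, and the seed are all hypothetical.

```python
import random
from collections import Counter


def kmeans_first_k(cases, k, max_iter=100):
    """k-means seeded with the first k cases (an assumed seeding rule)."""
    centers = [tuple(float(v) for v in c) for c in cases[:k]]
    for _ in range(max_iter):
        # Assign each case to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in cases
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(cases, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # leave an empty cluster's center unchanged
        if new_centers == centers:
            break
        centers = new_centers
    # Order-independent description of the grouping, usable as a dictionary key.
    return frozenset(frozenset(p for p, lab in zip(cases, labels) if lab == j)
                     for j in range(k))


def stability(cases, k, n_runs=20, seed=0):
    """Cluster n_runs random orderings of the cases and tally the distinct groupings."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(n_runs):
        shuffled = cases[:]
        rng.shuffle(shuffled)  # randomize the case order for this run
        tally[kmeans_first_k(shuffled, k)] += 1
    return tally


# Hypothetical small dataset: four cases at the corners of a 4 x 1 rectangle.
cases = [(0, 0), (4, 0), (0, 1), (4, 1)]
tally = stability(cases, k=2)
for partition, count in tally.most_common():
    print(count, sorted(sorted(group) for group in partition))
```

If a single grouping accounts for nearly all runs, the clusters can be considered stable across orderings. If the tally is spread over several groupings, as can happen with a dataset this small, the solution depends heavily on case order and should be interpreted with caution.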
