Cluster Analysis Question

Cluster Analysis Question

Jessica Ashley
I have run two k-means cluster analyses on the same set of numbers using
two different SPSS spreadsheets, and I get different cluster groupings.
I have specified the same number of clusters, and the values are
identical in both spreadsheets. Does anyone have any insight that could
help me understand this outcome?

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Cluster Analysis Question

Hector Maletta
Jessica,
K-means proceeds by iteration from an initial set of cluster centers.
Different initial cluster centers can lead to different final solutions,
especially for "borderline" or atypical cases that do not clearly belong
in one cluster or another. One possibility in your case is that different
initial cluster centers were used on each occasion. If you did not specify
a matrix of initial cluster centers, SPSS chose them for you in order to
start the iteration process. These initial centers might be the first k
cases, or some other combination of values of the variables for the k
clusters. Perhaps you inadvertently changed the way initial cluster
centers are selected between one run and the other.
Remember also that clustering is a heuristic procedure admitting many
possible solutions. Depending on the starting point of the iteration, some
borderline cases might end up in one cluster or another.
Some other parameters of the procedure might also have been altered.
Hector
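
The dependence on initial centers described above can be seen in a small
sketch. This is a minimal, generic k-means loop written in plain Python,
not the algorithm SPSS actually uses; the four data points, the function
names (sq_dist, kmeans, wss) and the two sets of starting centers are all
made up for illustration. The same four cases are clustered twice with
k = 2, and only the starting centers differ.

```python
# Minimal Lloyd-style k-means (illustrative only; not SPSS's implementation).

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centers, max_iter=100):
    """Iterate from the given initial centers; return (labels, centers)."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each case goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # no case changed cluster: converged
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

def wss(points, labels, centers):
    """Total within-cluster sum of squared distances."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))

# Hypothetical data: the corners of a 4-by-1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Run A starts from the two left-hand corners, run B from the two bottom corners.
labels_a, centers_a = kmeans(points, [(0.0, 0.0), (0.0, 1.0)])
labels_b, centers_b = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])
wss_a = wss(points, labels_a, centers_a)
wss_b = wss(points, labels_b, centers_b)

print("run A:", labels_a, "within-cluster SS =", wss_a)
print("run B:", labels_b, "within-cluster SS =", wss_b)
```

Both runs converge, but to different groupings: run A splits the cases
into a bottom pair and a top pair, run B into a left pair and a right
pair, and the within-cluster sum of squares differs accordingly. Identical
values and an identical number of clusters are therefore not enough to
guarantee identical output.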

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Jessica Ashley
Sent: 18 March 2008 21:55
To: [hidden email]
Subject: Cluster Analysis Question

I have run two k-means cluster analyses on the same set of numbers using
two different SPSS spreadsheets, and I get different cluster groupings.
I have specified the same number of clusters, and the values are
identical in both spreadsheets. Does anyone have any insight that could
help me understand this outcome?


Re: Cluster Analysis Question

G. David Garson
K-means cluster analysis and two-step cluster analysis both depend
upon the sequence of observations in the dataset. In each method,
different data orders usually yield different outcomes. Randomization
of cases is recommended. The smaller the dataset, the greater the
problem of data order even with randomization. Small datasets can
yield strikingly different cluster patterns for different data
sequences. A recommended strategy is to couple randomization with
multiple runs as a form of stability analysis to establish that the
clusters are stable across different random orderings.
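
A rough sketch of such a stability check, in plain Python, might look
like the following. It rests on stated assumptions: the k-means loop is a
generic one that takes the first k cases in the file as initial centers
(one possible order-dependent rule, not necessarily what SPSS does), and
the four named cases, the function names and the choice of 200 shuffles
are hypothetical. Each run shuffles the case order, clusters, and records
the resulting partition so that the partitions can be tallied.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_first_k(points, k, max_iter=100):
    """Generic k-means using the first k cases as initial centers (assumed rule)."""
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

# Hypothetical cases, keyed by an ID so partitions can be compared across orders.
cases = {"a": (0.0, 0.0), "b": (0.0, 1.0), "c": (4.0, 0.0), "d": (4.0, 1.0)}

def partition_for_order(order, k=2):
    """Cluster the cases in the given order; return the partition as a set of ID sets."""
    labels = kmeans_first_k([cases[name] for name in order], k)
    groups = {}
    for name, lab in zip(order, labels):
        groups.setdefault(lab, set()).add(name)
    # Cluster numbers are arbitrary, so compare groupings rather than labels.
    return frozenset(frozenset(g) for g in groups.values())

rng = random.Random(0)  # fixed seed so the check itself is repeatable
n_runs = 200
counts = Counter()
for _ in range(n_runs):
    order = sorted(cases)
    rng.shuffle(order)  # random case order for this run
    counts[partition_for_order(order)] += 1

for partition, n in counts.most_common():
    print(sorted(sorted(g) for g in partition), f"{n / n_runs:.0%} of runs")
```

If one partition accounts for nearly all runs, the solution can be treated
as stable with respect to case order; if the runs are split between
several partitions, as they are for this deliberately awkward toy dataset,
the clusters should not be over-interpreted.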


Dave G



G. David Garson
Editor, Social Science Computer Review
NCSU Box 8102
Department of Political Science and Public Administration
North Carolina State University
Raleigh, NC 27695-8102

For UPS and Express Mail:
G. David Garson
212 Caldwell Hall / PSPA
Hillsborough Street
North Carolina State University
Raleigh, NC 27695-8102

Tel. 919-515-3067
Fax 919-515-7333
E-mail: [hidden email]


