Re: Fitindices to determine optimal clustersolution
Posted by Jon K Peck on Apr 29, 2015; 5:16pm
URL: http://spssx-discussion.165.s1.nabble.com/Fitindices-to-determine-optimal-clustersolution-tp5729419p5729438.html
One tool that might give you added insight into your clustering
solutions is cluster silhouettes. A silhouette plot shows the
distribution of silhouette values within each cluster. You can produce
one with the STATS CLUS SIL extension command (Analyze > Classify
> Cluster Silhouettes). If you don't already have it installed
and you are on V22 or later, you can install it from the Utilities menu.
For older versions, you would need to get it from the Extension Commands
collection on the SPSS Community website
(www.ibm.com/developerworks/spssdevcentral).
It requires the Python Essentials, which are integrated into the
Statistics install as of V22.
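For readers who want to see what a silhouette value measures, here is a minimal sketch in plain Python. It is not the STATS CLUS SIL implementation; it assumes Euclidean distance and at least two clusters, and the function names and toy data are made up for illustration. For each case, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the silhouette is (b - a) / max(a, b).

```python
from collections import defaultdict
from math import dist


def silhouette_values(points, labels):
    """Return one silhouette value per case (assumes >= 2 clusters)."""
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # A singleton cluster gets 0 by the usual convention.
            values.append(0.0)
            continue
        # a: mean distance to the other cases in the same cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the cases of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def summarize_by_cluster(values, labels):
    """Return (min, mean, max) of the silhouette values in each cluster."""
    groups = defaultdict(list)
    for v, lab in zip(values, labels):
        groups[lab].append(v)
    return {lab: (min(v), sum(v) / len(v), max(v)) for lab, v in groups.items()}


# Hypothetical toy data: two well-separated clusters of two cases each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
toy_labels = [1, 1, 2, 2]
toy_values = silhouette_values(toy_points, toy_labels)
toy_summary = summarize_by_cluster(toy_values, toy_labels)
```

Values near 1 indicate cases that sit well inside their cluster, values near 0 indicate cases on the border between clusters, and negative values suggest a case may be closer to another cluster than to its own. Comparing the per-cluster distributions across candidate solutions (for example 3, 4, and 5 clusters) is one way to use them.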
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
From: MaaikeSmits <[hidden email]>
To: [hidden email]
Date: 04/29/2015 11:06 AM
Subject: Re: [SPSSX-L] Fitindices to determine optimal clustersolution
Sent by: "SPSSX(r) Discussion" <[hidden email]>
Hello,
Thank you for taking an interest in my question. I will
try to answer your questions.
From a total of 225 cases, 187 were included in the cluster
analysis (20 cases were lost because of missing data on one or more
of the 10 input variables, and another 18 were excluded because they
proved to be extreme outliers on one or more of the input variables).
I started with a hierarchical cluster analysis on these
187 cases, and the cluster means that resulted from this procedure were
used as non-random starting points in the k-means cluster analysis, which
was run on the same 187 cases. So I did not select subsamples
for either the hierarchical or the k-means procedure, but ran both on the
whole sample.
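The two-step procedure described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the SPSS procedures themselves: the hierarchical step is assumed to have already produced a preliminary cluster label for every case, distances are Euclidean, and the function names and toy data are hypothetical. The cluster means of the preliminary solution become the fixed (non-random) starting centers for k-means.

```python
from math import dist


def cluster_means(points, labels):
    """Mean profile of each cluster in a preliminary solution.

    Centers are returned in sorted label order, so index 0 belongs to the
    smallest label.
    """
    sums, counts = {}, {}
    for p, lab in zip(points, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(p)
            counts[lab] = 0
        sums[lab] = [s + x for s, x in zip(sums[lab], p)]
        counts[lab] += 1
    return [tuple(s / counts[lab] for s in sums[lab]) for lab in sorted(sums)]


def kmeans_from_centers(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = [tuple(c) for c in centers]
    assignment = None
    for _ in range(max_iter):
        # Assign every case to its nearest current center.
        new_assignment = [
            min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no case changed cluster: converged
        assignment = new_assignment
        # Recompute each center as the mean of its current members.
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old center if a cluster empties out
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centers


# Hypothetical toy data; the last case is deliberately placed in the "wrong"
# preliminary cluster so the k-means step has something to relocate.
toy_points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (4.0, 4.0), (4.2, 3.9), (3.0, 3.0)]
preliminary_labels = [0, 0, 0, 1, 1, 0]

start_centers = cluster_means(toy_points, preliminary_labels)
final_assignment, final_centers = kmeans_from_centers(toy_points, start_centers)
```

Because hierarchical methods never reassign a case once it has been merged, the k-means pass acts as a relocation step that can move borderline cases to a nearer cluster.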
The 10 (standardized) dimensional scores that were used
as input variables for the cluster analysis were fairly unrelated: most
correlations were below .1, and a few were .3 or .4.
I hope these answers give you what you need to provide
some guidance on my question. Of course, I will be happy to provide
more detailed information if necessary.
Kind Regards
Maaike
2015-04-29 17:08 GMT+02:00 Art Kendall [via SPSSX Discussion] <[hidden email]>:
How many cases do you have in the whole data set?
How were the cases selected?
Are your variables reasonably uncorrelated?
Am I reading correctly that you used the cluster profiles from the Ward
method to start the k-means?
How many samples from the whole set of cases did you use for the Ward method?
How large were those samples?
Art Kendall
Social Research Consultants
===================== To manage your subscription to SPSSX-L, send a message
to LISTSERV@...
(not to SPSSX-L), with no body text except the command. To leave the list,
send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions,
send the command INFO REFCARD
=====================