SPSSX Discussion

cluster analysis


Ole Rohwer

cluster analysis

Dear List,
I've got a large cross table with a lot of zero-case cells, some 80 to 85
percent. Is there a minimum number of cases necessary to run a two-step
or cluster-center analysis?

Regards
Ole

Hector Maletta

Re: cluster analysis

Clustering is a nonparametric classificatory device for which statistical
significance is of secondary relevance. By "secondary" I mean that once you
have your clusters you may test the hypothesis that they differ (or do not
differ) in some variable of interest, and only then would the number of cases
matter, for determining the significance of the difference and thus
accepting or rejecting the null hypothesis. But for forming the clusters
themselves this plays no role. Just two cases are enough to define two
clusters, although that would be a bit silly.
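To illustrate that forming clusters needs no minimum sample size, here is a minimal cluster-center (k-means style) sketch in plain Python. It is not the SPSS procedure; the function name `k_means`, the seeding rule (the first k cases become the starting centers) and the two example cases are all assumptions made for the illustration.

```python
def k_means(points, k, iterations=10):
    """Minimal cluster-center sketch; the first k cases seed the centers (an assumption)."""
    if len(points) < k:
        raise ValueError("need at least as many cases as clusters")
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each case to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of the cases assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Two hypothetical cases, two clusters: each case ends up as its own cluster.
labels, centers = k_means([(1.0, 2.0), (8.0, 9.0)], k=2)
print(labels)   # [0, 1]
print(centers)  # [(1.0, 2.0), (8.0, 9.0)]
```

The only hard requirement in this sketch is that there are at least as many cases as clusters; whether the resulting clusters are useful is a separate question.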
Besides that, the fact that your cross tabulation shows many empty cells
tends to suggest that there is an association between the variables, i.e. cases
tend to cluster in cells representing certain combinations of values of the
variables, and not in others. This would suggest that a cluster analysis
would be meaningful.
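As a rough illustration of that point, the sketch below counts the share of empty cells in a two-way table and computes the Pearson chi-square statistic as a descriptive measure of association. The data are invented (30 hypothetical cases that fall only on the "diagonal" combinations), and the function name `crosstab_summary` is an assumption. With many sparse cells a chi-square p-value is not reliable, so only the statistic itself is shown.

```python
from collections import Counter


def crosstab_summary(pairs):
    """Return (share of empty cells, Pearson chi-square) for a two-way table."""
    counts = Counter(pairs)
    row_tot = Counter(r for r, _ in pairs)
    col_tot = Counter(c for _, c in pairs)
    rows, cols = sorted(row_tot), sorted(col_tot)
    n = len(pairs)
    # A cell is empty when no case has that combination of categories.
    empty = sum(1 for r in rows for c in cols if counts[(r, c)] == 0)
    # Compare observed counts with the counts expected under independence.
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n
            chi2 += (counts[(r, c)] - expected) ** 2 / expected
    return empty / (len(rows) * len(cols)), chi2


# Hypothetical data: cases occur only in three of the nine possible cells.
pairs = [("a", "x")] * 10 + [("b", "y")] * 10 + [("c", "z")] * 10
empty_share, chi_square = crosstab_summary(pairs)
print(f"empty share = {empty_share:.2f}, chi-square = {chi_square:.1f}")
# empty share = 0.67, chi-square = 60.0
```

In this constructed example the empty cells come entirely from the association between the two variables. In real data, empty cells can also arise simply because there are few cases relative to the number of cells, so the empty share alone does not settle the question.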
However, that is not the usual reason why one may want to do cluster
analysis. What is your purpose? Are you trying to build a multivariate
typology, based on the various dimensions of a complex concept, and thus
classifying your cases in various types or groups? Are you trying to
identify "odd" groups of cases with unusual combinations of values? (The
latter you may do directly in your cross table).
From your message it seems all your variables are categorical, and the
combined number of categories is not too large, since they lend themselves to
being cross-tabulated in a single table, albeit a somewhat large one. There are
things you can do directly there, without going into other procedures. There
are also other statistical procedures that may be in order, depending on
your purpose.
Perhaps, therefore, you should consider alternatives to cluster analysis,
and the reasons why you may want to use or not to use it.
Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Ole
Rohwer
Sent: Saturday, August 12, 2006 9:17 AM
To: [hidden email]
Subject: cluster analysis

Dear List,
I've got a large cross table with a lot of zero-case cells, some 80 to 85
percent. Is there a minimum number of cases necessary to run a two-step
or cluster-center analysis?

Regards
Ole

Nana Nadine

Data imputation

Good morning Keith,
I hope you are doing fine. I would like to ask for your help with the following.
I have two surveys; one is a longer version of the other. The goal is to reduce the length of the survey: use respondents' answers on Var1 ... Var6, together with their answers on Var7 and Var8, to identify people who look like them in the other survey, and then impute the values of Var7 and Var8.
Below is a kind of representation of what I'm talking about.
Is there a way to use SPSS to do this type of imputation?
I tried linear regression, but the parameters were not good.

ID  Var1  Var2  Var3  Var4  Var5  Var6  Var7  Var8
 1     5     2     5     5     3     5     4     ?
 2     4     4     4     4     4     4     1     ?
 3     3     3     3     3     3     3     2     ?
 4     5     5     5     5     5     5     3     ?
 5     4     4     4     4     4     4     4     ?
 6     3     3     3     3     3     3     2     ?
 7     4     4     4     4     4     4     3     ?
 8     3     3     3     3     3     3     1     ?
 9     2     2     2     2     2     2     4     ?
10     5     5     5     5     5     5     3     ?
11     4     4     2     4     3     4     ?     4
12     3     3     3     3     3     3     ?     1
13     3     3     3     3     3     3     ?     2
14     2     2     2     2     2     2     ?     3
15     2     3     1     4     1     5     ?     4
16     5     3     5     5     1     5     ?     2
17     4     4     4     4     4     4     ?     3
18     3     3     3     3     3     3     ?     1
19     2     2     2     2     2     2     ?     4
20     2     3     1     4     1     5     ?     3

Does anyone have a suggestion?
Thank you in advance.
Sincerely

Nadine Nana, MS
Research Data Manager
Atlanta GA
404 805 6209
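
The matching described in this message ("identify people that look like them" and borrow their value) resembles nearest-neighbour donor imputation, sometimes called hot-deck imputation. The following is a minimal sketch in plain Python, not SPSS syntax and not necessarily what the poster ended up using. The helper names `make_record` and `impute_from_donors` are hypothetical, the distance measure (squared Euclidean over Var1 to Var6) is an assumption, and only six rows of the posted table (IDs 1, 2, 3, 12, 16, 17) are used to keep it short.

```python
PREDICTORS = ["Var1", "Var2", "Var3", "Var4", "Var5", "Var6"]


def make_record(case_id, items, var7, var8):
    """Build one case; None marks a value that was not collected."""
    rec = {"ID": case_id, "Var7": var7, "Var8": var8}
    rec.update(zip(PREDICTORS, items))
    return rec


def impute_from_donors(records, target, predictors):
    """Fill missing `target` values from the closest case that has the value."""
    donors = [r for r in records if r[target] is not None]
    filled = []
    for rec in records:
        rec = dict(rec)  # work on a copy so the input stays unchanged
        if rec[target] is None and donors:
            # Squared Euclidean distance on the shared items;
            # ties go to the first donor in the list (an arbitrary choice).
            best = min(donors, key=lambda d: sum((rec[p] - d[p]) ** 2 for p in predictors))
            rec[target] = best[target]
        filled.append(rec)
    return filled


records = [
    make_record(1, [5, 2, 5, 5, 3, 5], 4, None),
    make_record(2, [4, 4, 4, 4, 4, 4], 1, None),
    make_record(3, [3, 3, 3, 3, 3, 3], 2, None),
    make_record(12, [3, 3, 3, 3, 3, 3], None, 1),
    make_record(16, [5, 3, 5, 5, 1, 5], None, 2),
    make_record(17, [4, 4, 4, 4, 4, 4], None, 3),
]

# Impute Var8 first, then Var7; donors for Var7 are still only IDs 1 to 3.
completed = impute_from_donors(records, "Var8", PREDICTORS)
completed = impute_from_donors(completed, "Var7", PREDICTORS)
for rec in completed:
    print(rec["ID"], rec["Var7"], rec["Var8"])
```

In the full table several potential donors have identical answers on Var1 to Var6 but different values on the target (for example IDs 12, 13 and 18), so a tie rule matters. Picking the first donor, as this sketch does, is only a placeholder; drawing a donor at random among the closest cases is a common alternative.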
