Dear List,
I've got a large cross table in which many cells contain zero cases, some 80 to 85 percent. Is there a minimum number of cases necessary to run a two-step or cluster-center analysis?

Regards
Ole
Clustering is a nonparametric classificatory device for which statistical significance is of secondary relevance. By "secondary" I mean that once you have your clusters you may test the hypothesis that they differ (or do not differ) in some variable of interest. Only then would the number of cases be important, to determine the significance of the difference and thus accept or reject the null hypothesis. For forming the clusters themselves it plays no role. Just two cases are enough to define two clusters, although that would be a bit silly.

Besides that, the fact that your cross tabulation shows many empty cells tends to suggest there is association between the variables, i.e. cases tend to cluster in cells representing certain combinations of values of the variables, and not in others. This would suggest that a cluster analysis would be meaningful. However, that is not the usual reason why one may want to do cluster analysis.

What is your purpose? Are you trying to build a multivariate typology, based on the various dimensions of a complex concept, and thus classify your cases into various types or groups? Are you trying to identify "odd" groups of cases with unusual combinations of values? (The latter you can do directly in your cross table.)

From your message it seems all your variables are categorical, and the combined number of categories is not too large, since they lend themselves to being cross-tabulated in a single table, albeit a somewhat large one. There are things you can do directly there, without going into other procedures. There are also other statistical procedures that may be in order, depending on your purpose. Perhaps, therefore, you should consider alternatives to cluster analysis, and the reasons why you may or may not want to use it.
Hector

-----Original message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of Ole Rohwer
Sent: Saturday, August 12, 2006 9:17 AM
To: [hidden email]
Subject: cluster analysis
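The suggestion in the reply above, to look at the cross table directly before reaching for a clustering procedure, can be sketched in a few lines. This is only a minimal illustration in Python rather than SPSS syntax, on made-up toy data. The names `cell_summary`, `records`, and `levels` are hypothetical and do not come from the original posts.

```python
from collections import Counter
from itertools import product


def cell_summary(records, levels):
    """Count cases per cell of a full cross table and report the share of empty cells.

    records: list of tuples, one tuple of category values per case
    levels:  list of lists, the possible categories of each variable
    """
    counts = Counter(records)              # cases per observed combination
    all_cells = list(product(*levels))     # every cell of the full cross table
    n_empty = sum(1 for cell in all_cells if counts[cell] == 0)
    return counts, n_empty / len(all_cells)


# Hypothetical toy data: two categorical variables with three categories each.
records = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("b", "y"), ("c", "z")]
levels = [["a", "b", "c"], ["x", "y", "z"]]

counts, share_empty = cell_summary(records, levels)
print(f"share of empty cells: {share_empty:.2f}")
for cell, n in counts.most_common():
    print(cell, n)
```

Sorting the occupied cells by frequency shows which combinations of values attract most cases. The rarely occupied cells are candidates for the "odd" groups mentioned in the reply.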
Good morning Keith,

I hope you are doing fine. I would like you to please help me with this. I have two surveys, one of which is the longer version of the other. The goal was to reduce the length of the survey, then use respondents' answers on Var1 to Var6, together with their answers on Var7 and Var8, to identify people who look like them on the other survey and impute the values of Var7 and Var8. Below is a kind of representation of what I'm talking about. Is there a way to use SPSS to do this type of imputation? I tried linear regression and the parameters were not good.

ID  Var1  Var2  Var3  Var4  Var5  Var6  Var7  Var8
 1     5     2     5     5     3     5     4     ?
 2     4     4     4     4     4     4     1     ?
 3     3     3     3     3     3     3     2     ?
 4     5     5     5     5     5     5     3     ?
 5     4     4     4     4     4     4     4     ?
 6     3     3     3     3     3     3     2     ?
 7     4     4     4     4     4     4     3     ?
 8     3     3     3     3     3     3     1     ?
 9     2     2     2     2     2     2     4     ?
10     5     5     5     5     5     5     3     ?
11     4     4     2     4     3     4     ?     4
12     3     3     3     3     3     3     ?     1
13     3     3     3     3     3     3     ?     2
14     2     2     2     2     2     2     ?     3
15     2     3     1     4     1     5     ?     4
16     5     3     5     5     1     5     ?     2
17     4     4     4     4     4     4     ?     3
18     3     3     3     3     3     3     ?     1
19     2     2     2     2     2     2     ?     4
20     2     3     1     4     1     5     ?     3

Does anyone have a suggestion? Thank you in advance.

Sincerely,
Nadine Nana, MS
Research Data Manager
Atlanta GA
404 805 6209
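One way to read "identify people that look like them" is nearest-neighbour donor (hot-deck) imputation: for each case missing Var8, find the cases with a known Var8 whose Var1 to Var6 answers are closest, and borrow their values. The sketch below is only an illustration of that idea in Python, not SPSS syntax and not a method confirmed anywhere in the thread. The Manhattan distance and the helper names `manhattan`, `nearest_donors`, and `candidate_values` are assumptions made for the example. The data are the rows of the posted table.

```python
# Var1..Var6 for every ID in the posted table.
rows = {
    1: (5, 2, 5, 5, 3, 5), 2: (4, 4, 4, 4, 4, 4), 3: (3, 3, 3, 3, 3, 3),
    4: (5, 5, 5, 5, 5, 5), 5: (4, 4, 4, 4, 4, 4), 6: (3, 3, 3, 3, 3, 3),
    7: (4, 4, 4, 4, 4, 4), 8: (3, 3, 3, 3, 3, 3), 9: (2, 2, 2, 2, 2, 2),
    10: (5, 5, 5, 5, 5, 5), 11: (4, 4, 2, 4, 3, 4), 12: (3, 3, 3, 3, 3, 3),
    13: (3, 3, 3, 3, 3, 3), 14: (2, 2, 2, 2, 2, 2), 15: (2, 3, 1, 4, 1, 5),
    16: (5, 3, 5, 5, 1, 5), 17: (4, 4, 4, 4, 4, 4), 18: (3, 3, 3, 3, 3, 3),
    19: (2, 2, 2, 2, 2, 2), 20: (2, 3, 1, 4, 1, 5),
}

# Known Var8 values (IDs 11-20); these cases act as donors for IDs 1-10.
var8 = {11: 4, 12: 1, 13: 2, 14: 3, 15: 4, 16: 2, 17: 3, 18: 1, 19: 4, 20: 3}


def manhattan(a, b):
    """Sum of absolute differences across the matching variables (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_donors(recipient_id, donor_values):
    """Return the IDs of all donors at minimum distance from the recipient, sorted by ID."""
    target = rows[recipient_id]
    dist = {d: manhattan(target, rows[d]) for d in donor_values}
    best = min(dist.values())
    return sorted(d for d, v in dist.items() if v == best)


def candidate_values(recipient_id, donor_values):
    """Return the outcome values of the nearest donors, in donor-ID order."""
    return [donor_values[d] for d in nearest_donors(recipient_id, donor_values)]


for rid in range(1, 11):
    print(rid, nearest_donors(rid, var8), candidate_values(rid, var8))
```

When several donors tie, a common choice is to draw one of the candidate values at random. The same functions can be used for Var7 by passing a dictionary of the known Var7 values for IDs 1 to 10. Note that in the posted table, cases with identical Var1 to Var6 answers often report different Var7 or Var8 values, which may be part of the reason the regression fitted poorly.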