Hello Friends,
I'm facing a problem with hierarchical clustering. If I change the order of the variables (keeping the same set of variables), I get a different solution. E.g., if I run this syntax:
CLUSTER v1 v2 v3 v4 /METHOD WARD /MEASURE=SEUCLID /PRINT SCHEDULE /PLOT NONE /SAVE CLUSTER(3).
or
CLUSTER v3 v4 v1 v2 /METHOD WARD /MEASURE=SEUCLID /PRINT SCHEDULE /PLOT NONE /SAVE CLUSTER(3).
I get a different cluster solution. Is there some step I'm missing, or is this typical, and is that why we run k-means after hierarchical segmentation to stabilize the output? Any quick reply or help would be much appreciated. Thanks. Das.
Yes, the hierarchical clustering solution can depend on case order:

"Case order. If tied distances or similarities exist in the input data or occur among updated clusters during joining, the resulting cluster solution may depend on the order of cases in the file. You may want to obtain several different solutions with cases sorted in different random orders to verify the stability of a given solution."

The k-means solution can also depend on case order:

"Case and initial cluster center order. The default algorithm for choosing initial cluster centers is not invariant to case ordering. The Use running means option in the Iterate dialog box makes the resulting solution potentially dependent on case order, regardless of how initial cluster centers are chosen. If you are using either of these methods, you may want to obtain several different solutions with cases sorted in different random orders to verify the stability of a given solution. Specifying initial cluster centers and not using the Use running means option will avoid issues related to case order. However, ordering of the initial cluster centers may affect the solution if there are tied distances from cases to cluster centers. To assess the stability of a given solution, you can compare results from analyses with different permutations of the initial center values."

Alex
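To make the tie issue in the quoted note concrete, here is a small Python sketch (not SPSS syntax, and not the algorithm SPSS uses). It assumes a naive agglomerative procedure with single linkage rather than Ward's method, purely to keep it short, and assumes ties are resolved in favor of the first pair scanned. The function name `agglomerate` and the three toy cases are made up for illustration.

```python
from itertools import combinations

def agglomerate(points, k):
    """Naive single-linkage clustering on 1-D points, stopping at k clusters.

    Assumption for illustration: when two pairs of clusters are equally
    close, the pair scanned first is merged.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Squared Euclidean distance between the closest members.
            d = min((a - b) ** 2 for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:  # strict "<": first tie wins
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return {frozenset(c) for c in clusters}

# Hypothetical cases: 1.0 is equally far from 0.0 and from 2.0 (a tie).
cases = [0.0, 1.0, 2.0]
forward = agglomerate(cases, k=2)
backward = agglomerate(list(reversed(cases)), k=2)

print(forward)   # the partition {0.0, 1.0} and {2.0}
print(backward)  # the partition {1.0, 2.0} and {0.0}
```

The same three cases give two different two-cluster partitions depending only on file order, which is why the note suggests re-running with cases sorted in several random orders.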
My apologies! Please ignore my previous reply; I completely misread the original post. It does seem unusual for the solution to change when only the order of the variables changes. Alex
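As a quick sanity check on why this looks unusual: mathematically, the squared Euclidean distance is a sum over variables, so reordering the variables should not change any pairwise distance. The Python sketch below illustrates this on made-up integer data; the rows, the permutation, and the helper names are hypothetical and are not taken from the original poster's file.

```python
def sq_euclid(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def distance_matrix(rows):
    """All pairwise squared Euclidean distances."""
    return [[sq_euclid(r, s) for s in rows] for r in rows]

# Hypothetical cases with variables in the order v1 v2 v3 v4.
rows = [(1, 4, 2, 7), (3, 1, 5, 2), (6, 6, 0, 3)]

# Reorder the variables to v3 v4 v1 v2, as in the second CLUSTER command.
perm = (2, 3, 0, 1)
reordered = [tuple(r[i] for i in perm) for r in rows]

same = distance_matrix(rows) == distance_matrix(reordered)
print(same)  # True
```

With integer data the two distance matrices match exactly. With non-integer data, floating-point sums can differ in the last digits when the terms are added in a different order; whether that plays any role in SPSS here is only a guess, but it is one conceivable way a tie or near-tie could be resolved differently.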