McIntyre & Blashfield's validation technique: cluster analysis


Liza Rovniak
Hi,

 

I am attempting to run McIntyre and Blashfield's (1980) nearest-centroid
evaluation procedure to validate the stability of my cluster analysis
solution. I am a newbie to cluster analysis, so this is my first time
running this procedure (I've just figured out how to conduct
hierarchical and k-means cluster analysis).

 

I have a sample of around 800 observations and have randomly split the
sample in two (Sample A and Sample B), cluster analyzed each of the two
subsamples with hierarchical cluster analysis, and calculated the
centroid vectors for each of these two subsamples. This takes me through
steps 1 through 4 of McIntyre and Blashfield's evaluation technique.

 

Step 5 of McIntyre and Blashfield's technique is to calculate "the
squared Euclidean distance for each of Sample B's objects from each of
the centroids of Sample A," and Step 6 is to assign "each object in
Sample B to the closest centroid vector." At this point, I am not sure
how to proceed (i.e., what buttons to press in SPSS). I considered using
K-means cluster analysis to achieve this step, but K-means uses simple
Euclidean distance (not squared Euclidean distance as recommended by
McIntyre and Blashfield) to assign the observations to clusters. If
anybody is familiar with this procedure and could direct me on what
buttons to press in SPSS to complete this analysis, I would greatly
appreciate it.

 

Thank you.

 

Liza Rovniak

 

Liza S. Rovniak, PhD, MPH

Adjunct Assistant Professor

Center for Behavioral Epidemiology & Community Health

Graduate School of Public Health, San Diego State University

San Diego, CA 92123

Email: [hidden email]

 

====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: McIntyre & Blashfield's validation technique: cluster analysis

Fry, Jonathan B.
Liza,

The closest centroid by Euclidean distance is also the closest centroid by squared Euclidean distance, because squaring preserves the order of non-negative numbers, so that difference should not stop you from using K-Means. If you need the squared distances for later calculations, you can compute them easily by squaring the distances the procedure writes.

Jonathan Fry
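
To illustrate Steps 5 and 6 and the point made in the reply, here is a minimal Python sketch (not SPSS syntax). It assigns each Sample B object to the closest Sample A centroid, once using squared Euclidean distance and once using plain Euclidean distance, and the two assignments agree. The function names and the small data set are hypothetical and chosen only for illustration; real data would have one row per observation and one column per clustering variable.

```python
from math import sqrt


def squared_euclidean(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def assign_to_nearest(objects, centroids, squared=True):
    """Return, for each object, the index of its closest centroid."""
    labels = []
    for x in objects:
        # Step 5: distance from this object to every centroid
        dists = [squared_euclidean(x, c) for c in centroids]
        if not squared:
            dists = [sqrt(d) for d in dists]
        # Step 6: pick the centroid with the smallest distance
        labels.append(min(range(len(centroids)), key=dists.__getitem__))
    return labels


# Hypothetical data: two Sample A centroids and four Sample B objects
centroids_a = [[0.0, 0.0], [5.0, 5.0]]
sample_b = [[1.0, 0.5], [4.0, 6.0], [2.0, 2.0], [3.5, 3.0]]

labels_sq = assign_to_nearest(sample_b, centroids_a, squared=True)
labels_eu = assign_to_nearest(sample_b, centroids_a, squared=False)

print(labels_sq)  # assignment by squared Euclidean distance
print(labels_eu)  # assignment by Euclidean distance
```

Both calls give the same labels because the square root does not change which distance is smallest. In SPSS, the equivalent would presumably be to run K-Means on Sample B with the Sample A centroids supplied as initial centers and with the centers held fixed (the "Classify only" method in the K-Means Cluster dialog); that mapping is an assumption about how the suggestion above would be carried out, not something stated in the reply.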

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Liza Rovniak
Sent: Wednesday, September 03, 2008 3:12 PM
To: [hidden email]
Subject: McIntyre & Blashfield's validation technique: cluster analysis
