Cluster analysis procedures in SPSS

Cluster analysis procedures in SPSS

Johnny Amora
There are three cluster analysis (CA) procedures in
SPSS: K-means CA, hierarchical CA, and two-step CA.
Do all of these procedures require that the variables
be uncorrelated? Thanks.

John



=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Cluster analysis procedures in SPSS

Hector Maletta
Not at all. None of them requires anything about the correlation (or lack
thereof) of the variables involved. Of course, if all the variables are
perfectly or almost perfectly correlated, the analysis would be useless. In
other words, if you use a set of variables that are very closely correlated
with each other, they would be redundant in the clustering procedure: you
could use any one of them (or a smaller subset) and obtain pretty much the
same results.
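
As a minimal sketch of the redundancy described above (an illustration, not part of the original reply): if a hypothetical second variable z is an exact linear function of x, every pairwise Euclidean distance in (x, z) is a constant multiple of the distance in x alone, so the ordering of distances that a distance-based clustering works from does not change. The data values and all names below are made up for the example.

```python
import math
from itertools import combinations

def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical data: z is perfectly correlated with x (z = 2x + 3).
x = [1.0, 2.0, 4.0, 8.0, 16.0]
z = [2.0 * v + 3.0 for v in x]

pairs = list(combinations(range(len(x)), 2))

# Pairwise distances using x alone, and using both x and z.
d_one = [abs(x[i] - x[j]) for i, j in pairs]
d_two = [euclidean((x[i], z[i]), (x[j], z[j])) for i, j in pairs]

# Every two-variable distance is the one-variable distance times sqrt(1 + 2**2).
ratios = [b / a for a, b in zip(d_one, d_two)]

# Ranking the pairs from closest to farthest gives the same order either way.
order_one = sorted(range(len(pairs)), key=lambda k: d_one[k])
order_two = sorted(range(len(pairs)), key=lambda k: d_two[k])

print(ratios)
print(order_one == order_two)
```

With variables that are only nearly (not perfectly) correlated, the ratios would no longer be exactly constant, so this shows the limiting case only.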

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
John Amora
Sent: 25 March 2008 00:36
To: [hidden email]
Subject: Cluster analysis procedures in SPSS

There are three cluster analysis (CA) procedures in
SPSS: K-means CA, hierarchical CA, and two-step CA.
Do all of these procedures require that the variables
be uncorrelated? Thanks.

John




Re: Cluster analysis procedures in SPSS

Swank, Paul R
However, if one uses Euclidean distance as a measure of dissimilarity,
this assumes that the variables are independent.

Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center - Houston


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Hector Maletta
Sent: Monday, March 24, 2008 11:09 PM
To: [hidden email]
Subject: Re: Cluster analysis procedures in SPSS

Not at all. None requires anything about the correlation (or lack thereof)
of the variables involved. Of course, if all the variables are perfectly or
almost perfectly correlated the analysis would be useless. In other words:
If you use a set of variables that are very closely correlated to each
other, they would be redundant in the clustering procedure: you might use
any of them individually (or a smaller subset) and obtain pretty much the
same results.

Hector


Re: Cluster analysis procedures in SPSS

Hector Maletta
Paul,
I am not sure of your contention. Suppose you have only two variables X and
Z, with points A, B, C,... Distances between the points can always be
measured by Euclidean distance, irrespective of the points' distribution:
they may lie along a straight line (i.e. linearly correlated) or in an
amorphous cloud of uncorrelated points, or in any intermediate situation.
The same holds for any number of variables in multidimensional space.
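
A small sketch of this point, under made-up data: the Euclidean formula can be evaluated the same way whether the points lie on a straight line or form an uncorrelated pattern. The two point sets and the helper names are hypothetical; whether Euclidean distance is the most appropriate dissimilarity for correlated variables is a separate question that the code does not settle.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def pearson(points):
    """Pearson correlation between the X and Z coordinates of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    mz = sum(p[1] for p in points) / n
    sxz = sum((p[0] - mx) * (p[1] - mz) for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points)
    szz = sum((p[1] - mz) ** 2 for p in points)
    return sxz / math.sqrt(sxx * szz)

# Hypothetical points A, B, C, D measured on two variables X and Z.
line = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # Z = 2X exactly
cloud = [(0.0, 3.0), (1.0, 1.0), (2.0, 1.0), (3.0, 3.0)]  # X and Z uncorrelated

r_line = pearson(line)
r_cloud = pearson(cloud)

# Distance between the first two points (A and B) in each configuration.
d_line = euclidean(line[0], line[1])
d_cloud = euclidean(cloud[0], cloud[1])

print(r_line, d_line)
print(r_cloud, d_cloud)
```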

Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Swank, Paul R
Sent: 25 March 2008 14:45
To: [hidden email]
Subject: Re: Cluster analysis procedures in SPSS

However, if one uses Euclidean distance as a measure of dissimilarity,
this assumes that the variables are independent.

Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center - Houston



Re: Cluster analysis procedures in SPSS

Swank, Paul R
Euclidean distance does not take into account the angle between the axes. It
assumes they are at right angles. Mahalanobis distance is appropriate for
measuring distance when the axes are not at right angles, i.e., when the
dimensions are correlated.
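
To make the contrast concrete, here is a rough sketch for two variables, using made-up data and hypothetical helper names. It computes the Mahalanobis distance d = sqrt(v' S^-1 v), where v is the difference between two points and S is the sample covariance matrix, and compares it with Euclidean distance for two displacements of equal Euclidean length: one along the direction of correlation and one across it.

```python
import math

def covariance_2d(points):
    """Sample covariance terms (var_x, cov_xz, var_z), n - 1 denominator."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    mz = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    szz = sum((p[1] - mz) ** 2 for p in points) / (n - 1)
    sxz = sum((p[0] - mx) * (p[1] - mz) for p in points) / (n - 1)
    return sxx, sxz, szz

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def mahalanobis(p, q, cov):
    """Mahalanobis distance using the explicit inverse of a 2x2 covariance."""
    sxx, sxz, szz = cov
    det = sxx * szz - sxz ** 2
    dx, dz = p[0] - q[0], p[1] - q[1]
    # Inverse of [[sxx, sxz], [sxz, szz]] is (1/det) * [[szz, -sxz], [-sxz, sxx]].
    d_squared = (szz * dx * dx - 2.0 * sxz * dx * dz + sxx * dz * dz) / det
    return math.sqrt(d_squared)

# Hypothetical, strongly positively correlated data.
data = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8), (5.0, 5.0)]
cov = covariance_2d(data)

centre = (3.0, 3.0)
along = (4.0, 4.0)    # displaced in the direction of the correlation
across = (4.0, 2.0)   # displaced against the direction of the correlation

e_along, e_across = euclidean(centre, along), euclidean(centre, across)
m_along, m_across = mahalanobis(centre, along, cov), mahalanobis(centre, across, cov)

print(e_along, e_across)   # equal Euclidean distances
print(m_along, m_across)   # the 'across' point is much farther in Mahalanobis terms
```

If the covariance term were zero and both variances were 1, the two measures would coincide.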

Paul

Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center - Houston


-----Original Message-----
From: Hector Maletta [mailto:[hidden email]]
Sent: Tuesday, March 25, 2008 2:09 PM
To: Swank, Paul R; [hidden email]
Subject: RE: Cluster analysis procedures in SPSS

Paul,
I am not sure of your contention. Suppose you have only two variables X and
Z, with points A, B, C,... Distances between the points can always be
measured by Euclidean distance, irrespective of the points' distribution:
they may lie along a straight line (i.e. linearly correlated) or in an
amorphous cloud of uncorrelated points, or in any intermediate situation.
The same holds for any number of variables in multidimensional space.

Hector

Re: Cluster analysis procedures in SPSS

David Hitchin
Quoting "Swank, Paul R" <[hidden email]>:

> Euclidean distance does not take into account the angle of the axes. It
> assumes they are at right angles. Mahalanobis distance is appropriate for
> measuring distance when the axes are not at right angles, i.e., the
> dimensions are correlated.

Any data set can be transformed to orthogonal axes by transforming it to
its principal components. However, the Euclidean distances between
points are unchanged by this transformation.

[Note. "Principal components analysis" as a pure rotation preserving
distances is NOT the same as principal components factor analysis]
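
The following is a minimal two-variable sketch of that statement, with made-up data and hypothetical names. It rotates centred data onto its principal axes (using the closed-form angle that diagonalizes a 2x2 covariance matrix), checks that the rotated coordinates are uncorrelated, and checks that pairwise Euclidean distances are unchanged. As a contrast, it also rescales each component to unit variance, which is no longer a pure rotation and does change the distances.

```python
import math
from itertools import combinations

points = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8), (5.0, 5.0)]

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def pairwise(pts):
    """All pairwise Euclidean distances, in a fixed pair order."""
    return [euclidean(p, q) for p, q in combinations(pts, 2)]

def cov_terms(pts):
    """Sample (var_1, cov_12, var_2) with the n - 1 denominator."""
    n = len(pts)
    m1 = sum(p[0] for p in pts) / n
    m2 = sum(p[1] for p in pts) / n
    s11 = sum((p[0] - m1) ** 2 for p in pts) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in pts) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in pts) / (n - 1)
    return s11, s12, s22

# Centre the data; a translation does not affect distances between points.
n = len(points)
mx = sum(p[0] for p in points) / n
mz = sum(p[1] for p in points) / n
centred = [(x - mx, z - mz) for x, z in points]

# Angle of the first principal axis for a 2x2 covariance matrix.
sxx, sxz, szz = cov_terms(centred)
theta = 0.5 * math.atan2(2.0 * sxz, sxx - szz)
c, s = math.cos(theta), math.sin(theta)

# Pure rotation onto the principal axes.
rotated = [(c * x + s * z, -s * x + c * z) for x, z in centred]
var_u, cov_uv, var_v = cov_terms(rotated)

original_d = pairwise(points)
max_change = max(abs(a - b) for a, b in zip(original_d, pairwise(rotated)))

# Rescaling each component to unit variance is not a rotation.
scaled = [(u / math.sqrt(var_u), v / math.sqrt(var_v)) for u, v in rotated]
max_change_scaled = max(abs(a - b) for a, b in zip(original_d, pairwise(scaled)))

print(cov_uv, max_change, max_change_scaled)
```

Euclidean distances between the rescaled scores equal Mahalanobis distances between the original points, which is one way to see how the two measures are related.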

David Hitchin
