Oblique Principal Component Cluster Analysis

Oblique Principal Component Cluster Analysis

Sourav Basu
Hi All,

I am new to this group.
I need to perform an ‘Oblique Principal Component Cluster Analysis’ in
SPSS. In SAS, I can get the result using PROC VARCLUS; however, I have to
use SPSS to perform the same analysis.

A brief description of PROC VARCLUS in SAS is as follows:

The VARCLUS procedure divides a set of numeric variables into disjoint or
hierarchical clusters. Associated with each cluster is a linear
combination of the variables in the cluster. This linear combination can
be either the first principal component (the default) or the centroid
component (if the CENTROID option is specified). The first principal component
is a weighted average of the variables that explains as much variance as
possible.
In the VARCLUS procedure, each cluster component is computed from a
different set of variables than all the other cluster components. The
first principal component of one cluster might be correlated with the
first principal component of another cluster. Hence, the VARCLUS algorithm
is a type of oblique component analysis.
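
To make the description above concrete, here is a minimal Python sketch (standard library only, not SAS or SPSS code). It assumes two clusters of variables are already known, computes each cluster component as the first principal component of that cluster's own variables, and then correlates the two components. The synthetic data, the helper names (pearson, standardize, first_pc_weights, cluster_component) and the use of power iteration are all illustrative assumptions; this is not the VARCLUS algorithm itself, which also decides how to split the variables.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardize(x):
    """Convert a variable to z-scores (sample standard deviation)."""
    n = len(x)
    m = sum(x) / n
    sd = math.sqrt(sum((a - m) ** 2 for a in x) / (n - 1))
    return [(a - m) / sd for a in x]

def first_pc_weights(columns, iters=500):
    """Unit-length weights of the first principal component, found by
    power iteration on the correlation matrix of the given variables."""
    p = len(columns)
    R = [[pearson(columns[i], columns[j]) for j in range(p)] for i in range(p)]
    w = [1.0] * p
    for _ in range(iters):
        w = [sum(R[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w

def cluster_component(columns):
    """Score each case on the first principal component of one cluster."""
    w = first_pc_weights(columns)
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [sum(w[j] * z[j][i] for j in range(len(columns))) for i in range(n)]

# Synthetic data (assumption): two latent factors that are themselves correlated.
random.seed(0)
n = 200
f1 = [random.gauss(0, 1) for _ in range(n)]
f2 = [0.5 * a + random.gauss(0, 1) for a in f1]
cluster_a = [[a + random.gauss(0, 0.5) for a in f1] for _ in range(3)]
cluster_b = [[b + random.gauss(0, 0.5) for b in f2] for _ in range(3)]

comp_a = cluster_component(cluster_a)
comp_b = cluster_component(cluster_b)
r_ab = pearson(comp_a, comp_b)
print("correlation between cluster components:", round(r_ab, 2))
```

Because each component is built from a different set of variables, nothing forces the two components to be uncorrelated, which is the sense in which the solution is oblique.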

Please let me know how to perform the analysis in SPSS. Your input will be
highly appreciated.

Thanks & Regards,
Sourav Basu

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Oblique Principal Component Cluster Analysis

Art Kendall
It is possible to use CLUSTER to cluster items (variables) if you
specify in PROXIMITIES that the similarities or distances are computed
between variables rather than between cases.
If this is a class exercise or if you just want to see the details of
what someone else did, just use PROXIMITIES using whatever coefficient
(most likely a Pearson Correlation) and then CLUSTER.
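
The idea of a proximity matrix between variables followed by clustering can be sketched in plain Python (this is an illustration of the logic, not SPSS syntax). The sketch assumes Pearson correlation as the similarity and average linkage as the merging rule; the function name cluster_variables and the synthetic data are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cluster_variables(data, n_clusters):
    """Agglomerative clustering of variables (not cases).

    data maps a variable name to its list of values. Similarity between
    two clusters is the average Pearson correlation over all pairs of
    their member variables (average linkage)."""
    names = list(data)
    # Step 1: proximity matrix between variables.
    sim = {(a, b): pearson(data[a], data[b]) for a in names for b in names}
    # Step 2: start with one cluster per variable and merge the most similar pair.
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [sim[a, b] for a in clusters[i] for b in clusters[j]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg > best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Synthetic data (assumption): x1..x3 share one factor, y1..y3 share another.
random.seed(1)
n = 200
f1 = [random.gauss(0, 1) for _ in range(n)]
f2 = [random.gauss(0, 1) for _ in range(n)]
data = {}
for k in (1, 2, 3):
    data[f"x{k}"] = [a + random.gauss(0, 0.5) for a in f1]
    data[f"y{k}"] = [b + random.gauss(0, 0.5) for b in f2]

clusters = cluster_variables(data, n_clusters=2)
print(clusters)
```

Using the raw correlation treats negatively correlated variables as dissimilar; if the sign should be ignored, the absolute correlation could be used instead.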

Although it is possible to cluster items, it is a rather unusual way to
do an analysis. What is the reason for clustering variables? Are the
variables originally items in a scale? Are you using an arbitrary set of
variables to see if some of them might be used as scales?

What distance/similarity measure between variables do you want to use?

A first principal component explains as much as possible of the total
variance. It is much more common to be interested in the variance that
is common to the set, pooling unique item variance with "error" or
residual variance.

A set of scores (scales) loses divergent validity when some item or part
of an item contributes to more than one scale (score, factor).

Art Kendall
Social Research Consultants

(In reply to Sourav Basu's post of 10/22/2010, 6:54 AM.)

Re: Oblique Principal Component Cluster Analysis

Sourav Basu
Hi Dr. Kendall,

Thanks a lot for your mail.

I will try to answer your questions first:

1. Yes, you are right. This is class work in which I have to recreate my
classmate's result (from SAS) using a different tool (SPSS).
2. We are performing cluster analysis on variables.
3. The variables are arbitrary in nature.
4. The distance measure in SPSS will match the one used in SAS: if the SAS
result was created using Euclidean distance, we will use Euclidean
distance in SPSS; if it was Mahalanobis distance in SAS, we will use the
same in SPSS, and so on.

The earlier cluster analysis was performed using VARCLUS in SAS, and we
need to recreate the same result using SPSS.
VARCLUS first performs the oblique principal component analysis on the raw
dataset and creates the PC scores; those scores are then used to perform
the cluster analysis. For a detailed description, please refer to the
VARCLUS section of the SAS/STAT User's Guide, available online
(http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#varclus_toc.htm).
We have also tried to perform the analysis in parts, i.e. first the
oblique principal component analysis and then the cluster analysis;
however, we have not yet obtained a similar result.

It would be really great if you could share your thoughts on this.

Thanks & regards,
Sourav Basu

(In reply to Art Kendall's post of Fri, 22 Oct 2010 13:07:27 -0400.)