Advantages of cluster analysis over factor analysis?

Advantages of cluster analysis over factor analysis?

Tanja Gabriele Baudson
Hi,

I submitted a paper in which I used cluster analysis (Ward's method and
k-means) to structure a questionnaire. However, this was rejected by one
reviewer who did not find my approach convincing (factor analysis is indeed
more common). The journal addresses practitioners, so I preferred the
clear-cut 7-cluster solution of the cluster analysis to the 17-factor result
of the factor analysis, which had strong cross-loadings (the questionnaire
concerns a certain subgroup of students, so this is probably to be expected;
the many mini factors consisting of only two or three items are
unsatisfactory, though, and eliminating them might shorten the questionnaire
significantly ...).

So, are there any arguments about the statistical advantages of CA over FA
which might help convince the reviewer? :)

Thanks in advance
Tanya

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Advantages of cluster analysis over factor analysis?

Mustafa Ozkaynak
Hello,
I think what determines your data analysis strategy is the research question at hand.
What is your research question?


On Fri, Aug 20, 2010 at 10:47 AM, Tanya <[hidden email]> wrote:
> [...] are there any arguments about the statistical advantages of CA over FA
> which might help convince the reviewer? :)

Re: Advantages of cluster analysis over factor analysis?

Art Kendall
In reply to this post by Tanja Gabriele Baudson
How did you decide the number of factors to retain? It sounds as if
you used the Kaiser criterion. I suggest that you try the conventional
approach to factor analysis before you dismiss it as a technique.

How many scales were the items designed to measure? Or is this
strictly an ad hoc factor analysis?

Search the archives at
http://listserv.uga.edu/archives/spssx-l.html
for "parallel analysis".
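
For readers unfamiliar with the term, here is a minimal Python sketch of one common form of parallel analysis (Horn's method, applied to the eigenvalues of the correlation matrix). It is an illustration rather than anything taken from the archives: the function name, the simulated data, the 95th-percentile cutoff and the use of NumPy are all assumptions. The idea is to keep only those factors whose eigenvalues exceed what random, uncorrelated data of the same size would produce, instead of keeping every eigenvalue above 1 as the Kaiser criterion does.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Return (observed eigenvalues, random-data thresholds, factors to retain)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Eigenvalues from uncorrelated normal data with the same n and p.
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    # Count leading eigenvalues that beat their random-data threshold.
    below = observed <= threshold
    n_retain = int(np.argmax(below)) if below.any() else p
    return observed, threshold, n_retain

# Hypothetical data: 300 respondents, 6 items driven by 2 latent factors.
rng = np.random.default_rng(42)
factors = rng.standard_normal((300, 2))
loadings = np.zeros((2, 6))
loadings[0, :3] = 0.8   # items 1-3 load on the first factor
loadings[1, 3:] = 0.8   # items 4-6 load on the second factor
data = factors @ loadings + 0.6 * rng.standard_normal((300, 6))

observed, threshold, n_retain = parallel_analysis(data)
print("observed eigenvalues:", np.round(observed, 2))
print("random thresholds:   ", np.round(threshold, 2))
print("factors to retain:   ", n_retain)
```

With real questionnaire data, `data` would be the respondents-by-items matrix; variants based on common-factor eigenvalues or on permuted rather than normal data exist and can give somewhat different answers.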

When you redo the factor analysis with fewer factors and find scales
that are meaningful and have only cleanly loading items, what do the
scale reliabilities look like?

How do the scaling keys compare from the cluster vs factor approach?

Art Kendall
Social Research Consultants


Re: Advantages of cluster analysis over factor analysis?

G David Garson
In reply to this post by Mustafa Ozkaynak
A key difference is that cluster analysis deals with proximities.
Thus, when correlation is the proximity measure, it is employed as a whole
(zero-order) coefficient. Factor analysis employs partial coefficients and takes
controlled relationships into account. FA can indeed yield overly
complex results. That is one of several reasons why confirmatory factor
analysis in SEM would be recommended by many in this situation. In
statistics as in religion, however, there are many paths to heaven.
DG
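
To make the "correlation as a proximity" point concrete, here is a minimal sketch in plain Python that clusters questionnaire items using 1 - r as the distance between two items and average linkage to merge groups. The item names, the scores and the choice of average linkage are hypothetical and only for illustration (the original analysis used Ward's method and k-means). Each pairwise correlation enters as a whole coefficient; nothing is partialled out.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Plain zero-order Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cluster_items(items, n_clusters):
    """Average-linkage clustering of items with 1 - r as the distance.

    Note: 1 - r treats negatively correlated items as far apart; for
    reverse-keyed items one might prefer 1 - abs(r) instead.
    """
    names = list(items)
    dist = {frozenset((a, b)): 1 - pearson(items[a], items[b])
            for a, b in combinations(names, 2)}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two closest clusters (i < j, so deleting j is safe).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical scores of six respondents on four items.
items = {
    "q1": [1, 2, 3, 4, 5, 4],
    "q2": [2, 2, 3, 5, 5, 4],
    "q3": [5, 3, 4, 1, 2, 2],
    "q4": [4, 3, 5, 2, 1, 2],
}
item_clusters = cluster_items(items, n_clusters=2)
print(item_clusters)
```

Every item ends up in exactly one cluster, which is why a cluster solution looks "clear-cut" compared with a loading matrix that allows cross-loadings.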


--

_________________________________________________________

G. David Garson
School of Public and International Affairs
North Carolina State University, Campus Box #8102
Raleigh, NC 27695-8102

For Fedex and other express mail add:
212 Caldwell Hall, Hillsborough Street

Tel. 1-919-515-3067
Fax: 1-919-515-7333

Email [hidden email]

________________________________________________________


Re: Advantages of cluster analysis over factor analysis?

Steve Peck
In reply to this post by Tanja Gabriele Baudson
It seems to me that "FA" and "CA" address completely different questions:
FA assesses the relations among *variables* -- aka "variable-centered" analysis -- and finds a smaller number of variables, aka "factors", that explain the relations among the raw variables.
CA assesses the relations among *objects* (e.g., people) -- aka "person-centered" analysis -- and finds relatively homogeneous subgroups of objects, defined by similar patterns of values on the given cluster variables.
In practice, people generally first use FA to find a reduced number of variables and then CA to find subgroups based on this reduced number of variables/factors
(even though it is usually better to select your best raw variables for CA, unless the FA reveals highly reliable and unidimensional factors that include everything you want to know about subgroups).
Any reviewer who suggests using FA instead of CA does not understand CA.
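
The variable-centered versus person-centered distinction can be seen in which matrix each approach starts from. The following plain-Python sketch uses a small made-up data set (rows are respondents, columns are items; all values are hypothetical) and builds both: the item-by-item correlation matrix that FA works on, and the respondent-by-respondent distance matrix that a CA of cases works on.

```python
from math import dist, sqrt  # math.dist needs Python 3.8+

# Hypothetical data: 5 respondents (rows) by 4 questionnaire items (columns).
data = [
    [5, 4, 1, 2],
    [4, 5, 2, 1],
    [1, 2, 5, 4],
    [2, 1, 4, 5],
    [5, 5, 1, 1],
]

def pearson(x, y):
    """Pearson correlation between two equally long score sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

columns = list(zip(*data))  # one tuple of scores per item

# Variable-centered view: a 4 x 4 matrix of correlations among items.
item_corr = [[pearson(a, b) for b in columns] for a in columns]

# Person-centered view: a 5 x 5 matrix of Euclidean distances among respondents.
person_dist = [[dist(a, b) for b in data] for a in data]

print(len(item_corr), "x", len(item_corr[0]), "item correlations")
print(len(person_dist), "x", len(person_dist[0]), "respondent distances")
```

Clustering items rather than people, as in the original question, amounts to computing the proximities on the transposed matrix (`columns` instead of `data`).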


--
Stephen C. Peck
Research Investigator
Achievement Research Lab
Research Center for Group Dynamics
Institute for Social Research
University of Michigan
426 Thompson Street, # 5136
Ann Arbor, MI  48106-1248
(734) 647-3683; fax (734) 936-7370
http://www.rcgd.isr.umich.edu/garp/
[hidden email] 

Re: Advantages of cluster analysis over factor analysis?

E. Bernardo
Dear Steve et al.,
You wrote:
(even though it is usually better to select your best raw variables for CA, unless the FA reveals highly reliable and uni-dimensional factors that include everything you want to know about subgroups)

In FA, when can a unidimensional factor be considered a reliable factor? Are you referring to a high Cronbach's alpha?

Eins

--- On Sun, 8/22/10, Steve Peck <[hidden email]> wrote:

From: Steve Peck <[hidden email]>
Subject: Re: Advantages of cluster analysis over factor analysis?
To: [hidden email]
Date: Sunday, 22 August, 2010, 6:48 PM

it seems to me that... [...]

Re: Advantages of cluster analysis over factor analysis?

Steve Peck
A high alpha would be one indicator of reliability, yes (but there are other indicators/methods that could be used as well).
I just meant that FA can reveal lots of factors indicated by only a few items, which might need further item development to measure the construct in question reliably. When doing CA, though, I'm usually more concerned about whether the items/scales I'm using are valid and relevant indicators of the components/functions of the system I'm studying.
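
Since alpha came up, here is a minimal plain-Python sketch of how Cronbach's alpha is computed for one scale, using the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The respondent scores are made up for illustration, and the function assumes complete data with all items keyed in the same direction.

```python
from statistics import variance  # sample variance (n - 1 denominator)

def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])                                  # number of items
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])  # variance of sum scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores of five respondents on a three-item scale.
scale = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(scale)
print(f"alpha = {alpha:.3f}")
```

Because k appears in the formula, alpha depends on the number of items as well as on how strongly they correlate, which is one reason a scale with only two or three items is hard to measure reliably.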

On 8/22/2010 10:09 PM, Eins Bernardo wrote:
> In FA, when can a unidimensional factor be considered a reliable factor? Are you referring to a high Cronbach's alpha?