Factor analysis extraction methods


mglacy
Greetings,
(I posted this on the newsgroup, but was advised that the list has better traffic
these days, so I'm taking the liberty of reposting my question here.)

I recently had need to get some factor analysis results (loadings and
eigenvalues) to match between SPSS and Stata.  A particular estimation
process I'm interested in stipulates that a factor analysis should be
used for part of the process, and that SPSS's principal axes extraction
(PAF or the old PA2) should be used.

I'm trying to reproduce this analysis in Stata (as well as in SPSS), and
I'm now a bit wary about what both programs are doing.  I know that
terminology can be a bit loose here, which may account for some of the
issues.

Here's what I found that puzzles me:

1.  From the algorithm descriptions, SPSS's principal axes and Stata's
iterated principal factors extractions both seem to iteratively re-
estimate the communalities.  However, I can't get them to produce the
same eigenvalues. Close, but consistently different.

2. SPSS gives the same eigenvalues regardless of what extraction I
use:

* For example, Principal Axes vs. Principal Components.
GET
  FILE='C:\Program Files\IBM\SPSS\Statistics\19\Samples\English\car_sales.sav'.

FACTOR
  /VARIABLES price engine_s horsepow wheelbas width length curb_wgt
  /MISSING LISTWISE
  /ANALYSIS price engine_s horsepow wheelbas width length curb_wgt
  /PRINT INITIAL EXTRACTION
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PAF
  /ROTATION NOROTATE .
*.
FACTOR
  /VARIABLES price engine_s horsepow wheelbas width length curb_wgt
  /MISSING LISTWISE
  /ANALYSIS price engine_s horsepow wheelbas width length curb_wgt
  /PRINT INITIAL EXTRACTION
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PC
  /ROTATION NOROTATE
  /METHOD=CORRELATION.

3.  Stata's "principal component factor" approach (communalities fixed
at 1.0) gives the same results as SPSS, which would suggest that SPSS
is *not* doing principal axes, which supposedly iteratively re-
estimates communalities.

Can anyone offer clarification on what's going on here, in particular
what is happening in SPSS such that changing the method of extraction
does not affect the eigenvalues? (I don't expect anyone here to have
any particular interest in or knowledge about Stata, but I thought
some FA-knowledgeable folks might nevertheless be able to shed
light on what's happening.)

Regards,

Mike Lacy
Dept. of Sociology
Colorado State Univ.
Fort Collins CO

Re: Factor analysis extraction methods

Mike
Consider the following points:

(1)  I think there may be some confusion about what Stata
and SPSS are doing.  Let me suggest that you go to the UCLA
Stat Computing center and look at the SPSS and Stata factor
analysis write-ups, which seem to perform the same analysis
(principal axis factor analysis) on the same dataset (13 items
from a survey conducted by John Sidanius; on the SPSS page
there's a link to the SPSS data file being used).  For the
SPSS Factor Analysis, see:
http://www.ats.ucla.edu/stat/spss/output/factor1.htm
For the Stata Factor Analysis, see:
http://www.ats.ucla.edu/stat/stata/output/fa_output.htm
Note that although both analyses extract the same number of
factors, the eigenvalues appear to be different.

(3)  In your point #2 below you say: "SPSS gives the same
eigenvalues regardless of what extraction I use".  You leave out
a key word: you should say "same INITIAL eigenvalues".
Again, if you go to the UCLA stat center and look at their
SPSS output for principal component analysis (which uses
the same dataset referred to above), you will see that the
initial eigenvalues are the same as those for SPSS PAF; see:
http://www.ats.ucla.edu/stat/SPSS/output/principal_components.htm
The "Extraction Sums of Squared Loadings" are different for
SPSS PCA vs. PAF, but the PAF extracted "Total" values (the new
eigenvalues) are the same as those produced in the Stata output.
Look at column 4 of the "Total Variance Explained" table
in the SPSS PAF output and compare it to column 2 in the Stata
output: the first three values are the same.  Stata provides
all of the eigenvalues after iteration, while SPSS provides
the eigenvalues only for the number of factors extracted.
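
To make the distinction in point (3) concrete, here is a minimal Python/NumPy sketch (not SPSS or Stata code). The 4-variable correlation matrix R is made up purely for illustration, and the function names are hypothetical. It assumes that the "initial" eigenvalues are those of the correlation matrix with 1.00 on the diagonal, and shows that under principal components extraction the sums of squared loadings for the retained components simply reproduce those initial eigenvalues.

```python
import numpy as np

# Hypothetical 4-variable correlation matrix (illustrative values only).
R = np.array([
    [1.00, 0.60, 0.50, 0.40],
    [0.60, 1.00, 0.45, 0.35],
    [0.50, 0.45, 1.00, 0.30],
    [0.40, 0.35, 0.30, 1.00],
])


def initial_eigenvalues(R):
    """Eigenvalues of the unreduced correlation matrix, largest first."""
    return np.linalg.eigvalsh(R)[::-1]


def pc_loadings(R, n_factors):
    """Principal component loadings: eigenvectors scaled by sqrt(eigenvalue)."""
    vals, vecs = np.linalg.eigh(R)
    order = np.argsort(vals)[::-1][:n_factors]
    return vecs[:, order] * np.sqrt(vals[order])


init = initial_eigenvalues(R)
loadings = pc_loadings(R, n_factors=1)

# Column sums of squared loadings, i.e. the quantity SPSS labels
# "Extraction Sums of Squared Loadings".
ssl = (loadings ** 2).sum(axis=0)

print("initial eigenvalues:        ", init)
print("PC sums of squared loadings:", ssl)
```

With principal components the two printed quantities agree for the retained components, which is why comparing only the initial eigenvalues cannot distinguish one extraction method from another.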

(4)  I'd wager that the initial eigenvalues are the same because
both analyses start off using 1.00 on the diagonal.  For principal
axis factoring, the diagonal is then iteratively re-estimated, as
described in the version 18 SPSS Algorithms manual on page 322.
I think that the eigenvalues you're looking for are in the TOTAL
column under the heading "Extraction Sums of Squared Loadings".
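
The iteration described in point (4) can be sketched roughly as follows in Python/NumPy. This is a simplified illustration, not the SPSS or Stata algorithm as implemented: the correlation matrix is made up, the function name `paf` is hypothetical, and it assumes squared multiple correlations as the starting communalities and a stopping rule based on the largest change in communalities. The initial eigenvalues are taken from the matrix with 1.00 on the diagonal, and the loop then replaces the diagonal with communality estimates.

```python
import numpy as np

# Hypothetical 4-variable correlation matrix (illustrative values only).
R = np.array([
    [1.00, 0.60, 0.50, 0.40],
    [0.60, 1.00, 0.45, 0.35],
    [0.50, 0.45, 1.00, 0.30],
    [0.40, 0.35, 0.30, 1.00],
])


def paf(R, n_factors, max_iter=25, tol=1e-3):
    """Iterated principal axis factoring (simplified sketch)."""
    # Assumed starting communalities: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        # Put the current communalities on the diagonal of R.
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)
        vals, vecs = np.linalg.eigh(reduced)
        order = np.argsort(vals)[::-1][:n_factors]
        # Clip guards against tiny negative eigenvalues of the reduced matrix.
        loadings = vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))
        # New communalities are the row sums of squared loadings.
        new_h2 = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2


initial = np.linalg.eigvalsh(R)[::-1]          # unreduced matrix
loadings, h2 = paf(R, n_factors=1)
extraction_ssl = (loadings ** 2).sum(axis=0)   # post-extraction values

print("initial eigenvalues:               ", initial)
print("extraction sums of squared loadings:", extraction_ssl)
```

Because the diagonal of the reduced matrix is smaller than 1.00, the post-extraction value for the first factor comes out below the first initial eigenvalue, which is the kind of gap between the two columns of the "Total Variance Explained" table discussed above.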

It's been some time since I've gone through these types of comparisons,
and I'll leave it to the more knowledgeable folks to point out where I am
wrong.

-Mike Palij
New York University
[hidden email]


----- Original Message -----
From: "mglacy" <[hidden email]>
To: <[hidden email]>
Sent: Friday, July 15, 2011 6:53 PM
Subject: Factor analysis extraction methods


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Factor analysis extraction methods

mglacy

Mike Palij wrote:

>(3)  In your point #2 below you say: "SPSS gives the same
>eigenvalues regardless of what extraction I use".  You leave out
>a key word, you should say "same INITIAL eigenvalues".

This seems exactly correct.  I was simply confused because FACTOR reports
the initial eigenvalues, before whatever iterative extraction (PAF, ML, etc.)
happens.  What some other programs report as the eigenvalues (after
extraction) are what SPSS labels as "Extraction Sums of Squared Loadings."
So, for example, the appropriate option in Stata will produce eigenvalues
that exactly match up with FACTOR using PAF.

Mystery is solved.

Thanks,
Mike Lacy
Dept. of Sociology
Colorado State University