eigenvalues as guide to the number of factors


Anthony Babinec
Here is something Professor Stan Mulaik posted on SEMNET a while ago. I wonder whether this relates to Kathryn Gardner’s question? Mulaik refers to ML factor analysis, so I suppose PAF should be viewed as an approximation to ML.

++++++++++++++++++++++++++++++++++++++++++++++++

From: Stanley Mulaik

Subject: SPSS and factor analysis

I have for several years sought to get SPSS to report the eigenvalues of the
matrix S^-1RS^-1 instead of the eigenvalues of R, where S^2 = [diag
R^-1]^-1. S^2 is a first approximation to the unique variance matrix (a
diagonal matrix). But SPSS reports only the eigenvalues of R, from the
principal components analysis, even when it goes on to do the maximum
likelihood analysis, which begins with the matrix S^-1RS^-1.

Why do we need the eigenvalues of S^-1RS^-1?

S^2 approximates U^2, the diagonal matrix of unique variances, which we
normally do not know when we begin the exploratory ML factor analysis.

But Jöreskog showed that U^-1(R - U^2)U^-1 = U^-1RU^-1 - I. If U^2 is the
correct unique variance matrix, then (R - U^2) will have rank equal to the
number of common factors and will be a Gramian matrix.

The rank of U^-1(R - U^2)U^-1 should be the rank of (R - U^2), since the
rescaling by a positive definite diagonal matrix will not change the rank of
the resulting matrix.

Hence all eigenvalues greater than 1.00 of U^-1RU^-1 correspond to
eigenvalues greater than zero of U^-1(R - U^2)U^-1 = U^-1RU^-1 - I, since
subtracting I simply shifts every eigenvalue down by exactly 1.00. I wish
SPSS would report the eigenvalues of U^-1RU^-1 when it does an ML factor
analysis. (You can get these with SPSS MATRIX commands, using the reported
estimates of the communalities: subtract them from 1.00 to obtain estimates
of U^2. A sketch is given below.) There should be a point beyond which most
of the remaining eigenvalues are close to 1.00. If not, then you don't have
the right number of common factors: try another number and repeat the ML
analysis. The solution in which all eigenvalues after the last one greater
than 1.00 are very close to 1.00 is the one based on the appropriate number
of common factors.
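
A minimal MATRIX sketch of that check, assuming the active dataset is a
matrix-format file holding R (MGET names the correlation matrix CR) and
that the six communalities shown are hypothetical placeholders for the
values reported by your own ML run:

MATRIX.
MGET /TYPE=CORR /FILE=*.
* Hypothetical communalities h^2 from a k-factor ML solution (replace with yours).
COMPUTE H2 = {.62; .55; .48; .71; .66; .59}.
* U^2 = diag(1 - h^2), the estimated unique variances.
COMPUTE U2 = MDIAG(1 - H2).
* Eigenvalues of U^-1 R U^-1; past the k-th they should sit near 1.00.
COMPUTE UINV = INV(SQRT(U2)).
COMPUTE EVALS = EVAL(UINV * CR * UINV).
PRINT EVALS /TITLE='Eigenvalues of U^-1 R U^-1'.
END MATRIX.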

So, if you knew the correct unique factor variance matrix for the minimum
rank solution, you would be able to find the number of common factors
(assuming the data was generated by a common factor model).

Since we don't, we begin with an approximation to U^2: S^2 = [diag R^-1]^-1,
where S^2 contains the error variances from estimating each variable by
multiple regression from the p-1 other variables in the correlation matrix.
S^2 is a strong upper-bound estimator of U^2, meaning it will generally be
close to U^2 (if you have lots of variables in R).

So, find the number of eigenvalues of the matrix S^-1RS^-1 greater than
1.00. This will be close to the number of common factors. Usually this
number is much larger than the number of eigenvalues greater than 1.00 of R
(obtained by principal components analysis of R). The latter is the number
of positive eigenvalues of the matrix (R - I), which is like analyzing the
correlation matrix by replacing each of the 1.00's on the principal diagonal
with 0's. This gives a lower bound estimate to the number of common factors
(according to Guttman).

Use the number of eigenvalues greater than 1.00 of S^-1RS^-1 as the number
of factors that SPSS ML must base its analysis on. (That algorithm always
needs a specified number of common factors to obtain its estimate of the
communalities).
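
For example, if the count came out to 5, a sketch of the corresponding
FACTOR call (v1 TO v20 is a placeholder variable list):

FACTOR /VARIABLES=v1 TO v20
  /CRITERIA=FACTORS(5)
  /EXTRACTION=ML
  /PRINT=INITIAL EXTRACTION
  /ROTATION=NOROTATE.

Putting /CRITERIA before /EXTRACTION ensures the fixed number of factors
applies to the ML extraction.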

You can get the number of eigenvalues greater than 1.00 of S^-1RS^-1 using
the MATRIX - END MATRIX language in SPSS. (That's not hard to do if you
study this programming language in the SPSS manual.)
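
Here is a sketch of such a job, assuming your variables are v1 TO v20
(placeholders) and again noting that MGET names the correlation matrix CR:

CORRELATIONS /VARIABLES=v1 TO v20 /MATRIX=OUT(*).
MATRIX.
MGET /TYPE=CORR /FILE=*.
* S^2 = [diag R^-1]^-1: the multiple-regression error variances.
COMPUTE S2 = INV(MDIAG(DIAG(INV(CR)))).
* Eigenvalues of S^-1 R S^-1.
COMPUTE SINV = INV(SQRT(S2)).
COMPUTE EVALS = EVAL(SINV * CR * SINV).
PRINT EVALS /TITLE='Eigenvalues of S^-1 R S^-1'.
* How many exceed 1.00 -- the suggested number of common factors.
COMPUTE NFAC = CSUM(EVALS GT 1).
PRINT NFAC /TITLE='Eigenvalues > 1.00 of S^-1 R S^-1'.
* For comparison, Guttman's lower bound: eigenvalues > 1.00 of R itself.
COMPUTE NPC = CSUM(EVAL(CR) GT 1).
PRINT NPC /TITLE='Eigenvalues > 1.00 of R'.
END MATRIX.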

Plotting the eigenvalues of S^-1RS^-1 against their ordinal position may
also suggest a good number of common factors to retain (by the scree
criterion).
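
Extending the sketch above, the eigenvalues can be saved as a working file
inside the MATRIX job and then plotted; ORDNUM and EIGVAL are made-up
variable names. Before END MATRIX, add:

* Ordinal positions 1..p alongside the eigenvalues.
COMPUTE ORD = MAKE(NROW(EVALS), 1, 0).
LOOP I = 1 TO NROW(EVALS).
COMPUTE ORD(I) = I.
END LOOP.
SAVE {ORD, EVALS} /OUTFILE=* /VARIABLES=ORDNUM, EIGVAL.

Then, after END MATRIX, a simple line chart serves as the scree plot:

GRAPH /LINE(SIMPLE)=MEAN(EIGVAL) BY ORDNUM.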

The biggest problem with doing exploratory factor analysis is knowing a
priori that the data was generated by such a model and not by some other
causal structure. To arrive at such a decision usually requires so much
substantive theorizing about your data that you are then better off doing a
confirmatory factor analysis instead of an exploratory factor analysis.

Stan Mulaik