missing cases in principal component analysis

Wincy Chan
Hi all,

I wonder how factor analysis in SPSS handles missing cases. I am working on
a data set with >1000 cases and about 40 variables. All I have to do is
explore the factor structure of these 40 items.

I conducted PCA on these 40 items and asked SPSS to output a correlation
matrix. My understanding is that the default handling of missing cases in
PCA is to exclude cases listwise. Please correct me if I am wrong.

FACTOR
  /VARIABLES t137 t138 t139 t140 t141 t142 t143 t144 t145 t146 t147 t148
    t149 t150 t151 t152 t153 t154 t155 t156 t157 t158 t159 t160 t161
    t162 t163 t164 t165 t166 t167 t168 t169 t170 t171 t172 t173 t174
    t175 t176 t177 t178 t179 t180 t181 t182 t183 t184 t185 t186 t187
    t188
  /PRINT INITIAL CORRELATION KMO EXTRACTION ROTATION
  /FORMAT BLANK(.5)
  /CRITERIA FACTORS(3) ITERATE(25)
  /EXTRACTION PC
  /CRITERIA ITERATE(50)
  /ROTATION VARIMAX
  /METHOD=CORRELATION.

I also generated a correlation matrix from the following syntax; however, I
found the correlations slightly different from what I got from the above.

CORRELATIONS
  /VARIABLES=t137 t138 t139 t140 t141 t142 t143 t144 t145 t146 t147 t148
    t149 t150 t151 t152 t153 t154 t155 t156 t157 t158 t159 t160 t161
    t162 t163 t164 t165 t166 t167 t168 t169 t170 t171 t172 t173 t174
    t175 t176 t177 t178 t179 t180 t181 t182 t183 t184 t185 t186 t187
    t188
  /PRINT=TWOTAIL NOSIG
  /MISSING=LISTWISE.

Although the differences were small (<0.01 in most cases and ~0.07 in one
case), I was expecting two identical correlation matrices. Could someone
please advise if I have done anything wrong?

Many thanks in advance

Wincy
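
One way to narrow this down, offered as a sketch under assumptions rather than a confirmed diagnosis, is to state the missing-value handling explicitly in both commands and compare the number of cases each procedure reports. The `t137 TO t188` shorthand assumes the listed items sit next to each other, in that order, in the active data set, and `n_miss` is a hypothetical scratch variable name that does not appear in the original post.

```
* Count missing items per case (n_miss = 0 means the case survives listwise deletion).
COMPUTE n_miss = NMISS(t137 TO t188).
FREQUENCIES VARIABLES=n_miss.

* Same variables with listwise deletion stated explicitly and an unrotated PCA.
FACTOR
  /VARIABLES t137 TO t188
  /MISSING LISTWISE
  /PRINT UNIVARIATE CORRELATION
  /EXTRACTION PC
  /ROTATION NOROTATE
  /METHOD=CORRELATION.

CORRELATIONS
  /VARIABLES=t137 TO t188
  /PRINT=TWOTAIL NOSIG
  /MISSING=LISTWISE.
```

If the number of valid cases reported by FACTOR and by CORRELATIONS both match the count of cases with n_miss = 0, the two procedures are working from the same cases, and the discrepancy probably lies elsewhere (for example, a weight, a filter, or a change to the active data between the two runs).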

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: missing cases in principal component analysis

Johnny Amora
You can also change the default (listwise exclusion) to "exclude cases
pairwise" or "replace with mean" by adding one of the following subcommands
to FACTOR:

  /MISSING=PAIRWISE
or
  /MISSING=MEANSUB
 
Johnny
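
For placement, here is a minimal sketch of the original FACTOR command with the subcommand added. It assumes the items are adjacent in the active data set, so that `t137 TO t188` selects the same variables as the full list; the other settings are copied from the original post.

```
* Sketch only (swap PAIRWISE for MEANSUB to replace missing values with the variable mean).
FACTOR
  /VARIABLES t137 TO t188
  /MISSING PAIRWISE
  /PRINT INITIAL CORRELATION KMO EXTRACTION ROTATION
  /FORMAT BLANK(.5)
  /CRITERIA FACTORS(3) ITERATE(25)
  /EXTRACTION PC
  /CRITERIA ITERATE(50)
  /ROTATION VARIMAX
  /METHOD=CORRELATION.
```

Note that with pairwise deletion each correlation can be based on a different subset of cases, so the resulting matrix is not guaranteed to be positive semidefinite, and it will generally differ from the listwise matrix.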
 

--- On Wed, 11/26/08, Wincy Chan <[hidden email]> wrote:

From: Wincy Chan <[hidden email]>
Subject: missing cases in principal component analysis
To: [hidden email]
Date: Wednesday, 26 November, 2008, 1:16 PM
