Posted by
Hector Maletta on
Jul 08, 2011; 3:15pm
URL: http://spssx-discussion.165.s1.nabble.com/PCA-for-dichotomous-data-tp4564908p4565133.html
If the data are dichotomous, conventional PCA (SPSS FACTOR procedure) gives
the same solution as categorical PCA (SPSS CATPCA procedure), because a
two-category variable can only be rescaled linearly. CATPCA is
required when the original data are multi-categorical variables (either
nominal or ordinal), in order to generate (iteratively) optimal scaling
values for the categories and then run a Principal Component Analysis on the
resulting (interval level) variables.
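The equivalence for dichotomous items can be checked with a small sketch.
Any quantification of a two-category variable is a linear transformation of
its 0/1 coding, so the Pearson correlations that a PCA starts from do not
change (apart from a sign flip if the category order is reversed). The data
and the helper names pearson and rescale below are made up for illustration;
they are not part of SPSS.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rescale(values, low, high):
    """Assign arbitrary scale values to the two categories (0 -> low, 1 -> high)."""
    return [high if v == 1 else low for v in values]

# Hypothetical yes/no answers of eight respondents to two items.
x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 0, 1, 1, 0]

r_01 = pearson(x, y)
# Any other quantification that keeps the category order gives the same r.
r_rescaled = pearson(rescale(x, -2.5, 7.0), rescale(y, 10.0, 11.0))

print(r_01, r_rescaled)
```

Since the correlation matrix is unchanged, the components extracted from it
are unchanged as well.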
I wonder whether the fact that each respondent may choose up to three of
the dichotomous items has any influence on this. It depends, I surmise, on
the way you want to treat those data.
(a) you may treat each CHOICE as one case. In this fashion, there would be
one case (one row in the dataset) for each combination of respondent and
choice, with up to three (but not necessarily three) choices per respondent.
In this case my advice above works, although the analysis may require a
two-level model to distinguish between intra- and inter-respondent effects.
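As a rough sketch of layout (a), the snippet below turns a
respondent-by-choices structure into one row per respondent-choice pair. The
respondent ids and answers are hypothetical; only the item names v327 to v338
come from the original question.

```python
from collections import Counter

# Hypothetical data: the items each respondent selected (at most three).
responses = {
    "r1": ["v327", "v330", "v335"],
    "r2": ["v328"],
    "r3": ["v327", "v338"],
    "r4": [],  # respondent who selected nothing
}

# One case (row) per combination of respondent and choice.
long_rows = [(rid, item) for rid, items in responses.items() for item in items]

# Number of rows contributed by each respondent (the level-2 unit).
choices_per_respondent = Counter(rid for rid, _ in long_rows)

for row in long_rows:
    print(row)
print(dict(choices_per_respondent))
```

Note that a respondent who selected no item contributes no rows at all in
this layout, and that rows from the same respondent are not independent,
which is why a two-level model may be needed.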
(b) you may treat each RESPONDENT as a case. In this option, you may have
different COMBINATIONS of responses per respondent. The maximum number (all
combinations of three out of 12) is probably much higher than the number of
respondents in your sample, and thus only a small proportion of all
combinations will show up. These observed combinations may be treated as a
NOMINAL multi-category variable, with many values. For this kind of approach
CATPCA would be appropriate, but I caution that the number of distinct
combinations observed must not be large (with N respondents and M observed
combinations, you have N-M-1 degrees of freedom, which may result in a
fairly low number, thus invalidating the results in statistical terms.) If
only a few response patterns are observed, and the number of respondents is
comparatively very large, you'd be OK, but beware of too many choices and
too few subjects.
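The counts involved in option (b) can be worked out with a short sketch. It
computes how many response patterns are possible with 12 items, then counts
the distinct patterns M in a small made-up sample and applies the N-M-1
expression exactly as stated above. The sample data are hypothetical.

```python
from math import comb
from collections import Counter

N_ITEMS = 12
MAX_CHOICES = 3

# Patterns with exactly three selected items.
exactly_three = comb(N_ITEMS, MAX_CHOICES)
# Patterns with zero, one, two or three selected items.
up_to_three = sum(comb(N_ITEMS, k) for k in range(MAX_CHOICES + 1))

# Hypothetical sample: each respondent's set of selected items.
patterns = [
    frozenset({"v327", "v330"}),
    frozenset({"v327", "v330"}),
    frozenset({"v328"}),
    frozenset({"v328"}),
    frozenset({"v328"}),
    frozenset({"v327", "v335", "v338"}),
]

observed = Counter(patterns)
n = len(patterns)   # N respondents
m = len(observed)   # M distinct observed combinations
df = n - m - 1      # expression taken from the post as stated

print(exactly_three, up_to_three)
print(n, m, df)
```

With real data, comparing M against N in this way shows quickly whether the
nominal pattern variable has too many categories for the sample.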
Hector
-----Original message-----
From: SPSSX(r) Discussion [mailto:
[hidden email]] On behalf of ftr
Sent: Friday, July 08, 2011 11:13
To:
[hidden email]
Subject: PCA for dichotomous data
Hello,
Eurobarometer 66.1 provides data on social values which I would like to
use, with other influences, to explain church going.
The item battery of social values provides 12 questions with yes/no
answer alternatives. The respondent can choose up to three items.
What I need is a procedure like a PCA for dichotomous data, but I don't
have access to CATPCA. I calculated proximities with the Dice measure
to correct for the high probability that neither of two items is
selected. I used PROXIMITIES to calculate the similarity of the variables:
PROXIMITIES v327 to v338
/VIEW=VARIABLE
/MEASURE= dice (1,0) .
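For readers unfamiliar with the measure, here is a minimal sketch of the
Dice coefficient for two 0/1 items, using its usual definition
2a / (2a + b + c), where a counts joint selections and b and c count the
mismatches. Joint non-selections are left out, which is the property relied
on here. The function name, the toy data, and the choice to return 0.0 when
neither item is ever selected are assumptions for illustration, not SPSS
behaviour.

```python
def dice(x, y):
    """Dice similarity between two 0/1 sequences; joint absences are ignored."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)  # both selected
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)  # only x selected
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)  # only y selected
    denom = 2 * a + b + c
    # Assumption: define the similarity as 0.0 when neither item is ever selected.
    return 2 * a / denom if denom else 0.0

def simple_matching(x, y):
    """Share of respondents giving the same answer to both items, for contrast."""
    return sum(1 for u, v in zip(x, y) if u == v) / len(x)

# Hypothetical answers of six respondents to two items.
item_a = [1, 1, 0, 0, 0, 0]
item_b = [1, 0, 1, 0, 0, 0]

# Adding respondents who selected neither item leaves Dice unchanged,
# while simple matching is inflated by the joint non-selections.
item_a_more = item_a + [0] * 20
item_b_more = item_b + [0] * 20

print(dice(item_a, item_b), dice(item_a_more, item_b_more))
print(simple_matching(item_a, item_b), simple_matching(item_a_more, item_b_more))
```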
Once PROXIMITIES has produced the matrix, can it be read into FACTOR as a
correlation matrix? And how do I move from this variable-based analysis
back to the case-based analysis?
Is there a better alternative for getting a variable structure from
dichotomous variables?
TIA,
F. Thomas
=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD