
Re: factor analysis on dichotomous data

Posted by Kaat on Jul 06, 2010; 8:05pm
URL: http://spssx-discussion.165.s1.nabble.com/factor-analysis-on-dichotomous-data-tp1081162p1081164.html

Thanks a lot for your clear explanation.
I now understand why I can treat my data as interval data. However, another assumption of factor analysis is that the variables are linearly related. My yes-no codes do not represent an underlying linear dimension; they simply indicate whether a certain belief is expressed or not.
Can my variables be linearly related then? I know that phi coefficients can be calculated and that they are similar to Pearson correlations, but does this also imply that the relationship between the variables is considered linear?
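
To make the phi/Pearson point concrete, here is a minimal Python sketch (plain standard library, not SPSS syntax). The two code vectors are made-up illustration data, not values from my content analysis, and the function names are just helpers for this example. It computes the phi coefficient from the 2x2 table and the ordinary Pearson correlation on the same 0/1 vectors, so the two can be compared directly.

```python
import math

def pearson(x, y):
    """Ordinary Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def phi(x, y):
    """Phi coefficient computed from the 2x2 table of two 0/1 lists."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    return numerator / denominator

# Hypothetical presence/absence of two codes across ten paragraphs.
code_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
code_b = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

r_pearson = pearson(code_a, code_b)
r_phi = phi(code_a, code_b)
print(r_pearson, r_phi)
```

On 0/1 data the two functions return the same number up to floating-point rounding, since phi is the Pearson formula applied to binary variables.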

Sorry to insist on this, but I am working on a paper and these are comments I got when presenting this data.

Kaat  

Hector Maletta wrote
Kaat,
Dichotomous variables can legitimately be treated as interval variables. The
trouble with categorical variables such as nationality or ethnicity arises
from the fact that the intervals, or real differences, between categories
cannot be given a reasonable value. But with dichotomous variables the
problem does not exist: you have a variable with only two possible values,
and therefore only one possible interval. Define that interval (the
difference between Yes and No) as your unit of measurement, so that passing
from No to Yes is a unit increment. Since this single interval need not be
compared to anything else, there is no ambiguity.
CATPCA is for variables with 3 or more categories (ordered or unordered).
For dichotomies, it gives the same solution as traditional PCA. Thus your
solution is OK.

Now, suppose you start with multi-category questions and reduce them to a
series of dummies. In this case, even if the original variables were
strongly inter-correlated, you are generating some correlations that are
necessarily negative. For instance, wooden roofs will be negatively
correlated with tile roofs, because one excludes the other. This produces a
lot of negative correlation coefficients in the extended matrix of categories
for all variables, even if the original variables WERE positively correlated
(in the sense that poor walls were correlated with poor roofs and poor
sanitary services). This distorts the results of factor analysis, including
reducing the variance explained by the first factor.
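
The dummy-variable effect described above can be sketched in a few lines of Python (standard library only, not SPSS syntax). The roof data are invented for illustration, and the category names and the `pearson` helper are assumptions of this example. One three-category variable is expanded into 0/1 dummies, and every pair of dummies is then correlated.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Ordinary Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical roof material for ten dwellings (one multi-category variable).
roof = ["wood"] * 4 + ["tile"] * 3 + ["thatch"] * 3

# Expand the single categorical variable into one 0/1 dummy per category.
categories = sorted(set(roof))
dummies = {c: [1 if r == c else 0 for r in roof] for c in categories}

# Correlate every pair of dummies that came from the same variable.
correlations = {
    (a, b): pearson(dummies[a], dummies[b])
    for a, b in combinations(categories, 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Because a dwelling can fall in only one category, two dummies from the same variable are never both 1, so every printed correlation is negative regardless of how the variable relates to anything else in the dataset.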

Thus, CATPCA should be used for problems involving multi-category variables,
and classical factor analysis for the rest (including genuine binary
variables such as Yes-No questions).
Beware that, unlike the classic FACTOR command, the CATPCA command requires
holding the entire dataset in memory, which greatly limits the number of
cases and variables you can process. Using a stripped-down dataset
with just the variables you actually need may allow CATPCA to work, but with
large datasets it doesn't work.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On behalf of Kaat
Sent: Tuesday, July 06, 2010 2:05 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: factor analysis on dichotomous data

Hi

I am struggling with a PCA on dichotomous data (1-0).
My data stem from a content analysis of 726 paragraphs. For each paragraph,
the presence of each of 18 codes was indicated. I am looking for racial
ideologies and am only interested in the relationship between the codes.
I performed a PCA on the paragraphs-by-codes matrix (with varimax) and got a
nicely interpretable solution.
Now I am unsure whether I can report these PCA results because my data are
categorical. I tried CATPCA, but I don't know whether you can rotate the
components in a way similar to VARIMAX or whether you can save component scores.

thanks!
Kaat

--
View this message in context:
http://old.nabble.com/factor-analysis-on-dichotomous-data-tp29086718p29086718.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
