Posted by
Hector Maletta on
Jul 06, 2010; 9:23pm
URL: http://spssx-discussion.165.s1.nabble.com/factor-analysis-on-dichotomous-data-tp1081162p1081165.html
Dichotomous variables represent linear relationships. It can be proved that the percentage difference in the dependent variable between the two categories of another dichotomous variable is algebraically equivalent to a linear regression coefficient, and that the phi coefficient is equivalent to the linear (Pearson) correlation coefficient for two variables with only two values each. I don't have the reference at hand, but these are solid results that have been known for years.
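The two equivalences can be checked numerically. The following is only a minimal sketch in Python with NumPy, not something taken from the cited results: the data are simulated, and the names x, y, prop_diff, slope, phi and pearson are illustrative assumptions.

```python
import numpy as np

# Simulated (hypothetical) data: x and y are both coded 0/1.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=500)
# y is more likely to be 1 when x is 1 (probabilities chosen arbitrarily).
y = (rng.random(500) < np.where(x == 1, 0.7, 0.4)).astype(int)

# Difference in the proportion of y == 1 between the two categories of x.
prop_diff = y[x == 1].mean() - y[x == 0].mean()

# Least-squares slope of y regressed on x: cov(x, y) / var(x).
slope = np.cov(x, y, ddof=0)[0, 1] / np.var(x)

# Phi computed from the four cell counts of the 2x2 table.
a = float(np.sum((x == 1) & (y == 1)))
b = float(np.sum((x == 1) & (y == 0)))
c = float(np.sum((x == 0) & (y == 1)))
d = float(np.sum((x == 0) & (y == 0)))
phi = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Ordinary Pearson correlation on the same 0/1 codes.
pearson = np.corrcoef(x, y)[0, 1]

print(f"proportion difference = {prop_diff:.6f}, regression slope = {slope:.6f}")
print(f"phi = {phi:.6f}, Pearson r = {pearson:.6f}")
```

Under these assumptions the two printed pairs agree up to floating-point rounding, whatever seed or probabilities are used, provided both categories of x and of y actually occur in the sample.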
Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:
[hidden email]] On behalf of Kaat
Sent: Tuesday, July 06, 2010 5:06 PM
To:
[hidden email]
Subject: Re: factor analysis on dichotomous data
Thanks a lot for your clear explanation.
I now understand why I can consider my data as interval data. However,
another assumption of factor analysis is that the variables are
linearly related. My yes-no codes do not represent an underlying linear
dimension but simply indicate whether a certain belief is expressed or not.
Can my variables be linearly related then? I know that Phi point
correlations can be calculated and that they are similar to Pearson
correlations, but does this also imply that the relationship between the
variables is considered a linear relationship?
Sorry to insist on this, but I am working on a paper and these are comments
I got when presenting this data.
Kaat
Hector Maletta wrote:
>
> Kaat,
> Dichotomous variables can be legitimately seen as interval variables. The
> trouble with categorical variables such as nationality or ethnicity emerges
> from the fact that the intervals or real differences between categories
> cannot be given a reasonable value. But with dichotomous variables the
> problem does not exist: you have a variable with only two possible values,
> therefore one possible interval. Define that interval (difference between
> Yes and No) as your unit of measurement, such that passing from No to Yes is
> a unit increment. Since this single interval need not be compared to anything
> else, you do not have any ambiguity.
> CATPCA is for variables with 3 or more categories (ordered or unordered).
> For dichotomies, it gives the same solution as traditional PCA. Thus your
> solution is OK.
>
> Now, suppose you start with multi-category questions, and reduce them to a
> series of dummies. In this case, even if the original variables were
> strongly inter-correlated, you generate some correlations that are
> necessarily negative. For instance, wooden roofs will be negatively
> correlated with tile roofs, because one excludes the other. This produces a
> lot of negative correlation coefficients in an extended matrix of categories
> for all variables, even if the original variables WERE positively correlated
> (in the sense that poor walls were correlated with poor roofs and poor
> sanitary services). This distorts the results of factor analysis, including
> reducing the variance explained by the first factor.
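The forced negative correlations among dummies of one multi-category variable can be illustrated with a small simulation. This is a sketch under stated assumptions: the roof categories, their probabilities, and all variable names are hypothetical and serve only to show the effect described above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical roof material with three mutually exclusive categories.
categories = ["wood", "tile", "metal"]
roof = rng.choice(categories, size=1000, p=[0.3, 0.5, 0.2])

# One 0/1 dummy column per category; each row has exactly one 1.
dummies = np.column_stack([(roof == c).astype(float) for c in categories])

# Correlation matrix of the dummies (columns are the variables).
corr = np.corrcoef(dummies, rowvar=False)
off_diag = corr[~np.eye(len(categories), dtype=bool)]

# Closed form for two exclusive dummies with sample shares p_i and p_j:
# r = -sqrt(p_i * p_j / ((1 - p_i) * (1 - p_j)))
p = dummies.mean(axis=0)
expected_wood_tile = -np.sqrt(p[0] * p[1] / ((1 - p[0]) * (1 - p[1])))

print(np.round(corr, 3))
print("wood vs tile, closed form:", round(float(expected_wood_tile), 3))
```

Every off-diagonal entry is negative by construction, because two dummies of the same variable can never both equal 1 for the same case.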
>
> Thus, CATPCA should be used for problems involving multi-category variables,
> and classical factor analysis for the rest (including authentic binary
> variables such as Yes-No questions).
> Beware that, unlike the classic FACTOR command, the CATPCA command requires
> holding the entire dataset in memory, thus greatly limiting the number of
> cases and variables you can process. Perhaps using a stripped-down database
> with just the variables you actually need may allow CATPCA to work, but with
> large datasets it doesn't work.
>
> Hector
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:
[hidden email]] On behalf of
> Kaat
> Sent: Tuesday, July 06, 2010 2:05 PM
> To:
[hidden email]
> Subject: factor analysis on dichotomous data
>
> Hi
>
> I am struggling with a PCA on dichotomous data (1-0).
> My data stem from a content analysis of 726 paragraphs. For each paragraph
> the presence of each of 18 codes was indicated. I am looking for racial
> ideologies and am only interested in the relationship between the codes.
> I performed a PCA on the paragraphs-codes matrix (with varimax) and got a
> nicely interpretable solution.
> Now, I am unsure if I can report on these PCA results because my data are
> categorical. I tried to do CATPCA but I don't know if you can rotate the
> components in a way similar to VARIMAX and if you can save component
> scores.
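For readers who want to see the shape of such an analysis outside SPSS, here is a rough sketch in Python with NumPy. It is not the poster's actual procedure: the 726 x 18 matrix is simulated, the choice of 3 components is an arbitrary assumption, and the varimax function below is a plain (raw) implementation without the Kaiser normalization that SPSS applies by default, so its numbers would not match FACTOR output exactly.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation of a (variables x components) loading matrix."""
    n_vars, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    d_old = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = (rotated ** 2).sum(axis=0)  # column sums of squares
        target = rotated ** 3 - (rotated * col_ss) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # orthogonal by construction
        d_new = s.sum()
        if d_old != 0.0 and d_new / d_old < 1.0 + tol:
            break
        d_old = d_new
    return loadings @ rotation, rotation


# Simulated stand-in for the paragraphs-by-codes matrix (726 x 18, coded 0/1).
rng = np.random.default_rng(42)
n_paragraphs, n_codes, n_components = 726, 18, 3  # 3 is an assumption
latent = rng.normal(size=(n_paragraphs, n_components))
group = np.arange(n_codes) % n_components  # each code tied to one latent theme
prob = 1.0 / (1.0 + np.exp(-(1.5 * latent[:, group] - 1.0)))
codes = (rng.random((n_paragraphs, n_codes)) < prob).astype(float)

# PCA on the correlation matrix; for 0/1 data these are phi coefficients.
corr = np.corrcoef(codes, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:n_components]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

rotated, rotation = varimax(loadings)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each code keeps the same communality (row sum of squared loadings) before and after rotation; only the distribution of variance across components changes.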
>
> thanks!
> Kaat
>
> --
> View this message in context:
> http://old.nabble.com/factor-analysis-on-dichotomous-data-tp29086718p29086718.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
>
[hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
--
View this message in context:
http://old.nabble.com/factor-analysis-on-dichotomous-data-tp29086718p29089587.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.