Re: PCA for dichotomous data

Posted by news on Jul 08, 2011; 4:53pm
URL: http://spssx-discussion.165.s1.nabble.com/PCA-for-dichotomous-data-tp4564908p4565519.html

Thank you, Hector, for your answer.

The fact that respondents have only three choices across 12 items produces
lots of cases with zero as the entry. But the respondents did not say No,
they just said nothing. It's a sort of logical missing value. Nevertheless,
as a first attempt I tried a PCA with the dichotomous variables, but the
large number of zero entries violated the conditions, of course (and killed
the procedure).
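
A quick sketch, in Python rather than SPSS and with illustrative names, shows the scale of the problem. It assumes only what is stated above: 12 items and at most three selections per respondent. It counts the possible response patterns and the minimum share of zero entries per case.

```python
from math import comb

N_ITEMS = 12      # items in the social-values battery
MAX_CHOICES = 3   # a respondent may select up to three of them

# Number of distinct response patterns with exactly k items selected.
patterns_by_k = {k: comb(N_ITEMS, k) for k in range(MAX_CHOICES + 1)}
total_patterns = sum(patterns_by_k.values())

# Even a respondent who uses all three choices leaves nine items at zero.
min_zero_share = (N_ITEMS - MAX_CHOICES) / N_ITEMS

print(patterns_by_k)    # {0: 1, 1: 12, 2: 66, 3: 220}
print(total_patterns)   # 299
print(min_zero_share)   # 0.75
```

So at least three quarters of every row is zero by design, whatever the respondents actually think about the unselected items.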

In any case, CATPCA is not a solution as I don't have access to this
module.
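
Without CATPCA, one exploratory route outside SPSS could look like the sketch below. It is only an illustration under stated assumptions: Python with NumPy, a made-up 0/1 data matrix, and hypothetical names (`dice_matrix`, `scores`). It computes the Dice similarity 2a / (2a + b + c) between items, which ignores joint non-selection, takes the eigenvectors of that matrix, and projects the cases onto them to get back to a case-based view. A Dice matrix is not a correlation matrix, so the eigenvalues should not be read like those from FACTOR.

```python
import numpy as np

def dice_matrix(X):
    """Dice similarity between the columns of a 0/1 matrix (cases x items)."""
    X = np.asarray(X, dtype=float)
    both = X.T @ X                             # a: cases selecting both items
    totals = np.diag(both)                     # selections per item
    denom = totals[:, None] + totals[None, :]  # equals 2a + b + c
    with np.errstate(divide="ignore", invalid="ignore"):
        # Items never selected get similarity 0 instead of 0/0.
        return np.where(denom > 0, 2.0 * both / denom, 0.0)

# Made-up example: 4 respondents, 3 items (1 = selected, 0 = not selected).
X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 0, 0]])
S = dice_matrix(X)

# Eigen-decomposition of the symmetric similarity matrix, largest first.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Case-level scores on the first k dimensions (exploratory only).
k = 2
scores = X @ eigvecs[:, :k]
```

The `scores` array has one row per respondent, so it could be merged back into the case file and used alongside the other predictors of church going.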

As my ultimate intention is to explain church going (regular church goers
vs. all the rest), I am currently working with a discriminant analysis on
the original items, without factor analysing them first.

On 08/07/2011 17:15, Hector Maletta wrote:

> If the data are dichotomous, conventional PCA (SPSS FACTOR procedure) is
> exactly the same as categorical PCA (SPSS CATPCA procedure). The latter is
> required when the original data are multi-categorical variables (either
> nominal or ordinal), in order to generate (iteratively) optimal scaling
> values for the categories and a Principal Component Analysis of the
> resulting (interval level) variables.
>
> I wonder whether the fact that each respondent may choose up to three
> dichotomous variables has any influence on this. It depends, I surmise, on
> the way you want to treat those data.
> (a) you may treat each CHOICE as one case. In this fashion, there would be
> one case (one row in the dataset) for each combination of respondent and
> choice, with up to three (but not necessarily three) choices per respondent.
> In this case, my above advice works, although its analysis may require a
> two-level model to distinguish between intra- and inter- respondent effects.
> (b) you may treat each RESPONDENT as a case. In this option, you may have
> different COMBINATIONS of responses per respondent. The maximum number (all
> combinations of three out of 12) is probably much higher than the number of
> respondents in your sample, and thus only a small proportion of all
> combinations will show up. These observed combinations may be treated as a
> NOMINAL multi-category variable, with many values. For this kind of approach
> CATPCA would be appropriate, but I caution that the number of distinct
> combinations observed must not be large (with N respondents and M observed
> combinations, you have N-M-1 degrees of freedom, which may result in a
> fairly low number, thus invalidating the results in statistical terms.) If
> only a few response patterns are observed, and the number of respondents is
> comparatively very large, you'd be OK, but beware of too many choices and
> too few subjects.
>
> Hector
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of ftr
> Sent: Friday, July 08, 2011 11:13
> To: [hidden email]
> Subject: PCA for dichotomous data
>
> Hello,
>
> Eurobarometer 66.1 provides data on social values which I would like to
> use, with other influences, to explain church going.
> The item battery of social values provides 12 questions with yes/no
> answer alternatives. The respondent can choose up to three variables.
>
> What I need is a procedure like a PCA for dichotomous data, but I don't
> have access to CATPCA. I calculated proximities with the dice algorithm
> to correct for the high probability that neither of two items will be
> selected. I used PROXIMITIES to calculate the similarity of variables.
>
> PROXIMITIES v327 to v338
> /VIEW=VARIABLE
> /MEASURE= dice (1,0) .
>
> Once PROXIMITIES produces the matrix, can it be input as a correlation
> matrix into FACTOR? And how do I move from this variable-based analysis
> back to the case-based analysis?
>
> Is there a better alternative for getting a variable structure from
> dichotomous variables?
>
> TIA,
> F. Thomas
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
