Hello,
Eurobarometer 66.1 provides data on social values which I would like to use, together with other influences, to explain church going. The item battery on social values consists of 12 items with yes/no answer alternatives. The respondent can choose up to three of these items.

What I need is a procedure like a PCA for dichotomous data, but I don't have access to CATPCA. I calculated proximities with the Dice measure to correct for the high probability that neither of two items is selected. I used PROXIMITIES to calculate the similarity of the variables:

PROXIMITIES v327 to v338
  /VIEW=VARIABLE
  /MEASURE= dice (1,0) .

Once PROXIMITIES has produced the matrix, can it be read into FACTOR as a correlation matrix? And how do I move from this variable-based analysis back to a case-based analysis?

Is there a better alternative for getting a variable structure from dichotomous variables?

TIA,
F. Thomas

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
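To make the point about joint non-selection concrete, here is a minimal Python sketch of the Dice measure, 2a / (2a + b + c), next to simple matching. It is not SPSS syntax and not the PROXIMITIES implementation; the function names and the toy data are made up for illustration, and the value returned when neither item is ever chosen is an arbitrary convention.

```python
def dice(x, y):
    """Dice similarity 2a / (2a + b + c) for two equal-length 0/1 sequences."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # both items chosen
    mismatches = sum(1 for xi, yi in zip(x, y) if xi != yi)     # b + c
    denom = 2 * a + mismatches
    # Returning 0.0 when neither item is ever chosen is a convention of this sketch.
    return 2 * a / denom if denom else 0.0


def simple_matching(x, y):
    """Share of respondents with the same answer on both items, joint absences included."""
    return sum(1 for xi, yi in zip(x, y) if xi == yi) / len(x)


# Hypothetical toy data: one list per item, one entry per respondent.
item_a = [1, 1, 0, 0, 0, 0]
item_b = [1, 0, 0, 0, 0, 0]

# Add ten more respondents who chose neither item.
padded_a = item_a + [0] * 10
padded_b = item_b + [0] * 10

print(dice(item_a, item_b), dice(padded_a, padded_b))
print(simple_matching(item_a, item_b), simple_matching(padded_a, padded_b))
```

In this toy case the Dice value is unchanged by the extra respondents who chose neither item, while simple matching goes up, which is the property the question relies on.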
If the data are dichotomous, conventional PCA (SPSS FACTOR procedure) is exactly the same as categorical PCA (SPSS CATPCA procedure). The latter is required when the original data are multi-category variables (either nominal or ordinal), in order to generate (iteratively) optimal scaling values for the categories and then run a Principal Component Analysis on the resulting (interval-level) variables.

I wonder whether the fact that each respondent may choose up to three of the dichotomous items has any influence on this. It depends, I surmise, on how you want to treat those data.

(a) You may treat each CHOICE as one case. In this fashion, there would be one case (one row in the dataset) for each combination of respondent and choice, with up to three (but not necessarily three) choices per respondent. In this case my advice above works, although the analysis may require a two-level model to distinguish between intra- and inter-respondent effects.

(b) You may treat each RESPONDENT as a case. In this option, you may have different COMBINATIONS of responses per respondent. The maximum number (all combinations of three out of 12) is probably much higher than the number of respondents in your sample, and thus only a small proportion of all combinations will show up. These observed combinations may be treated as a NOMINAL multi-category variable with many values. For this kind of approach CATPCA would be appropriate, but I caution that the number of distinct combinations observed must not be large: with N respondents and M observed combinations you have N-M-1 degrees of freedom, which may turn out to be a fairly low number, thus invalidating the results in statistical terms. If only a few response patterns are observed, and the number of respondents is comparatively very large, you'd be OK, but beware of too many choices and too few subjects.

Hector

-----Original message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of ftr
Sent: Friday, July 08, 2011 11:13
To: [hidden email]
Subject: PCA for dichotomous data

[...]

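As a rough check on option (b), the number of possible response patterns and the N-M-1 figure can be counted directly. The Python sketch below assumes "up to three of 12" means zero to three selections, so the empty pattern is counted too; the helper names and the toy rows are hypothetical and not taken from the Eurobarometer file.

```python
from collections import Counter
from math import comb

N_ITEMS = 12      # items in the battery
MAX_CHOICES = 3   # a respondent may select up to this many


def possible_patterns(n_items=N_ITEMS, max_choices=MAX_CHOICES):
    """Number of distinct 0/1 patterns with at most max_choices ones."""
    return sum(comb(n_items, k) for k in range(max_choices + 1))


def row(*chosen):
    """Build one respondent's 0/1 pattern from the indices of the chosen items."""
    return tuple(1 if i in chosen else 0 for i in range(N_ITEMS))


# Hypothetical toy sample of six respondents.
toy_rows = [row(0, 1, 2), row(0, 1, 2), row(3), row(), row(3), row(0, 5)]

pattern_counts = Counter(toy_rows)   # observed pattern -> frequency
n_respondents = len(toy_rows)        # N
n_observed = len(pattern_counts)     # M
remaining_df = n_respondents - n_observed - 1

print(possible_patterns(), comb(N_ITEMS, MAX_CHOICES))
print(n_respondents, n_observed, remaining_df)
```

Under that assumption there are 299 possible patterns, 220 of them with exactly three selections. How that compares with N, and how many patterns actually occur, has to be checked on the real data.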
Thank you, Hector, for your answer.

The fact that respondents have only three choices for 12 items produces lots of cases with zero as entry. But the respondents did not say No, they just said nothing. It is a sort of logical missing value. Nevertheless, as a first attempt I tried a PCA with the dichotomous variables, but the large number of zero entries violated the conditions of the procedure, of course (and killed it).

In any case, CATPCA is not a solution, as I don't have access to this module.

As my ultimate intention is to explain church going (regular churchgoers vs. all the rest), I am currently working with a discriminant analysis on the original items, without having factor-analysed them first.

On 08/07/2011 17:15, Hector Maletta wrote:
> If the data are dichotomous, conventional PCA (SPSS FACTOR procedure) is
> exactly the same as categorical PCA (SPSS CATPCA procedure).
> [...]

You could use information obtained from a Rasch model to assess dimensionality. It should be possible to fit a Rasch model to binary variables (e.g., yes/no survey items) via the GENLINMIXED procedure in SPSS 19. This procedure does not require responses to all items from each person, but missing data are assumed to be missing at random (MAR). Whether the MAR assumption is tenable for your data is unclear to me.

Ryan

On Fri, Jul 8, 2011 at 12:53 PM, ftr <[hidden email]> wrote:
> The fact that respondents have only three choices for 12 items produces
> lots of cases with zero as entry. But the respondents did not say No,
> they just said nothing.
> [...]

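For readers who have not met the Rasch model: it says the probability of a "yes" depends only on the difference between a person parameter and an item parameter. The Python sketch below shows just that response function and a log-likelihood that skips unobserved responses, which is what the MAR assumption permits. It is not GENLINMIXED and does no estimation; the names theta, difficulties and the use of None for a missing response are illustrative assumptions.

```python
import math


def rasch_prob(theta, b):
    """P(response = 1) for person parameter theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def log_likelihood(responses, theta, difficulties):
    """Log-likelihood of one person's responses; None marks an unobserved item."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        if x is None:
            continue  # unobserved items drop out of the likelihood (assumes MAR)
        p = rasch_prob(theta, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


# Hypothetical example: three items, the second one unobserved.
difficulties = [0.0, 1.0, 0.0]
responses = [1, None, 0]

print(rasch_prob(0.0, 0.0))
print(log_likelihood(responses, 0.0, difficulties))
```

Whether an unselected item in this battery should be coded as 0 or as missing is exactly the open question in this thread; the sketch only shows what each coding would mean for the likelihood.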