|
Dear list mates,
I have derived six factors from an exploratory analysis using one sample. I want to compare the six factor scores from the sample they were derived from with those of another sample. Is there a way to do this? Because I selected cases when running the factor analysis, the factor scores are not computed for the remaining cases (my other sample). Thank you for your help. Kevin |
|
This can be done with CATPCA (Categories package), which is equivalent to PCA
if you choose the numeric scaling level for all variables. You can specify the unselected cases as supplementary cases. The solution will be computed for the selected cases, and the supplementary cases will be fitted into this solution. Anita van der Kooij Data Theory Group Leiden University -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of KEVIN MANNING Sent: 31 May 2007 14:35 To: [hidden email] Subject: Factor scores applied to a diffrent sample |
|
In addition to Anita's suggestion, it can also be done with
ordinary factor analysis. Factor scores are linear combinations of the observed variables, weighted by certain coefficients. The FACTOR command in SPSS produces a table with the component score coefficient matrix, i.e. the coefficients or weights to apply to the observed variables (standardized in z-score form) in order to obtain the factor scores of each case on the various factors. You can use these coefficients to compute factor scores for new cases with the COMPUTE command. Hector -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Kooij, A.J. van der Sent: 31 May 2007 09:50 To: [hidden email] Subject: Re: Factor scores applied to a diffrent sample |
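The weighted sum described in the post above can be written out as a formula. The notation is not from the original post and is only a sketch: x_ij is the raw value of case i on variable j, x-bar_j and s_j are the mean and standard deviation of variable j, w_jk is the coefficient of variable j for factor k as read from the coefficient table, and p is the number of variables.

```latex
z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j},
\qquad
F_{ik} = \sum_{j=1}^{p} w_{jk}\, z_{ij}
```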
|
Actually, there is a bit more to it when computing factor scores
yourself for cases that were not in the analysis: the standardization of the variables should be based on the cases included in the analysis. For the cases not in the analysis you should subtract the mean and divide by the standard deviation, using the mean and std. dev. of the cases in the analysis. Then compute the factor score by multiplying these values by the loadings. NB: the loadings are found in the Component Matrix table, NOT in the Factor Score Coefficient Matrix table. Factor scores are computed as the sum of the standardized variables multiplied by the loadings, AND the result is standardized. So, to know the mean and std. dev. with which to standardize the factor scores for the cases not in the analysis, you have to compute the "raw" factor scores yourself (i.e., scores before standardization) for the cases included in the analysis as well. To check your computations, standardize these raw scores using Descriptives and compare them with the scores saved by FACTOR (in the Factor menu choose Scores, Save as variables). Regards, Anita -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Hector Maletta Sent: 31 May 2007 15:17 To: [hidden email] Subject: Re: Factor scores applied to a diffrent sample |
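The steps in the post above can be sketched in a few lines of Python. Python is used here only for illustration; in SPSS the same steps would be COMPUTE statements. The data, the loadings, and all function names are hypothetical, the loadings stand in for one column of the Component Matrix table, and the sketch assumes the sample standard deviation (n - 1 denominator).

```python
from statistics import mean, stdev

def zscores(rows, means, sds):
    """Standardize each row with the supplied (analysis-sample) means and SDs."""
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def raw_scores(z_rows, loadings):
    """Raw factor score: sum of standardized variables times loadings."""
    return [sum(z * w for z, w in zip(row, loadings)) for row in z_rows]

# Hypothetical data: three variables; cases in the analysis and new cases.
analysis = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [3.0, 3.0, 5.0], [4.0, 5.0, 4.0]]
new_cases = [[2.5, 3.5, 3.25], [5.0, 1.0, 2.0]]
loadings = [0.8, 0.7, 0.6]  # hypothetical loadings on one factor

# Step 1: means and SDs come from the analysis sample only.
columns = list(zip(*analysis))
means = [mean(c) for c in columns]
sds = [stdev(c) for c in columns]

# Step 2: raw scores for the analysis sample give the mean and SD
# that are used to standardize everyone's factor scores.
raw_analysis = raw_scores(zscores(analysis, means, sds), loadings)
raw_mean, raw_sd = mean(raw_analysis), stdev(raw_analysis)

# Step 3: the same transformation is applied to the new cases.
raw_new = raw_scores(zscores(new_cases, means, sds), loadings)
scores_analysis = [(r - raw_mean) / raw_sd for r in raw_analysis]
scores_new = [(r - raw_mean) / raw_sd for r in raw_new]
```

Comparing `scores_analysis` with the scores saved by FACTOR for the same cases is the check suggested in the post. The new cases are never used to estimate a mean or a standard deviation.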
|
If you have items with a common response format that are intended to be
parts of scales, a conventional approach would use unit weights. A score is simply the sum or mean of the items that load cleanly on a factor above an arbitrary cutoff (e.g., .4). Items with negative loadings are reflected before summing. Art Kendall Social Research Consultants |
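Unit-weight scoring as described in the post above can be sketched as follows. Python is used only for illustration; the loadings, the responses, the 0 to 4 response range, and the function name are hypothetical. Reflection here means replacing a response x by (low + high) - x. The "load cleanly" requirement (no sizeable loadings on other factors) is not checked in this single-factor sketch.

```python
def unit_weight_score(responses, loadings, cutoff=0.4, low=0, high=4):
    """Mean of the items whose absolute loading is above the cutoff.

    Items with a negative loading are reflected before averaging.
    """
    kept = []
    for x, w in zip(responses, loadings):
        if abs(w) <= cutoff:
            continue  # item does not load strongly enough on this factor
        kept.append((low + high) - x if w < 0 else x)
    if not kept:
        raise ValueError("no item loads above the cutoff")
    return sum(kept) / len(kept)

# Hypothetical loadings for four items and one respondent's answers.
loadings = [0.72, -0.65, 0.10, 0.55]
person = [4, 0, 2, 3]
score = unit_weight_score(person, loadings)  # items 1, 2 (reflected), 4
```

Using the mean rather than the sum keeps the score on the original response scale, which makes scales with different numbers of items easier to compare.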
|
I have a data set which has a Likert scale from 0 to 4, where 0 is a category. I
have been asked to conduct a principal component analysis on the above data set. How should the 0 value be treated? Brian Cooper |
|
By category, do you mean that the zero score is labeled, say, "not at
all" or "strongly disagree"? If that's what you mean by category, then do your principal component analysis (PCA). If, on the other hand, the zero response means some nominal category such as "not applicable", then you'll need to do something different, like recoding the zero to missing. Then use "listwise" deletion of missing values and run your PCA. However, I'm sure there is a better strategy someone can recommend; I just don't know what it is. Best wishes, Edgar --- Discover Technologies 42020 Koppernick Rd. Suite 204 Canton, MI 48187 (734) 564-4964 (734) 468-0800 fax -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Brian Cooper Sent: Tuesday, July 31, 2007 8:24 AM To: [hidden email] Subject: Analysis of Likert Scale |
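If zero does turn out to be a nominal code such as "not applicable", the recode-then-listwise step from the post above could look like this sketch. Python is used only for illustration, with None standing in for a missing value; the data and the function names are hypothetical.

```python
def recode_zero_to_missing(rows):
    """Replace every 0 response with None (treated as missing)."""
    return [[None if x == 0 else x for x in row] for row in rows]

def listwise_complete(rows):
    """Keep only the cases that have no missing value on any item."""
    return [row for row in rows if all(x is not None for x in row)]

# Hypothetical responses of four cases on three items.
responses = [[1, 2, 3], [0, 4, 2], [3, 3, 0], [4, 1, 2]]
complete = listwise_complete(recode_zero_to_missing(responses))
```

Listwise deletion drops a whole case for a single missing item, so it is worth counting how many cases survive before running the PCA.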
|
Likert scales are ordinal scales with 5 levels or values. The
actual figures representing the levels (-2 to +2, or 0 to 4, or 1 to 5, or whatever) are irrelevant. More important is whether the distance between levels is (approximately) constant (i.e. the difference between 0 and 1 is similar to the difference between 1 and 2, or 2 and 3). If so, you can treat their values as an interval scale, which is usually done almost without thinking. If you have a reasonable feeling that this is so, you may simply consider the items, or the scale resulting from the summation of scores of the Likert-type items, as an interval scale, and apply to it any statistical procedure that requires interval-level measurement. Not exactly Kosher (or Halal), but widely done. Hector |
|
In reply to this post by Edgar F. Johns
Just in case my previous message is read as conveying the idea that
Likert items (and scales based on the summation of Likert-item scores) are legitimate interval-level measurements, let me clarify this: As I started by saying in my previous message, Likert items and Likert scales are ORDINAL measures, because the distances between categories of a Likert item are not necessarily given. Widespread use of such items as interval scales (by computing their means and standard deviations, for instance) does not change this a bit. However, a paper doing that is unlikely to be rejected for that reason in most journals. Regarding the "true" values of categories (or the true intervals between categories) in five-level Likert items, probably the best way to go is Optimal Scaling (the CATPCA procedure in SPSS), which computes optimal quantitative values for the (ordinal or nominal) categories of items, and also estimates underlying principal components or factors. These quantitative values for the categories of several Likert-type items, derived from their covariance with other similar items purporting to measure the same underlying trait, can also be used for computing a single summary measure also produced by CATPCA (e.g. the first factor scores). These factor scores are in fact summation scales; they are not a simple sum of the observed variable scores but a weighted sum, with weights determined by the loadings of every item on the factor and the eigenvalue of the factor. The results of this approach, however, are sample-dependent. They depend on the intercorrelation of items in your sample. For widely used scales (e.g. in Psychology) this could be done on large samples from reference populations and re-calibrated every so many years, to be used as a standard, as is usually done with IQ and other standardized measures. For your own questions in your own survey, however, the resulting values will depend on the strength and scope of your own sample, and the next guy (or girl) may find other values are more appropriate for his/her sample.
Hector -----Original Message----- From: Hector Maletta [mailto:[hidden email]] Sent: 31 July 2007 12:41 To: '[hidden email]' Subject: RE: Analysis of Likert Scale |
|
CATPCA treats a value of zero as missing, so if zero is not the code for a missing value you have to add 1 to your variables (Compute varnameplus1 = varname + 1.)
When analyzing Likert scale items with CATPCA at the ordinal scaling level, the results are often not much different from the results of treating the items as interval data (CATPCA at the numerical scaling level, or standard PCA with the FACTOR command). I advise doing both, CATPCA ordinal and standard PCA, then comparing the eigenvalues (percentage of variance accounted for, PVAF) and looking at the transformation plots from CATPCA ordinal. If the PVAF resulting from CATPCA ordinal is only slightly higher than with standard PCA and the transformation plots look close to linear, it is okay to treat the items as interval data. Also, I would like to clarify some remarks in Hector's message: "These quantitative values for the categories of several Likert-type items, derived from their covariance with other similar items ..." and "The results of this approach, however, are sample-dependent. They depend on the intercorrelation of items in your sample." The parameters of standard PCA (the loadings) are sample dependent; they depend on the covariances or correlations. The parameters of CATPCA (the loadings and the quantified values) are sample dependent too, depending on the relations between items, which are NOT the covariances or correlations if the optimal scaling level is not specified as numerical (with the numerical optimal scaling level the transformed data are simply the standardized variables). With non-numerical scaling levels, CATPCA estimates the parameters from the data itself, in contrast to standard PCA, where the loadings are estimated from measures derived from the data (covariances/correlations). After the CATPCA solution is found, correlations can be computed for the transformed (quantified) data; thus, these correlations result from the CATPCA analysis and are not used in the analysis.
The correlation matrix of the original variables (thus items treated as interval data) and the correlation matrix of the transformed variables are both output by CATPCA, and comparing them also gives an indication of "how far from" linear the relations between the items are (treating items as interval data not only implies assuming equal spacing between categories, but also assuming that relations between items are linear). Besides being sample dependent, the quantified values are also model dependent. That is, with CATPCA the quantified values are optimal for PCA. With, for example, CATREG (multiple regression with optimal scaling) the quantified values are optimal for regression, and thus, when using the same variables in a CATPCA and a CATREG analysis, the quantified variables will be different (with CATREG the parameters (betas and quantifications) depend on the relations between the independent variables and on the relations of the independent variables with the dependent variable). Regards, Anita van der Kooij Data Theory Group Leiden University ________________________________ From: SPSSX(r) Discussion on behalf of Hector Maletta Sent: Tue 31/07/2007 18:34 To: [hidden email] Subject: Re: Analysis of Likert Scale - disclaimer |
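The PVAF comparison suggested in the post above could be computed as in this sketch. Python is used only for illustration; the eigenvalues are made up, the function name is hypothetical, and the sketch assumes both analyses work with standardized variables, so that the total variance equals the number of items.

```python
def pvaf(eigenvalues, n_items):
    """Percentage of variance accounted for by the retained dimensions."""
    return 100.0 * sum(eigenvalues) / n_items

n_items = 10
standard_pca = [3.9, 1.6]    # made-up eigenvalues, two components
catpca_ordinal = [4.1, 1.7]  # made-up eigenvalues, two dimensions

# Gain in PVAF from letting CATPCA rescale the categories.
gain = pvaf(catpca_ordinal, n_items) - pvaf(standard_pca, n_items)
```

How small a gain counts as "only slightly higher" is a judgment call that the post leaves open; the transformation plots remain the second half of the check.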
|
In reply to this post by Brian Cooper
It is very common to anchor an extent scale at zero, where zero means
"none" or "almost none". It is rather unusual to rate agreement on a scale anchored at zero. What construct is the set of items measuring? What are the value labels for 0 through 4? Art Kendall Social Research Consultants |
|
Extent scales are often treated as interval. As long as you assign
values outside 0 to 4 for missing values, I would agree with Hector that you should try CATPCA and, since you are creating a scale and are only interested in the common variance, also a PFA type of factor analysis. Art Kendall Social Research Consultants
Brian Cooper wrote:
> Art,
> It is a survey of Australian Rehabilitation Counsellors and their perceived competence, where there are two themes in the questionnaire. One is the frequency of the activity and the other is the importance of the activity. For frequency, 0 = never and 4 = always; for importance, 0 = not important and 4 = extremely important. The questionnaire is badly designed. For a bunch of PhDs, I thought they would have had a clearer idea of instrument design.
>
> -----Original Message-----
> From: Art Kendall [mailto:[hidden email]] Sent: Saturday, 4 August 2007 11:20 PM To: Brian Cooper Cc: [hidden email] Subject: Re: Analysis of Likert Scale |
