Factor scores applied to a different sample

KEVIN MANNING
Dear list mates,

   I have derived six factors from an exploratory analysis using one sample.  I want to compare the six factor scores in the sample they were derived from with those in another sample.  Is there a way to do this?  Since I selected cases when running the factor analysis, the factor scores are not computed for the remaining cases (my other sample).  Thank you for your help.
Kevin

Re: Factor scores applied to a different sample

Kooij, A.J. van der
This can be done with CATPCA (Categories package), which is equal to PCA
if you choose numeric scaling level for all variables. You can specify
the unselected cases as supplementary cases. The solution will be
computed for the selected cases and the supplementary cases will be
fitted into this solution.

Anita van der Kooij
Data Theory Group
Leiden University

Re: Factor scores applied to a different sample

Hector Maletta
In addition to Anita's suggestion, it can also be done with
ordinary factor analysis. Factor scores are linear combinations of observed
variables, weighted by certain coefficients. The FACTOR command in SPSS
produces a table with the component score coefficient matrix, i.e. the
coefficients or weights to apply to the observed variables (standardized in
z-score form) in order to obtain the factor scores of each case for the
various factors. You can use the coefficients to compute factor scores for
new cases by way of the COMPUTE command.

Hector

Re: Factor scores applied to a different sample

Kooij, A.J. van der
Actually, there is a bit more to it when computing factor scores
yourself for cases that were not in the analysis: the standardization of
the variables should be based on the cases included in the analysis. For the
cases not in the analysis you should subtract the mean and divide by the
standard deviation, using the mean and std. dev. of the cases in the
analysis. Then compute the factor score by multiplying these values by the
loadings. NB: the loadings are found in the Component Matrix table, NOT
in the Factor Score Coefficient Matrix table.
Factor scores are computed as the sum of the standardized variables
multiplied by the loadings, AND the result is standardized. So, to know
the mean and std. dev. with which to standardize the factor scores for
the cases not in the analysis, you have to compute the "raw" factor
scores yourself (i.e., scores before standardization) for the cases
included in the analysis as well. You can standardize these raw scores
using Descriptives and, in the Factor menu, choose Scores, Save as
variables, to check your computations.
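
The procedure described above can be sketched outside SPSS roughly as follows. This is only an illustration in plain Python under stated assumptions: the data, the weights, and the helper names (column_stats, raw_scores) are all made up for the example, one factor is shown, and the sample standard deviation (n - 1) is assumed for both standardization steps.

```python
from statistics import mean, stdev

def column_stats(rows):
    """Per-variable mean and sample SD (n - 1), computed from the given cases."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [stdev(c) for c in cols]

def raw_scores(rows, means, sds, weights):
    """Weighted sum of z-scores for one factor, using the supplied means and SDs."""
    return [
        sum(w * (x - m) / s for x, m, s, w in zip(row, means, sds, weights))
        for row in rows
    ]

# Hypothetical data: three variables, five analysis cases, two new cases.
analysis = [[1, 2, 3], [2, 4, 4], [3, 5, 7], [4, 7, 8], [5, 9, 9]]
new_cases = [[3, 5, 6], [6, 10, 11]]
weights = [0.9, 0.8, 0.7]  # hypothetical weights for one factor, read from the output

# 1. Means and SDs come from the analysis sample only.
means, sds = column_stats(analysis)

# 2. Raw (unstandardized) scores for both groups, using the same means and SDs.
raw_analysis = raw_scores(analysis, means, sds, weights)
raw_new = raw_scores(new_cases, means, sds, weights)

# 3. Standardize both sets with the mean and SD of the analysis-sample raw scores.
raw_mean, raw_sd = mean(raw_analysis), stdev(raw_analysis)
scores_analysis = [(r - raw_mean) / raw_sd for r in raw_analysis]
scores_new = [(r - raw_mean) / raw_sd for r in raw_new]
```

The point of the sketch is that nothing is ever re-estimated from the new cases: their scores are expressed on the scale of the analysis sample, so the two samples can be compared directly.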

Regards,
Anita


Re: Factor scores applied to a different sample

Art Kendall-2
If you have items with a common response format that are intended to be
parts of scales, a conventional approach would use unit weights.
A score is simply the sum or mean of the items that load cleanly on a
factor above an arbitrary cutoff (e.g., .4).  Items with negative
loadings are reflected before summing.
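
A minimal sketch of such a unit-weighted score, in plain Python. The function name, the loadings, the responses, and the 1 to 5 response format are all hypothetical, and the sketch only applies the cutoff; it does not check for cross-loadings on other factors.

```python
def unit_weight_score(responses, loadings, cutoff=0.4, scale_min=1, scale_max=5):
    """Mean of the items whose |loading| >= cutoff; negative loaders are reflected."""
    kept = []
    for value, loading in zip(responses, loadings):
        if abs(loading) < cutoff:
            continue  # item does not load strongly enough on this factor
        if loading < 0:
            value = scale_min + scale_max - value  # reflect: 1<->5, 2<->4, 3 stays
        kept.append(value)
    return sum(kept) / len(kept)

loadings = [0.72, -0.65, 0.10, 0.55]  # hypothetical loadings on one factor
respondent = [4, 2, 5, 3]             # hypothetical answers on a 1-5 format
score = unit_weight_score(respondent, loadings)
```

Because every retained item gets the same weight, the same scoring rule can be applied unchanged to any other sample.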

Art Kendall
Social Research Consultants

Analysis of Likert Scale

Brian Cooper
I have a data set which has a Likert scale 0 to 4 where 0 is a category. I
have been asked to conduct a principal component analysis on the above data
set. How should the 0 value be treated?

Brian Cooper

Re: Analysis of Likert Scale

Edgar F. Johns
By category, do you mean that the zero score is categorized by say, "not at
all" or "strongly disagree?" If that's what you mean by category, then do
your principal component analysis (PCA).

If, on the other hand, the zero response means some nominal category such as
"not applicable" then you'll need to do something different - like recode
the zero to missing. Then use a "listwise" missing and run your PCA.
However, I'm sure there is a better strategy someone can recommend - I just
don't know what it is.
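
For illustration only, the recode-then-listwise idea looks like this in plain Python. The data and function names are hypothetical, and the sketch assumes that 0 really does code a nominal category such as "not applicable".

```python
def recode_zero_to_missing(rows):
    """Replace every 0 with None (missing), leaving other values untouched."""
    return [[None if v == 0 else v for v in row] for row in rows]

def listwise(rows):
    """Keep only the cases that have no missing value on any item."""
    return [row for row in rows if all(v is not None for v in row)]

data = [[1, 2, 3], [0, 4, 2], [3, 3, 0], [4, 1, 2]]  # hypothetical item responses
complete = listwise(recode_zero_to_missing(data))
```

Listwise deletion drops a whole case for a single missing item, so it is worth checking how many cases remain before running the PCA.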

Best wishes,
Edgar
---
Discover Technologies
42020 Koppernick Rd.
Suite 204
Canton, MI 48187
(734) 564-4964
(734) 468-0800 fax

Re: Analysis of Likert Scale

Hector Maletta
Likert scales are ordinal scales with 5 levels or values. The
actual figures representing the levels (-2 to +2, or 0 to 4, or 1 to 5 or
whatever) are irrelevant.
More important is whether the distance between levels is
(approximately) constant (i.e. the distance between 0 and 1 is similar to
the distance between 1 and 2, or 2 and 3). If so, you can treat their values
as an interval scale, which is usually done almost without thinking.
If you have a reasonable feeling that this is so, you may simply
consider the items, or the scale resulting from the summation of scores of
the Likert-type items, as an interval scale, and apply to it any statistical
procedure that requires interval level measurement. Not exactly Kosher (or
Halal), but widely done.

Hector

Re: Analysis of Likert Scale - disclaimer

Hector Maletta
In reply to this post by Edgar F. Johns
Just in case my previous message is read as conveying the idea that
Likert items (and scales based on the summation of Likert-item scores) are
legitimate interval level measurements, let me clarify this:
As I started by saying in my previous message, Likert items and
Likert scales are ORDINAL measures, because the distances between categories
of a Likert item are not necessarily given. Widespread use of such items as
interval scales (by computing their means and standard deviations, for
instance) does not change this a bit. However, a paper doing that is
unlikely to be rejected for that reason in most journals.
Regarding the "true" values of categories (or the true intervals
between categories) in Likert five-level items, probably the best way to go
is Optimal Scaling (the CATPCA procedure in SPSS), which computes optimal
quantitative values for the (ordinal or nominal) categories of items, and
also estimates underlying principal components or factors. These
quantitative values for the categories of several Likert-type items, derived
from their covariance with other similar items purporting to measure the
same underlying trait, can also be used for computing a single summary
measure also produced by CATPCA (e.g. the first factor scores). These factor
scores are in fact summation scales; they are not a simple sum of the
observed variable scores, but a weighted sum, with weights determined by the
loadings of every item on the factor, and the eigenvalue of the factor.
The results of this approach, however, are sample-dependent. They
depend on the intercorrelation of items in your sample. For widely used
scales (e.g. in Psychology) this could be done on large samples from
reference populations and re-calibrated every so many years, to be used as a
standard, as is usually done with IQ and other standardized measures, but for
your own questions in your own survey the resulting values will depend on
the strength and scope of your own sample, and the next guy (or girl) may
find other values are more appropriate for his/her sample.

Hector


Re: Analysis of Likert Scale - disclaimer

Kooij, A.J. van der
CATPCA treats a value of zero as missing, so if zero is not the code for a missing value you have to add 1 to your variables (Compute varnameplus1 = varname + 1.)
 
When analyzing Likert scale items with CATPCA at the ordinal scaling level, the results are often not much different from the results obtained when treating the items as interval data (CATPCA numerical scaling level or standard PCA with the FACTOR command). I advise doing both, CATPCA ordinal and standard PCA, then comparing the eigenvalues (percentage of variance accounted for, PVAF) and looking at the transformation plots from CATPCA ordinal. If the PVAF resulting from CATPCA ordinal is only slightly higher than with standard PCA and the transformation plots look close to linear, it is okay to treat the items as interval data.
 
Also, I would like to clarify some remarks in Hector's message below:
"These quantitative values for the categories of several Likert-type items, derived
from their covariance with other similar items ..."
and
"The results of this approach, however, are sample-dependent. They
depends on the inter correlation of items in your sample."

The parameters of standard PCA (the loadings) are sample dependent; they depend on the covariances or correlations. The parameters of CATPCA (the loadings and the quantified values) are also sample dependent: they depend on the relations between items, which are NOT the covariances or correlations if the optimal scaling level is not specified as numerical (with the numerical optimal scaling level the transformed data are simply the standardized variables). With non-numerical scaling levels, CATPCA estimates the parameters from the data itself, in contrast to standard PCA, where the loadings are estimated from measures derived from the data (covariances/correlations). After the CATPCA solution is found, correlations can be computed for the transformed (quantified) data; thus, these correlations result from the CATPCA analysis and are not used in the analysis. The correlation matrix of the original variables (thus items treated as interval data) and the correlation matrix of the transformed variables are both output by CATPCA, and comparing them also gives an indication of "how far from" linear the relations between the items are (treating items as interval data implies not only assuming equal spacing between categories, but also assuming that relations between items are linear).

Besides being sample dependent, the quantified values are also model dependent. That is, with CATPCA the quantified values are optimal for PCA. With, for example, CATREG (multiple regression with optimal scaling) the quantified values are optimal for regression. Thus, when using the same variables in a CATPCA and a CATREG analysis, the quantified variables will be different (with CATREG the parameters, i.e. the betas and quantifications, depend on the relations between the independent variables and on the relations of the independent variables with the dependent variable).

Regards,

Anita van der Kooij
Data Theory Group
Leiden University

Re: Analysis of Likert Scale

Art Kendall-2
In reply to this post by Brian Cooper
It is very common to anchor an extent scale at zero, where zero means
"none or almost none".  It is rather unusual to rate agreement on a scale
anchored at zero.
What construct is the set of items measuring?
What are the value labels for 0 through 4?

Art Kendall
Social Research Consultants



Re: Analysis of Likert Scale

Art Kendall-2
Extent scales are often treated as interval.  As long as you assign
values outside 0 to 4 for missing values, I would agree with Hector that
you should try CATPCA and, since you are creating a scale and are only
interested in the common variance, a PFA type of factor analysis.

Art Kendall
Social Research Consultants

Brian Cooper wrote:

> Art,
> It is a survey of Australian Rehabilitation Counsellors and their perceived
> competence, where there are two themes in the questionnaire. One is the
> frequency of the activity and the other is the importance of the activity.
> For frequency 0 = never and 4 = always, and for importance 0 = not important
> and 4 = extremely important. The questionnaire is badly designed. For a
> bunch of PhDs, I thought they would have had a clearer idea on instrument
> design.