SPSSX Discussion

Re: Factor scores or mean scores? comparing groups in spss

Art Kendall

Jun 03, 2014; 2:11pm

Re: Factor scores or mean scores? comparing groups in spss

I would use scores based on the scoring key. The scoring key describes the set of items that are to be used as measures of the construct underlying the factor.
Be sure you use PAF, so that you are using only the item variance that is common to the set (factor).
The varimax-rotated items that load highly enough, that load cleanly, and that make substantive sense are listed in a scoring key so you and others can see what you are doing.
[Others would be a QA reviewer, a person trying to help, someone who looks up archives, etc.]
example of scoring key:
compute factorname1 = mean.5(item1, Item23, ReversedItem24, item42, ReversedItem55, Item65, item71).
compute factorname2 = mean.4(item2, reversedItem4, Item23, ReversedItem51, Item59).
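
To illustrate what the .5 suffix in the scoring key does: mean.5 returns the mean of the non-missing items, but only when at least 5 of them are present; otherwise the score is missing. The following is a minimal Python sketch of that rule, not SPSS syntax. The function names, the 1-to-5 response scale, and the respondent data are assumptions made up for the illustration.

```python
from typing import Optional, Sequence


def mean_n(values: Sequence[Optional[float]], min_valid: int) -> Optional[float]:
    """Mean of the non-missing values, or None when fewer than min_valid are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


def reverse(score: float, low: int = 1, high: int = 5) -> float:
    """Reverse-key a response on a low..high scale (a 1..5 scale is assumed here)."""
    return low + high - score


# One hypothetical respondent on a seven-item scale; None marks a skipped item.
responses = [4, 5, reverse(2), None, reverse(1), 4, None]
print(mean_n(responses, 5))  # five valid answers, so a score is computed

# Only three valid answers: the rule withholds the score.
too_sparse = [4, None, None, 3, 2, None, None]
print(mean_n(too_sparse, 5))
```

The minimum count keeps a scale score from resting on only one or two answered items.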

In my opinion, YMMV,
If your response scales are dichotomies I would be leery of a scale with fewer than 10 or so items,
if your response scales have 5 values I would be leery of a scale with fewer than 4 items.

I would also run a Reliability on each scale and
(1) double check the scoring key
(2) drop any scale that does not have an alpha of at least .7.
(3) drop any item that decreases alpha by anything meaningful.
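
The alpha checks in steps (2) and (3) can be sketched outside SPSS as well. This is a minimal Python illustration of Cronbach's alpha and of "alpha if item deleted", assuming complete cases only. The item scores are invented for the example, with the fourth item deliberately unrelated to the other three.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in the same order."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]       # scale total per respondent
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def alpha_if_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Invented data: six respondents, four items on a 1..5 scale.
items = [
    [1, 2, 3, 4, 5, 4],
    [2, 2, 3, 5, 5, 4],
    [1, 3, 3, 4, 4, 5],
    [3, 1, 4, 2, 3, 2],   # made to be unrelated to the others
]

overall = cronbach_alpha(items)
print(f"alpha, all items: {overall:.3f}")
for position, value in enumerate(alpha_if_deleted(items), start=1):
    flag = "  <- alpha rises without this item" if value > overall else ""
    print(f"alpha without item {position}: {value:.3f}{flag}")
```

An item whose removal raises alpha noticeably is a candidate for dropping, subject to whether the item still makes substantive sense on the scale.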

I have been doing factor analysis and cluster analysis since 1972 and would never rely on a single method of cluster analysis.

On 6/2/2014 9:05 AM, Daria Redko wrote:

Thanks much, Art! I checked my analysis according to your remarks, I think I did it well.
Really needed to confirm if I can use mean scores ( mean(i1, i2 etc belonging to same factor)) rather than spss-computed <regression> scores for cluster analysis.

May I ask you something else since we're having this conversation?

Further in my cluster analysis, which is not turning out well, I started doubting the selection of my clustering variables. I'm afraid some variables (the factor scores mentioned above) are correlated; the values vary from 0.03 to 0.4 at p < 0.01 (by the way, the values are lower if I compare spss-computed factor scores). I know they generally shouldn't be, but don't know if there's an "acceptable" level of collinearity or what adjustments could be made... Could you please help with that?

Thanks a lot,

Daria

2014-06-01 15:30 GMT+02:00 Art Kendall <[hidden email]>:

How did you determine the number of factors to retain?

If you used the Kaiser criterion of eigenvalues greater than 1.00, go back and see how your eigenvalues compare to parallel analysis. There are parallel analysis macros for SPSS. This gives you a ballpark number of factors to retain. Then use the criteria Rich gave you to assign items to a scoring key. In addition to the quantitative criteria, also consider whether the items as a set can be thought of as measuring a common construct. Drop any item that does not make sense. Assign an item to only 1 factor.
Drop any factor that you cannot make sense of.

Art Kendall
Social Research Consultants

On 5/31/2014 6:59 PM, [hidden email] wrote:

All that I needed. Thanks a lot, Rich, very kind of you :)

On Sunday, June 1, 2014 12:46:30 AM UTC+2, Rich Ulrich wrote:

On Sat, 31 May 2014 14:55:27 -0700 (PDT), [hidden email] wrote:

Thanks much, Rich!

I would like to clarify something. When you say "factors that are means of the high-loading items", do you suggest a mean of several high-loading items?

say, one factor has 3 high-loading items, then the factor score= mean(i1,i2,i3)

or should I use one item with the highest loading, create a z-score variable and use it in further analysis as a "factor score"?

In my own area, clinical research in psychiatry, almost everyone uses the (b) factors that are means of the high-loading items from the rating scales.

score= mean(i1, i2, i3).

Clinical research. A questionnaire has a set of items that have been pre-tested, most often in earlier research (i.e. we crib from other scales). The items cover some theoretical universe.

Varimax rotation of the PFA produces a set of factors (we hope) for which each item has one loading over 0.35 or 0.40. If there are two somewhat-high loadings, then it is still fairly unambiguous which factor to identify it with if one of them is much higher than the other, like 0.70 vs 0.40. If the "membership" is still ambiguous, it makes sense to place the item according to the face validity of the assignment, based on the content of the items, since the Factor will be named in terms of the items it contains (most often, using the name of the one or two strongest items).

One benefit of using Factors is that the average of several items will have a higher expected reliability (and thus, validity) than a single item. If a single item is important enough to use as an independent variable (IV) by itself, I probably would not have included it in the factor analysis that I used to develop scales.
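
The reliability gain from averaging several items can be made concrete with the Spearman-Brown formula, which is one standard way to express it and is not named in the post itself. The sketch below is Python, assumes roughly parallel items (equal reliabilities and intercorrelations), and uses an invented single-item reliability of 0.30.

```python
def spearman_brown(single_item_reliability: float, k: int) -> float:
    """Predicted reliability of the mean of k parallel items."""
    r = single_item_reliability
    return k * r / (1 + (k - 1) * r)


# Assumed single-item reliability of 0.30, for illustration only.
for k in (1, 3, 5, 10):
    print(k, round(spearman_brown(0.30, k), 3))
```

Under these assumptions the predicted reliability rises with every added item, which is the sense in which a multi-item mean is expected to be more reliable than any one of its items.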

--

Rich Ulrich

Art Kendall
Social Research Consultants

Weinberg, Jerry

Jun 03, 2014; 2:13pm

Automatic reply: Factor scores or mean scores? comparing groups in spss

I will be on vacation and out of the country until March 30.

Bruce Weaver

Jun 03, 2014; 2:16pm

Re: Factor scores or mean scores? comparing groups in spss

Administrator

In reply to this post by Art Kendall

Art, was this meant for the usenet newsgroup?

https://groups.google.com/forum/#!topic/comp.soft-sys.stat.spss/R_QrVrdn3Xo

;-)

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).