SPSSX Discussion

calculating an index

Dirk Sebastian Friedrich

calculating an index

Dear members,

I have a general problem concerning the structure of SPSS.

What I want to do is calculate an index of the benefit of a product
group.

I have 5 questions, each asking about the benefit of a different product,
with 7 answer options per question (1=high benefit to 7=low benefit).

What I need to do now is add up all the answers per answer option across
the 5 products, so that I can calculate the mean (=my benefit index for
all the products).

That is exactly the step where I am failing. (Done the "usual" SPSS way,
the column for the calculation would be 5 times as long as before.)

I want to be able to analyse other answers from the survey against the
background of the index.

Does anyone know how I can calculate the index in SPSS and still be able
to conduct further analysis against the background of the index?

Thanks in advance!

Dirk

Hector Maletta

Re: calculating an index

Dirk,
I do not really see your difficulty. If you simply want to add the
score over the five questions, each with a 7-point scale ranging from 7 to 1
in a reverse scale (7 is low and 1 is high), your index might be the sum or
the average of the five questions, period. Examples:

COMPUTE INDEX=SUM(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).

I'd recommend the MEAN function, in case somebody skipped some of the
five questions. Also, you may specify (as a numeric suffix on the
function name) the minimum number of non-missing questions required
to compute the mean, writing for instance MEAN.3 if you require at least
three valid responses for the index to be computed.
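
To make the MEAN.n rule concrete, here is a minimal sketch in Python (not SPSS syntax). The function name mean_min_valid and the use of None for a missing answer are assumptions made purely for illustration; the sketch only mimics the rule described above: average the non-missing answers, and return missing when fewer than the required number are present.

```python
from typing import Optional, Sequence


def mean_min_valid(values: Sequence[Optional[float]],
                   min_valid: int = 1) -> Optional[float]:
    """Mean of the non-missing values, or None if too few are present.

    Hypothetical helper: None stands in for an SPSS missing value.
    """
    valid = [v for v in values if v is not None]
    # Require at least one valid value so the division is always defined.
    if len(valid) < max(min_valid, 1):
        return None
    return sum(valid) / len(valid)


# One respondent who skipped two of the five questions.
answers = [2, None, 5, None, 4]

print(mean_min_valid(answers, min_valid=3))  # three valid answers: a mean
print(mean_min_valid(answers, min_valid=4))  # fewer than four: missing
```

With min_valid=3 the index is the mean of the three answers given; with min_valid=4 the same respondent gets no index at all, which is the trade-off the suffix controls.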

Now, this kind of index is a very simple one, giving the same
weight to each of the five questions. You might use a more sophisticated
approach such as CATPCA (with input variables defined as ordinal), which is
included in the SPSS CATEGORIES module, to run a principal component
analysis for categorical variables, and use the score of the main
component as an index. Note that the simple index with COMPUTE assumes that
the scores in the questions are interval scales, which may not be right.
Assuming they are interval scales, you may also apply classical factor
analysis with the FACTOR command. These commands, CATPCA or FACTOR, give
each of the five questions a different weight and take their
inter-correlation into account.

Hector

Richard Ristow

Re: calculating an index

I'll leave your main question where Hector left it; you can post
whether that solves your problem or not. This is about one of Hector's
points, not to disagree but to expand on it a little.

At 12:20 PM 10/4/2007, Hector Maletta wrote:

>Your index might be the sum or the average of the five questions,
>period. Examples:
>
> COMPUTE INDEX=SUM(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
> COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
>
> I'd recommend the MEAN function, just in case somebody
> omitted responding some of the five questions.

Hector's right about preferring MEAN to SUM, but the reason may not be
immediately obvious.

Both MEAN and SUM deal with missing values by computing as though they
weren't there. If, say, QUESTION2 is missing, SUM gives the sum of
questions 1, 3, 4 and 5; MEAN gives the mean of the same questions.

Omitting question 2 immediately biases SUM downwards by the mean value
of QUESTION2. It biases MEAN only to the degree that the mean value of
QUESTION2 differs from the mean values of the other questions. That'll
certainly be less; if the mean values of the different questions are
similar, it may be quite small.
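
A small worked example may make the size of the two biases easier to see. This is a sketch in Python rather than SPSS syntax, and the answer values are invented for illustration: the same respondent is scored once with all five answers and once with QUESTION2 missing (None).

```python
# Hypothetical respondent: complete answers, and the same answers
# with QUESTION2 missing (None stands in for an SPSS missing value).
full = [3, 4, 3, 4, 3]
with_gap = [3, None, 3, 4, 3]


def valid(values):
    """Keep only the non-missing answers."""
    return [v for v in values if v is not None]


sum_full = sum(full)
sum_gap = sum(valid(with_gap))
mean_full = sum(full) / len(full)
mean_gap = sum(valid(with_gap)) / len(valid(with_gap))

# SUM drops by the whole value of the missing answer.
print("SUM shift: ", sum_full - sum_gap)
# MEAN moves only as far as QUESTION2 differed from the other answers.
print("MEAN shift:", round(mean_full - mean_gap, 4))
```

Here the sum falls by the full value of the omitted answer, while the mean moves only by the small amount by which that answer differed from the rest.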

Dirk Sebastian Friedrich

Re: calculating an index

Thanks Hector, thanks Richard,

unfortunately your proposed solution doesn't solve my problem. Maybe I
wasn't clear enough.

In easy words: I want to calculate the mean of more than one column!

What I did with
"COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5)."
was compute the mean per row. Afterwards I could have calculated the mean
of all the means to get an index. But this index wouldn't consider
left answers in the single questions and would give each mean per row
the same weight. That's exactly what I don't want to do.

There must be a way to get the mean of more than one column which then
respects the number of answers n.

Thanks for your help.

Dirk

btw: The index is more complex, with weights for the different answer
options. What I am really asking about is getting the mean of
the weights. But that wouldn't change anything about the method.

Spousta Jan

Re: calculating an index

If I understand correctly, it should be something like this:

* lets try to compute the mean of all valid responses bigband to hvymetal in the standard file GSS93 subset.
* The weights of variables are as follows:
bigband 1
blugrass 2
country 2
blues 1
musicals 3
classicl 1
folk 2
jazz 1
opera 1
rap 1
hvymetal 1
.
GET FILE='C:\Program Files\SPSS14\GSS93 subset.sav' /keep id bigband to hvymetal .
compute sumwgts = 0 /*sum of weights of valid responses in the row*/.
do repe x = bigband to hvymetal / w = 1 2 2 1 3 1 2 1 1 1 1.
- compute x = x * w.
- if not missing(x) sumwgts = sumwgts + w.
end repe.
* delete value labels .
val lab bigband to hvymetal .
compute sumansw = sum(bigband to hvymetal).
var lab sumwgts "Sum of weights of valid responses in the row" / sumansw "Sum of weighted answers in the row".
compute index = 1.

DATASET DECLARE aggregind.
AGGREGATE /OUTFILE='aggregind' /BREAK=index /sumwgtss = SUM(sumwgts) /sumansws = SUM(sumansw).
DATASET ACTIVATE aggregind.
compute index = sumansws / sumwgtss.
exe.

Of course it can be done in other ways, too. E.g. using Restructure.
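
The arithmetic behind the syntax above can be sketched in a few lines of Python (not SPSS). The function name, the toy data and the weights are all hypothetical; the point is only the formula: the index is the sum of weight times answer over every valid response in the file, divided by the sum of the weights of those same valid responses.

```python
from typing import List, Optional, Sequence


def weighted_pooled_mean(rows: Sequence[Sequence[Optional[float]]],
                         weights: Sequence[float]) -> Optional[float]:
    """Weighted mean over all valid answers in all rows.

    Hypothetical helper: None stands in for an SPSS missing value.
    """
    total_answers = 0.0   # corresponds to the aggregated sumansw
    total_weights = 0.0   # corresponds to the aggregated sumwgts
    for row in rows:
        for x, w in zip(row, weights):
            if x is not None:
                total_answers += x * w
                total_weights += w
    if total_weights == 0:
        return None
    return total_answers / total_weights


# Invented data: two cases, three variables with weights 1, 2 and 3.
rows: List[List[Optional[float]]] = [[1, None, 3], [2, 4, None]]
weights = [1, 2, 3]
print(weighted_pooled_mean(rows, weights))
```

Because a missing answer contributes neither to the numerator nor to the denominator, each case counts in proportion to the weight of the answers it actually gave.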

Have a nice weekend,

Jan

Richard Ristow

Re: calculating an index

Definitely see what Jan Spousta wrote, and see if that helps. If not, a
word about terminology.

At 04:34 AM 10/5/2007, Dirk Sebastian Friedrich wrote:

>In easy words: I want to calculate the mean of more than one column!

Can you confirm what you mean by a 'column'? That's not a standard SPSS
term. You may well mean that QUESTION1, QUESTION2, QUESTION3, etc., are
'columns', since that's how they look in the Data Editor; if that's not
what you mean, correct me. The SPSS term for these is 'variables'.

>What I did with:
>
>COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
>
>was computing the mean per row.

Right; that calculates the mean within each 'case' or 'record' (which
appears as a 'row' in the Data Editor); the means are in a new variable
('column') named "INDEX".

>Afterwards I could have calculated the mean of all the means for
>getting an index. But this index wouldn't consider left answers in the
>single questions and would give each mean per row the same weight.
>That's exactly what I don't want to do.

Does Jan's answer solve this for you? If not, I'm afraid I don't
understand you, here.

. By "left answers", do you mean answers that are omitted? How would
you want those considered differently, in the calculation?

. "Giving each mean per row the same weight" - what would you want,
instead?

>There must be a way to get the mean of more than one column which
>then respects the number of answers n.

What would you want to have it do? Would the value of the mean be
different, or do you want the weight to be the number of valid
responses?

If we aren't understanding you properly, see if this can help. If you
have this data,

CaseID QUES1 QUES2 QUES3 QUES4 QUES5

   001     .     4     1     7     4
   002     5     6     6     .     3
   003     7     .     3     5     .
   004     5     .     5     3     3

what would be the correct answer to the calculation you want?
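
For reference, here are two candidate answers for the four cases listed above, sketched in Python rather than SPSS syntax. Which of them (if either) is the calculation Dirk wants is exactly the open question; the variable names are illustrative and None stands in for a missing value.

```python
# The four test cases from the listing; None marks a missing answer.
rows = [
    [None, 4, 1, 7, 4],     # case 001
    [5, 6, 6, None, 3],     # case 002
    [7, None, 3, 5, None],  # case 003
    [5, None, 5, 3, 3],     # case 004
]

valid_rows = [[v for v in row if v is not None] for row in rows]

# Candidate 1: mean of the per-case means (every case weighted equally).
row_means = [sum(r) / len(r) for r in valid_rows]
mean_of_row_means = sum(row_means) / len(row_means)

# Candidate 2: pooled mean over all valid answers, which is the same as
# weighting each case mean by its number of valid responses.
n_valid = sum(len(r) for r in valid_rows)
pooled_mean = sum(sum(r) for r in valid_rows) / n_valid

print("per-case means:    ", row_means)
print("mean of case means:", mean_of_row_means)
print("pooled mean:       ", round(pooled_mean, 4), "over", n_valid, "answers")
```

The two figures differ only because case 003 has three valid answers while the other cases have four; with complete data they would coincide.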

==================================
APPENDIX: Generating the test data
==================================
* ................. Test data ..................... .
SET RNG = MT /* 'Mersenne twister' random number generator */ .
SET MTINDEX = 2809 /* Providence, RI telephone book */ .

INPUT PROGRAM.
.  NUMERIC CaseID (N3).
.  LOOP CaseID = 1 TO 04.
.     VECTOR QUES(5,F2).
.     LOOP #IDX = 1 TO 5.
.        IF RV.BERNOULLI(0.75) /* 75% of questions are answered */
            QUES(#IDX) = TRUNC(RV.UNIFORM(1,8)).
.     END LOOP.
.     END CASE.
.  END LOOP.
END FILE.
END INPUT PROGRAM.

LIST.