|
Dear members,
I've got a general problem concerning the structure of SPSS. What I want to do is calculate an index of the benefit of a product group. I've got 5 questions, each asking about the benefit of a different product, with 7 answer options per question (1 = high benefit to 7 = low benefit). What I need to do now is add up all answers per answer option across the 5 products, so that I can calculate the mean (= my benefit index for all the products). That's exactly the step where I am failing. (Done the "usual" SPSS way, the column used for the calculation would be 5 times as long as before.) I want to be able to analyse other answers of the survey against the background of the index. Does anyone know how I can calculate the index in SPSS and still be able to conduct further analysis against the background of the index? Thanks in advance! Dirk |
|
Dirk,
I do not really see your difficulty. If you simply want to add up the scores over the five questions, each on a 7-point reversed scale (7 is low and 1 is high), your index might be the sum or the average of the five questions, period. Examples:

   COMPUTE INDEX=SUM(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
   COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).

I'd recommend the MEAN function, just in case somebody skipped some of the five questions. Also, you may specify (as a suffix to the function name) the minimum number of non-missing questions required to compute the mean, writing for instance MEAN.3 if you require at least three valid responses for the index to be computed.

Now, this kind of index is a very simple one, giving the same weight to each of the five questions. You might use a more sophisticated approach such as CATPCA (with input variables defined as ordinal), which is included in the SPSS CATEGORIES module, to run a principal component analysis for categorical variables and use the score on the main component as an index. Note that the simple index with COMPUTE assumes that the question scores are interval scales, which may not be right. Assuming they are interval scales, you may also apply classical factor analysis with the FACTOR command. These commands, CATPCA or FACTOR, give each of the five questions a different weight and take their inter-correlation into account.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Dirk Sebastian Friedrich
Sent: 04 October 2007 12:11
To: [hidden email]
Subject: calculating an index |
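To make the MEAN.3 rule concrete outside SPSS, here is a minimal Python sketch of the same arithmetic. It is only an illustration: the helper name `mean_min_valid` and the two sample rows are hypothetical, and `None` stands in for a missing answer.

```python
def mean_min_valid(values, min_valid=1):
    """Mean of the non-missing values, or None when fewer than
    min_valid answers are present (mimicking the MEAN.n idea)."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)

# Hypothetical respondents; None marks a skipped question.
row_a = [2, None, 5, 3, None]      # three valid answers
row_b = [2, None, None, 6, None]   # only two valid answers

index_a = mean_min_valid(row_a, min_valid=3)  # mean of 2, 5 and 3
index_b = mean_min_valid(row_b, min_valid=3)  # too few answers, stays missing
```

With a threshold of three, the first respondent gets an index and the second does not, which is the behaviour Hector describes for MEAN.3.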
|
I'll leave your main question where Hector left it; you can post whether that solves your problem or not. This is about one of Hector's points, not to disagree but to expand on it a little.

At 12:20 PM 10/4/2007, Hector Maletta wrote:

>Your index might be the sum or the average of the five questions,
>period. Examples:
>
>   COMPUTE INDEX=SUM(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
>   COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
>
>I'd recommend the MEAN function, just in case somebody
>omitted responding some of the five questions.

Hector's right about preferring MEAN to SUM, but the reason may not be immediately obvious. Both MEAN and SUM deal with missing values by computing as though they weren't there. If, say, QUESTION2 is missing, SUM gives the sum of questions 1, 3, 4 and 5; MEAN gives the mean of the same questions.

Omitting question 2 immediately biases SUM downwards by the mean value of QUESTION2. It biases MEAN only to the degree that the mean value of QUESTION2 differs from the mean values of the other questions. That'll certainly be less; if the mean values of the different questions are similar, it may be quite small. |
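The size of the two effects can be checked with a small Python sketch. The answers below are made up for a single respondent, and the helper names are hypothetical; the point is only to show how differently the sum and the mean react when one answer goes missing.

```python
def valid_sum(values):
    """Sum of the non-missing values (None marks a missing answer)."""
    return sum(v for v in values if v is not None)

def valid_mean(values):
    """Mean of the non-missing values."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid)

# Hypothetical respondent: once with all answers, once with question 2 missing.
complete = [3, 4, 2, 5, 3]
partial = [3, None, 2, 5, 3]

sum_shift = valid_sum(complete) - valid_sum(partial)     # the whole missing answer
mean_shift = valid_mean(complete) - valid_mean(partial)  # a much smaller change
```

Here the sum drops by the full value of the missing answer, while the mean moves only because that answer differed from the average of the others.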
|
In reply to this post by Dirk Sebastian Friedrich
Thanks Hector, thanks Richard,
unfortunately your proposed solution doesn't solve my problem. Maybe I haven't been clear enough. In easy words: I want to calculate the mean of more than one column!

What I did with

   COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).

was computing the mean per row. Afterwards I could have calculated the mean of all the means to get an index. But this index wouldn't consider left answers in the single questions and would give each mean per row the same weight. That's exactly what I don't want to do. There must be a way to get the mean of more than one column which then respects the number of answers n.

Thanks for your help.
Dirk

btw: The index is more complex, with weights for the different answer options. What I am really asking you about is getting the mean of the weights. But that wouldn't change anything about the method. |
|
If I understand it well, it should be something like this:

* Let's compute the mean of all valid responses, bigband to hvymetal, in the standard file GSS93 subset.
* Variable weights: bigband 1 blugrass 2 country 2 blues 1 musicals 3 classicl 1 folk 2 jazz 1 opera 1 rap 1 hvymetal 1 .
GET FILE='C:\Program Files\SPSS14\GSS93 subset.sav'
  /KEEP id bigband TO hvymetal .

* Multiply each answer by its weight and add up the weights of the valid answers.
COMPUTE sumwgts = 0 /*sum of weights of valid responses in the row*/.
DO REPEAT x = bigband TO hvymetal / w = 1 2 2 1 3 1 2 1 1 1 1.
- COMPUTE x = x * w.
- IF NOT MISSING(x) sumwgts = sumwgts + w.
END REPEAT.

* Delete value labels, which no longer match the weighted values .
VALUE LABELS bigband TO hvymetal .

COMPUTE sumansw = SUM(bigband TO hvymetal).
VARIABLE LABELS sumwgts "Sum of weights of valid responses in the row"
  / sumansw "Sum of weighted answers in the row".

* Constant break variable, so that AGGREGATE returns one row for the whole file.
COMPUTE index = 1.
DATASET DECLARE aggregind.
AGGREGATE /OUTFILE='aggregind'
  /BREAK=index
  /sumwgtss = SUM(sumwgts)
  /sumansws = SUM(sumansw).
DATASET ACTIVATE aggregind.

* Overall index: weighted answers divided by the weights of all valid answers.
COMPUTE index = sumansws / sumwgtss.
EXECUTE.

Of course it can be done in other ways, too, e.g. using Restructure.

Have a nice weekend,
Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Dirk Sebastian Friedrich
Sent: Friday, October 05, 2007 10:35 AM
To: [hidden email]
Subject: Re: calculating an index |
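The arithmetic behind Jan's syntax can be restated as a short Python sketch: every valid answer contributes its weighted value to the numerator and its weight to the denominator, pooled over all rows. This is a sketch under stated assumptions, not a translation of the SPSS job; the function name, the three weights and the sample rows are hypothetical, and `None` marks a missing answer.

```python
def weighted_pooled_mean(rows, weights):
    """Sum of weight * answer over all valid answers in all rows,
    divided by the sum of the weights of those same answers."""
    total_answers = 0.0
    total_weights = 0.0
    for row in rows:
        for value, weight in zip(row, weights):
            if value is None:
                continue  # a missing answer adds nothing to either total
            total_answers += value * weight
            total_weights += weight
    if total_weights == 0:
        raise ValueError("no valid answers to average")
    return total_answers / total_weights

# Hypothetical data: three variables with weights 1, 2 and 3.
weights = [1, 2, 3]
rows = [
    [4, None, 2],
    [1, 3, None],
    [None, 5, 5],
]
weighted_index = weighted_pooled_mean(rows, weights)
```

Because the division happens once, after pooling, a row with more valid answers automatically carries more weight in the result, which appears to be what Dirk is asking for.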
|
In reply to this post by Dirk Sebastian Friedrich
Definitely see what Jan Spousta wrote, and see if that helps. If not, a word about terminology.

At 04:34 AM 10/5/2007, Dirk Sebastian Friedrich wrote:

>In easy words: I want to calculate the mean of more than one column!

Can you confirm what you mean by a 'column'? That's not a standard SPSS term. You may well mean that QUESTION1, QUESTION2, QUESTION3, etc., are 'columns', since that's how they look in the Data Editor; if that's not what you mean, correct me. The SPSS term for these is 'variables'.

>What I did with:
>
>COMPUTE INDEX=MEAN(QUESTION1,QUESTION2,QUESTION3,QUESTION4,QUESTION5).
>
>was computing the mean per row.

Right; that calculates the mean within each 'case' or 'record' (which appears as a 'row' in the Data Editor); the means are in a new variable ('column') named "INDEX".

>Afterwards I could have calculated the mean of all the means for
>getting an index. But this index wouldn't consider left answers in the
>single questions and would give each mean per row the same weight.
>That's exactly what I don't want to do.

Does Jan's answer solve this for you? If not, I'm afraid I don't understand you, here.

. By "left answers", do you mean answers that are omitted? How would you want those considered differently, in the calculation?
. "Giving each mean per row the same weight" - what would you want, instead?

>There must be a way to get the mean of more than one column which
>then respects the number of answers n.

What would you want to have it do? Would the value of the mean be different, or do you want the weight to be the number of valid responses?

If we aren't understanding you properly, see if this can help. If you have this data,

CaseID QUES1 QUES2 QUES3 QUES4 QUES5
   001     .     4     1     7     4
   002     5     6     6     .     3
   003     7     .     3     5     .
   004     5     .     5     3     3

what would be the correct answer to the calculation you want?

==================================
APPENDIX: Generating the test data
==================================
*  ................. Test data ..................... .
SET RNG     = MT   /* 'Mersenne twister' random number generator */ .
SET MTINDEX = 2809 /*  Providence, RI telephone book             */ .
INPUT PROGRAM.
.  NUMERIC CaseID (N3).
.  LOOP CaseID = 1 TO 04.
.     VECTOR QUES(5,F2).
.     LOOP #IDX = 1 TO 5.
.        IF RV.BERNOULLI(0.75) /* 75% of questions are answered */
            QUES(#IDX) = TRUNC(RV.UNIFORM(1,8)).
.     END LOOP.
.     END CASE.
.  END LOOP.
END FILE.
END INPUT PROGRAM.
LIST. |
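For the four test cases listed in the post above, the two readings of "the mean" can be worked out with a minimal Python sketch. It does not settle which one Dirk wants; it only shows that they give different numbers. The variable names are hypothetical, and `None` replaces the missing-value dot.

```python
# The four cases from the listing above; None marks a missing answer.
cases = {
    "001": [None, 4, 1, 7, 4],
    "002": [5, 6, 6, None, 3],
    "003": [7, None, 3, 5, None],
    "004": [5, None, 5, 3, 3],
}

def valid(values):
    """Keep only the answers that are present."""
    return [v for v in values if v is not None]

# Reading 1: mean per case, then the mean of those means (each case counts once).
row_means = {cid: sum(valid(v)) / len(valid(v)) for cid, v in cases.items()}
mean_of_row_means = sum(row_means.values()) / len(row_means)

# Reading 2: pool every valid answer, so each answer counts once.
all_answers = [a for v in cases.values() for a in valid(v)]
pooled_mean = sum(all_answers) / len(all_answers)

print(f"mean of row means: {mean_of_row_means:.4f}")
print(f"pooled mean:       {pooled_mean:.4f}")
```

The first reading gives every case the same weight regardless of how many questions it answered; the second weights each case by its number of valid responses, since case 003 contributes three answers and the others four.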
