Use computed means or computed sums in regression?

Use computed means or computed sums in regression?

lauratilston
Hi there

I'm currently trying to analyse some results for my dissertation but I'm struggling a bit! I have three variables which are all scores from three questionnaires I administered.

I computed both the mean and sum scores for each of the questionnaires using the SUM(V1,V2...) and MEAN(V1,V2...) in the compute variable box. I'm not sure whether I should be using the sum scores or the mean scores in my regression.

I've run separate regressions for the sum and mean scores and I am getting a different set of results every time, which confuses me because I thought the means and sums would amount to the same thing. One of my variables is significant when I use mean scores, yet non-significant when I use sum scores. I just don't know what to do!

Will it have anything to do with missing responses? If one person has missed out one question but filled everything else out, should I get rid of them?

Thanks

Laura

Re: Use computed means or computed sums in regression?

John F Hall
Laura

Post the syntax you used.  Missing values can cause resulting variables to
differ, as can variations in your COMPUTE statements.  See my tutorials
http://surveyresearch.weebly.com/uploads/2/9/9/8/2998485/2.3.1.1a__data_transformations.pdf
and the ones listed on page
http://surveyresearch.weebly.com/-3.5-derived-variables-count-and-compute.html

John Hall

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/spss-without-tears.html

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
lauratilston
Sent: 20 June 2013 17:06
To: [hidden email]
Subject: Use computed means or computed sums in regression?

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Use-computed-means-or-computed-sums-in-regression-tp5720827.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

Re: Use computed means or computed sums in regression?

Andy W
In reply to this post by lauratilston
To answer your more basic question: yes, computed means and sums are exact linear transformations of one another, so for linear regression it should not make any difference which you use (just choose whichever offers the interpretation you prefer). If you have k variables, you can write the mean as:

mean = (1/k)*sum

You should be able to multiply the point estimates for the mean model by k to get the point estimates for the sum model (this assumes the dependent variable is the computed measure; if the computed measure is a predictor, its coefficient is divided by k instead). Standard errors for the coefficient estimates are scaled by the same factor, so interpretations of statistical significance should not change between models.

Given your limited description, the difference is most likely attributable to how you treat missing values. Sum scores don't make sense unless you impute the missing values somehow, whereas you could calculate the mean of the available items, which is equivalent to imputing each respondent's own mean on the answered items for the missing items in the sum score. Caveat emptor: you should familiarize yourself with missing data techniques.
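A minimal sketch of the scaling argument above, written in Python rather than SPSS and using made-up data with no missing values. The helper `simple_ols`, the item scores, and the predictor are all illustrative assumptions, not anything from the original analysis. It regresses the sum score and the mean score on the same predictor and checks that the slope and its standard error differ by the factor k while the t statistic is identical.

```python
import math

def simple_ols(x, y):
    """Return (slope, standard error, t statistic) for y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares, then the usual standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se

# Hypothetical complete data: 6 respondents, k = 4 items each.
items = [
    [1, 2, 2, 1],
    [2, 3, 2, 3],
    [3, 3, 4, 2],
    [4, 3, 5, 4],
    [4, 5, 5, 5],
    [2, 1, 3, 2],
]
predictor = [10, 14, 15, 21, 24, 12]
k = 4

sum_score = [sum(row) for row in items]
mean_score = [sum(row) / k for row in items]

b_sum, se_sum, t_sum = simple_ols(predictor, sum_score)
b_mean, se_mean, t_mean = simple_ols(predictor, mean_score)

print(math.isclose(b_sum, k * b_mean))    # slope scales by k
print(math.isclose(se_sum, k * se_mean))  # so does its standard error
print(math.isclose(t_sum, t_mean))        # t statistic is unchanged
```

With complete data all three checks hold, so a change in significance between the two models points to something other than the choice of sum versus mean, most plausibly the handling of missing items.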

I would be able to make more precise statements if you provided the syntax used to compute the means and sum scores (as John already mentioned).
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Use computed means or computed sums in regression?

Rich Ulrich
In reply to this post by lauratilston
[re-posted on June 21: does not show up in Nabble]
Missing?  The Sum and Mean are interchangeable in analyses when
Mean= Sum/N  with fixed N:  a linear transformation. 

Clearly, the Ns vary when you sprinkle in a few Missings, and use
the Average versus Sum of what is left.  So, they are not interchangeable
when you take Missing into account in different ways. 
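A small sketch of why the two scores stop being interchangeable, in Python with invented responses. `None` stands in for a missing item, and the helpers `sum_available` and `mean_available` are hypothetical names that mimic summing or averaging whatever items were answered; whether this matches a particular COMPUTE statement depends on the syntax actually used.

```python
def sum_available(row):
    """Sum of the answered items; None if nothing was answered."""
    vals = [v for v in row if v is not None]
    return sum(vals) if vals else None

def mean_available(row):
    """Mean of the answered items; None if nothing was answered."""
    vals = [v for v in row if v is not None]
    return sum(vals) / len(vals) if vals else None

k = 4  # number of items in this hypothetical scale
rows = [
    [4, 4, 4, 4],      # complete
    [4, 4, None, 4],   # one item missing
    [2, 3, 2, 1],      # complete
]

results = []
for row in rows:
    s = sum_available(row)
    m = mean_available(row)
    results.append((s, m, k * m))
    print(row, "sum:", s, "mean:", m, "k*mean:", k * m)
```

For the complete rows the sum equals k times the mean. For the row with a missing item the sum of what is left is lower than k times the mean, so the two respondents who answered 4 to everything they filled in get the same mean score but different sum scores.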

For a rating scale, the Mean item score is usually what you want
to analyze:  It is interpretable in terms of the scale labels (anchors).
It is comparable across several subscales of varying lengths.

The Mean *does*  assume that it is fair to treat all Missings as if
their averages are all the same.  (A Missing item with an especially
extreme mean for other subjects might deserve special treatment,
such as, recoding the Missing to some most justifiable value.)

A report of Purchases or Incomes might be better served by a Sum
than a Mean.   Is your latent dimension a total or an average?

For a Yes/No question, where you are counting attributes, it *can*
be more appropriate to use the count of either Zeroes or Ones:  not
necessarily the count of 1s, which would be the Sum.  I'm thinking in
particular of a Dementia scale where subjects are asked to do things,
and the traditional score is the sum performed, 30 or 31 max.  But a
patient who is blind or bedridden has Not Applicable as responses to
particular items.  This is a scale that I analyzed as the count of reported
deficiencies, the sum of the Zeroes.
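To illustrate the counting point with a toy case (five invented items, not the actual 30-item scale), here is a Python sketch in which 1 means the task was performed, 0 means it was failed, and `None` marks Not Applicable. The function names are hypothetical.

```python
def count_performed(responses):
    """Number of items scored 1 (the traditional sum score)."""
    return sum(1 for r in responses if r == 1)

def count_deficits(responses):
    """Number of items scored 0; Not Applicable (None) is not a deficit."""
    return sum(1 for r in responses if r == 0)

patient_a = [1, 1, 0, 1, 0]        # attempted every item, failed two
patient_b = [1, 1, None, 1, None]  # two items not applicable

for name, resp in [("A", patient_a), ("B", patient_b)]:
    print(name, "performed:", count_performed(resp),
          "deficits:", count_deficits(resp))
```

Both patients have the same count of 1s, but only patient A has any observed deficits, which is the distinction the count of zeroes preserves.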

--
Rich Ulrich

