Computing Variables - Missing Data

Computing Variables - Missing Data

Courtney M. Cronley
I am having trouble computing a new variable due to missing data. I am trying to add up all of the values across 6 variables using the following syntax:

compute alcfri12_17 = sum(alcfri12+alcfri13+alcfri14+alcfri15+alcfri16+alcfri17).
execute.

My problem is that in the existing variables, I have cases with a -6 value, indicating missing data. For all of those cases, the syntax refuses to compute the new variables and leaves the cell blank. Does anyone have an idea about why this problem is occurring and how to fix it?

Thanks in advance for any advice one can offer.

Courtney Cronley, Ph.D.
Postdoctoral Fellow
Center of Alcohol Studies
Rutgers University
[hidden email]

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Computing Variables - Missing Data

Maguin, Eugene
Courtney,

Read up on the SUM function in the syntax reference. As I read it, the
default value for the .n parameter is 1, so a sum is computed so long as
at least one variable has a valid value. Thus, I'd guess that you must have
cases where all variables are missing, either user-missing or
system-missing. If you haven't done this already, it might be a good idea
to do a missing values analysis for the variables involved in that sum.

My preference is to use the mean function rather than the sum because the
mean will have the same theoretical range as component variables for all
cases whereas the theoretical range of sum for a case depends on number of
variables with valid data for that case.
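
One quick way to run that kind of check, as a minimal sketch: count the valid items per case with NVALID and tabulate the counts. The names nvalid_alcfri and alcfri_mean are hypothetical, and the sketch assumes -6 has already been declared user-missing on all six items.

```spss
* Count how many of the six items have a valid value in each case.
COMPUTE nvalid_alcfri = NVALID(alcfri12, alcfri13, alcfri14, alcfri15, alcfri16, alcfri17).

* A count of 0 marks a case where neither SUM nor MEAN can return a value.
FREQUENCIES VARIABLES=nvalid_alcfri.

* The mean stays within the range of the items however many are valid.
COMPUTE alcfri_mean = MEAN(alcfri12, alcfri13, alcfri14, alcfri15, alcfri16, alcfri17).
EXECUTE.
```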

Gene Maguin



Re: Computing Variables - Missing Data

Daniel J. Robertson
In reply to this post by Courtney M. Cronley
Try separating your SUM() arguments with a comma instead of '+', e.g.,

compute alcfri12_17 = sum(alcfri12, alcfri13, alcfri14, alcfri15, alcfri16, alcfri17) .

Addition in SPSS using SUM() does indeed ignore missing values, whereas an expression built with '+' does not: SPSS returns a system-missing result if any of the variables are missing. SUM() arguments must be separated by commas but, interestingly, each argument can itself be a more complex expression, as in SUM(Var1+Var2, Var3-Var4, Var5*Var6). By separating your variables inside the SUM() with '+', you were wrapping a single addition expression inside a SUM() function, which is why system-missing values were returned when any variable was missing.
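
The difference can be seen on a toy dataset. This is only an illustrative sketch: the variables v1 to v3 and the computed names are made up, and -6 is assumed to be the declared user-missing code.

```spss
* Toy data: v2 holds the missing code -6 in the second case.
DATA LIST FREE / v1 v2 v3.
BEGIN DATA
1 2 3
1 -6 3
END DATA.
MISSING VALUES v1 v2 v3 (-6).

* Plain addition: system-missing whenever any operand is missing.
COMPUTE plus_sum = v1 + v2 + v3.
* Comma-separated arguments: missing values are skipped.
COMPUTE comma_sum = SUM(v1, v2, v3).
* Mixed form: a missing v1 + v2 is skipped as a whole and only v3 is added.
COMPUTE mixed_sum = SUM(v1 + v2, v3).
LIST.
```

For the first case all three results are 6. For the second case plus_sum is system-missing, comma_sum is 4 and mixed_sum is 3.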

--
Daniel Robertson, Ph.D.
Senior Research and Planning Associate
Institutional Research and Planning
Cornell University / irp.cornell.edu


Re: Computing Variables - Missing Data

Lemon, John S.
In reply to this post by Courtney M. Cronley

Courtney

The reason you may be getting so many missing values is that, I believe, SPSS is interpreting what is between the brackets (alcfri12+alcfri13+alcfri14+alcfri15+alcfri16+alcfri17) as a single numeric expression. The definition of SUM is:

SUM(numexpr,numexpr[,..]). Numeric. Returns the sum of its arguments that have valid, nonmissing values. This function requires two or more arguments, which must be numeric. You can specify a minimum number of valid arguments for this function to be evaluated.

So it is treating (alcfri12+alcfri13+alcfri14+alcfri15+alcfri16+alcfri17) as a single numeric expression and adding them all together; if any one of them is missing, the result of the whole expression is set to missing. So why not try:

COMPUTE ALCFRI12_17 = SUM (alcfri12,alcfri13,alcfri14,alcfri15,alcfri16,alcfri17 ).

to see if that works.
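
If the six items happen to sit next to each other in the data file (an assumption about this dataset, not something stated in the thread), the TO keyword offers a shorter way to write the same comma-separated list. A minimal sketch:

```spss
* Assumption: alcfri12 through alcfri17 are adjacent in file order.
* TO expands to every variable stored between the two named variables.
COMPUTE alcfri12_17 = SUM(alcfri12 TO alcfri17).
EXECUTE.
```

If other variables are stored between alcfri12 and alcfri17, they would be summed as well, so the explicit list is the safer choice when the file order is not known.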

 

Best Wishes

John S. Lemon
DIT (Directorate of Information Technology) - Student Liaison Officer
University of Aberdeen
Edward Wright Building: Room G86a (please note new room)

Tel: +44 1224 273350
Fax: +44 1224 273372


Re: Computing Variables - Missing Data

Richard Ristow
In reply to this post by Courtney M. Cronley
At 12:27 PM 1/26/2010, Courtney M. Cronley wrote:

> I am trying to add up all of the values across 6 variables using the following syntax:
> compute alcfri12_17 = sum(alcfri12+alcfri13+alcfri14+alcfri15+alcfri16+alcfri17).
>
> My problem is that in the existing variables, I have cases with a -6 value, indicating missing data. For all of those cases, the syntax refuses to compute the new variables and leaves the cell blank. Does anyone have an idea about why this problem is occurring and how to fix it?

What do you want to have happen, in the instances where you're getting "blank cells", i.e. system-missing values?

Gene Maguin is, of course, right about the behavior of SUM: by default, it sums the values of all its arguments that aren't missing. If, as I (and Gene) assume, you've declared -6 as a user-missing value, any '-6' values will not be included in the sum -- did you, by any chance, want them to be included?
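
Whether -6 is currently declared as user-missing can be checked in the dictionary, and the declaration can be set either way. The following is a sketch under the assumption that -6 is the only missing-data code on these six items:

```spss
* Show the dictionary entries, including any declared missing values.
DISPLAY DICTIONARY
  /VARIABLES=alcfri12 alcfri13 alcfri14 alcfri15 alcfri16 alcfri17.

* Declare -6 as user-missing so that SUM skips it.
MISSING VALUES alcfri12 alcfri13 alcfri14 alcfri15 alcfri16 alcfri17 (-6).
COMPUTE alcfri12_17 = SUM(alcfri12, alcfri13, alcfri14, alcfri15, alcfri16, alcfri17).
EXECUTE.
```

Clearing the declaration with an empty pair of parentheses, as in MISSING VALUES alcfri12 (), would instead make every -6 count as an ordinary value and pull the sum down by 6 each time, which is rarely what is wanted for a missing-data code.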

-Good luck,
 Richard

Re: Computing Variables - Missing Data

Melissa Ives
Courtney,

It would help to know what the values of the items are and what the sum is intended to show.

If you use the following (note the .6 and the commas instead of '+'),
compute alcfri12_17 = sum.6(alcfri12,alcfri13,alcfri14,alcfri15,alcfri16,alcfri17).
it will compute a sum only for cases with valid answers to all 6 of the original items.

If you don't use the .6 (but keep the commas, as the syntax expects),
compute alcfri12_17 = sum(alcfri12,alcfri13,alcfri14,alcfri15,alcfri16,alcfri17).
it will compute a sum for any case with a valid answer to at least one original item. I'd be concerned that this would be misleading, depending on what the values are for your original items.

If you are calculating a scale (i.e. the items have a decent alpha) that needs to be in the same metric as the original and to be a sum rather than a mean, try something like this:
compute alcfri12_17 = Rnd(mean.3(alcfri12,alcfri13,alcfri14,alcfri15,alcfri16,alcfri17)*6).

This takes the mean of the items, provided there are at least 3 valid answers, multiplies it by the number of items in the scale (6), and rounds the result to return to the original metric (e.g. days, counts, etc.). In effect, you have replaced any missing items with the mean of that case's valid items and summed the result. AGAIN--ONLY do this IF the items are internally consistent (alpha>.7).
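
The three variants can be compared side by side on a toy dataset. This is an illustrative sketch only: the items i1 to i6 and the computed names are made up, and -6 is assumed to be the declared user-missing code.

```spss
* Toy data: case 1 is complete, case 2 has 3 valid items, case 3 has 2.
DATA LIST FREE / i1 i2 i3 i4 i5 i6.
BEGIN DATA
2 2 2 2 2 2
1 2 3 -6 -6 -6
4 -6 -6 -6 -6 5
END DATA.
MISSING VALUES i1 TO i6 (-6).

* Requires all 6 items to be valid.
COMPUTE strict_sum = SUM.6(i1, i2, i3, i4, i5, i6).
* Requires at least 1 valid item.
COMPUTE any_sum = SUM(i1, i2, i3, i4, i5, i6).
* Prorated sum: the mean of at least 3 valid items, scaled to 6 items.
COMPUTE prorated = RND(MEAN.3(i1, i2, i3, i4, i5, i6) * 6).
LIST.
```

Case 1 gives 12 on all three. Case 2 gives system-missing for strict_sum, 6 for any_sum and 12 for prorated (mean 2 times 6). Case 3 gives system-missing for strict_sum, 9 for any_sum and system-missing for prorated, because only 2 items are valid.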
Melissa
