Aggregating with missing data

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Aggregating with missing data

Marco Venus
Dear all,

I have a basic question that hopefully someone is willing to answer: when
using the aggregate (mean) function in SPSS, cells that contain missing data
become empty. Does anyone know how to solve this problem? Thus, a cell that
should contain the mean of multiple cells (one of which is empty/missing),
turns to zero because it contains one missing datum.

Regards,

Marco
Reply | Threaded
Open this post in threaded view
|

Re: Aggregating with missing data

Richard Ristow
At 03:30 AM 7/14/2007, Marco wrote:

>When using the aggregate (mean) function in SPSS, cells that contain
>missing data become empty. Thus, a cell that should contain the mean
>of multiple cells (one of which is empty/missing), turns to zero
>because it contains one missing datum.

I'm not sure what's happening to you, but you shouldn't be seeing what
you say you're seeing.

 From your description, it sounds like you're doing one of two things:
a) Using the MEAN function with command AGGREGATE to average over a set
of variables
b) Using the MEAN function in the transformation language to average
over a set of variables.

BOTH of those, however, ignore missing values when averaging, and take
the mean of the non-missing values; they don't make a value 0 because
there's a missing value in the list. (That would be a very dangerous
thing to do anyway.) So I'm not sure what's happening.

Could you post the syntax, some test data, what output you get, and
tell us what output you want?

.....................................
Here are demonstrations of averaging across cases with AGGREGATE, and
averaging across variables. It's SPSS 15 draft output (WRR-not saved
separately).
.....................................
Using AGGREGATE to average over cases:
List
|-----------------------------|---------------------------|
|Output Created               |25-JUL-2007 19:16:03       |
|-----------------------------|---------------------------|
[Aggregate]

Group Value

     1     1
     1     2
     1     3
     2     4
     2     5
     2     .
     2     7
     3     .
     3     9
     3    10


Number of cases read:  10    Number of cases listed:  10


AGGREGATE OUTFILE=*
   /BREAK=GROUP
   /Members 'Size of group' = NU
   /MEAN    'Mean of "value"' = MEAN(VALUE).

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |25-JUL-2007 19:16:03       |
|-----------------------------|---------------------------|
Group Members     MEAN

     1       3     2.00
     2       4     5.33
     3       3     9.50

Number of cases read:  3    Number of cases listed:  3
.....................................
Using MEAN to average over variables:
List
|-----------------------------|---------------------------|
|Output Created               |25-JUL-2007 19:16:04       |
|-----------------------------|---------------------------|
[Wide]

Group Members Value.1 Value.2 Value.3 Value.4

     1      3       1       2       3       .
     2      4       4       5       .       7
     3      3       .       9      10       .


Number of cases read:  3    Number of cases listed:  3


NUMERIC Mean (F6.2).
COMPUTE Mean = MEAN(Value.1 TO Value.4).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |25-JUL-2007 19:16:26       |
|-----------------------------|---------------------------|
[Wide]

Group Members Value.1 Value.2 Value.3 Value.4   Mean

     1      3       1       2       3       .    2.00
     2      4       4       5       .       7    5.33
     3      3       .       9      10       .    9.50

Number of cases read:  3    Number of cases listed:  3

===================
APPENDIX:  All code
===================
I keyed the test data into the Data Editor; however, it can be
recovered from the LIST output fairly easily. Here's all the code:

DATASET ACTIVATE TestData.
DATASET COPY     Aggregate.
DATASET ACTIVATE Aggregate WINDOW=FRONT.

LIST.

AGGREGATE OUTFILE=*
   /BREAK=GROUP
   /Members 'Size of group' = NU
   /MEAN    'Mean of "value"' = MEAN(VALUE).

LIST.

DATASET ACTIVATE TestData.
DATASET COPY     Wide.
DATASET ACTIVATE Wide      WINDOW=FRONT.

SORT CASES BY Group .
CASESTOVARS
  /ID = Group
  /GROUPBY = VARIABLE
  /COUNT = Members "Size of group" .

LIST.

NUMERIC Mean (F6.2).
COMPUTE Mean = MEAN(Value.1 TO Value.4).

LIST.