Computing the median value of a group of variables

Jennifer Thompson
Dear SPSS-ers,

Could anyone tell me if the compute function can be used to work out the
median value of a group of variables?  I can't seem to find the correct
command in the 'compute variables' window.  I have a datafile with just 42
cases but there are 4 sets of 100 variables that represent consecutive
reaction time responses.

Any suggestions will be gratefully received.

Many thanks,

Jennifer Thompson

Re: Computing the median value of a group of variables

Marta García-Granero
Hi Jennifer

You have to FLIP your dataset, then AGGREGATE the median to a new
file, flip back (or just reopen your original file) and MATCH both
files together.

If you need more explanation, send a sample of your data (not the 100
reaction times, just a few) as a text file (attachments are quite
restricted on this list, so you can't send them as a SAV file) and I'll
work out the syntax for you. What version of SPSS are you using?

--
Regards,
Dr. Marta García-Granero,PhD           mailto:[hidden email]
Statistician

---
"It is unwise to use a statistical procedure whose use one does
not understand. SPSS syntax guide cannot supply this knowledge, and it
is certainly no substitute for the basic understanding of statistics
and statistical thinking that is essential for the wise choice of
methods and the correct interpretation of their results".

(Adapted from WinPepi manual - I'm sure Joe Abrahmson will not mind)

Re: Computing the median value of a group of variables

Marta García-Granero
In reply to this post by Jennifer Thompson
Hi Jennifer

No need to tamper with your dataset using FLIPs, AGGREGATEs and other
nasty transformations.

I did not remember that I had answered this same question some time ago.
You can easily adapt this MATRIX code to your needs (even turning it
into a MACRO with the list of variable names and the output variable
name as arguments):

* Sample dataset (only 10 variables instead of 100) *.
DATA LIST LIST/v1 TO v10 (10 F8).
BEGIN DATA
1 1 3 5 1 6 3 7 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 7 8 1 4 3 9
END DATA.

* This variable is needed for correct matching later *.
COMPUTE id=$casenum.

MATRIX.
* Replace "V1 TO V10" with your 100 variable names *.
GET data /VAR=V1 TO v10.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- COMPUTE medians(i)=(sorted(i,k/2)+sorted(i,(1+k/2)))/2.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE. /* This execute is needed for next command *.
DELETE VARIABLES id.
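
For readers who do not use MATRIX, the sorting trick inside the loop can be sketched in Python. This is only an illustration, not a translation that SPSS will run: `grade` is a hypothetical helper standing in for GRADE (ranks 1 to k, with ties assumed to be broken by position), and, like the code above, the median formula is only valid for complete rows with an even number of values.

```python
def grade(row):
    """Rank each value 1..k; ties get consecutive ranks in order of position."""
    order = sorted(range(len(row)), key=lambda j: row[j])  # stable sort
    ranks = [0] * len(row)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


def row_median_even(row):
    """Median of a complete row that has an even number of values."""
    k = len(row)
    if k == 0 or k % 2 != 0:
        raise ValueError("this sketch assumes an even, nonzero number of values")
    ranks = grade(row)
    sorted_row = [None] * k
    for value, rank in zip(row, ranks):
        sorted_row[rank - 1] = value  # scatter step: sorted(rank) = value
    # Average the two middle values (positions k/2 and k/2 + 1, 1-based).
    return (sorted_row[k // 2 - 1] + sorted_row[k // 2]) / 2


sample = [
    [1, 1, 3, 5, 1, 6, 3, 7, 9, 5],
    [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    [4, 5, 3, 6, 7, 8, 1, 4, 3, 9],
]
medians = [row_median_even(row) for row in sample]
print(medians)  # [4.0, 4.5, 4.5]
```

The three rows are the sample data from the syntax above, so the printed values can be compared with the Medians variable that the MATRIX code saves.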


Re: Computing the median value of a group of variables

Jennifer Thompson
Hi Marta,

Many thanks for your help - I was relieved to find there was a way around
this that didn't involve flipping the dataset.

I tried your code, replacing the line:

GET data /VAR=V1 TO v10.

with:

GET data /VAR=b1_001 TO b1_100.  (as my variables are labelled)

but encountered errors when I ran the code (see below).  Do I need to
modify your code further?  I'm afraid I'm a novice and generally use the
drop-down menus rather than coding by hand, so I'm probably missing
something very obvious!

Best wishes,

Jennifer


Errors from SPSS output (first)
Run MATRIX procedure:
>Error encountered in source line #    43

>Error # 12555
>During execution of the GET statement, missing value has been encountered,
>but no MISSING subcommand is specified.
>This command not executed.



Re: Computing the median value of a group of variables

Marta García-Granero
Hi Jennifer

Ouch! After clicking "Send", I said to myself: "I HOPE she has no
missing data...".

Well, it looks like you do. This MATRIX code is quite simple and
can't handle missing data unless I modify it quite a lot.
Could you wait until tomorrow? (It's dinner time here in Spain.) I think
it will be quite easy to modify the code to take those nasty missing
data into account. Just one thing I need to know: what is the highest
value that can be found in your dataset? I need it because I'll replace
your missing data with a very high value not reached by any real data,
so that they sort to the highest positions. Then I need to reduce the
sample size by the number of missing values within each row, and
modify the median formula to allow for the fact that, after
discounting missing data, your effective sample size could be odd
(instead of 100, which is even). I believe 1E6 will be enough as the
user-missing value inside MATRIX.
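
The plan described above can be sketched in Python, purely as an illustration of the idea. The assumptions here are mine: `None` stands in for a missing value, `SENTINEL` plays the role of the 1E6 replacement, and the rows are made-up values rather than real reaction times.

```python
SENTINEL = 1e6  # assumed to exceed every real value in the data


def median_with_sentinel(row):
    """Median of the non-missing values in row; None marks a missing value."""
    n_valid = sum(1 for v in row if v is not None)
    if n_valid == 0:
        return None  # nothing to take a median of
    # Missing values become the sentinel, so sorting pushes them to the end.
    filled = sorted(SENTINEL if v is None else v for v in row)
    if n_valid % 2 == 0:  # even: average the two middle valid values
        return (filled[n_valid // 2 - 1] + filled[n_valid // 2]) / 2
    return filled[n_valid // 2]  # odd: the single middle valid value


print(median_with_sentinel([2.0, None, 0.5, 1.5]))             # 1.5
print(median_with_sentinel([2.0, None, 0.5, 1.5, 3.0, None]))  # 1.75
```

Because only the first `n_valid` sorted positions are ever read, the sentinel never enters the result; it just has to be larger than any real value.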

I promise I will do it tomorrow; I'm always tempted by challenging
problems.


Regards,
Marta

Re: Computing the median value of a group of variables

Richard Ristow
In reply to this post by Jennifer Thompson
At 11:06 AM 9/27/2006, Jennifer Thompson wrote:

>[How to] work out the median value of a group of variables?  I have a
>datafile with just 42 cases but there are 4 sets of 100 variables that
>represent consecutive reaction time responses.

OK if I weigh in? Marta suggested either

>You have to FLIP your dataset, then AGGREGATE the median to a new
>file, backflip (or just open again your original file) and MATCH both
>files together.

or

>No need to tamper with FLIPs AGGREGATEs and other nasty
>transformations. You can easily adapt this MATRIX code to your needs

I haven't looked at the MATRIX code. Marta's done wonders with MATRIX,
more than just about any of us, much more than I have.

But I might, myself, use the first approach, using "wide to long to
wide" logic (my terminology) - in this case, "wide to long to
AGGREGATE." (Hey, Marta - 'nasty' is in the eye of the beholder. You
know what I can do with the transformation language and AGGREGATE: real
power, and quite clean if used properly.)

For "wide to long", VARSTOCASES is almost always cleaner and more
reliable than is FLIP. You'll probably get the MATRIX code working
fine; if not (is this OK, Marta?), give us some test data and I'll give
it a go with VARSTOCASES logic.
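
To make the "wide to long to AGGREGATE" idea concrete, here is a rough Python sketch of the logic, not SPSS syntax. The data values are illustrative, `None` marks a missing response, and dropping missing responses during the wide-to-long step is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import median

# Wide layout: one record per case, one entry per variable (None = missing).
wide = {
    1: [1, 1, 3, 5, 1, 6, 3, None, 9, 5],
    2: [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    3: [4, 5, 3, 6, None, 8, 1, 4, 3, 9],
}

# Wide to long: one (id, value) record per non-missing response.
long_rows = [(case_id, v) for case_id, row in wide.items()
             for v in row if v is not None]

# Aggregate: collect the values for each id, then take each group's median.
groups = defaultdict(list)
for case_id, v in long_rows:
    groups[case_id].append(v)
medians = {case_id: median(vals) for case_id, vals in groups.items()}
print(medians)  # {1: 3, 2: 4.5, 3: 4}
```

The result has one record per case, so it can be matched back to the original wide file by id.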

-Cheers, and good luck to you,
  Richard

Re: Computing the median value of a group of variables

Marta García-Granero
In reply to this post by Jennifer Thompson
Hi again:

OK, dinner waited for 10 minutes. Try this modified code (checked by
flipping the dataset and running FREQUENCIES with the median statistic).

* Sample dataset *.
PRESERVE.
* Just to avoid the annoying warning concerning those missing data *.
SET ERRORS=NONE.
DATA LIST LIST/v1 TO v10 (10 F8).
BEGIN DATA
1 1 3 5 1 6 3 . 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 . 8 1 4 3 9
END DATA.
RESTORE.
COMPUTE id=$casenum.

* Important step! *.
COUNT nmiss = v1 TO v10 (SYSMIS)  .

MATRIX.
* Replace with your 100 variable names *.
GET data /VAR=V1 TO v10 /MISSING=ACCEPT /SYSMIS=1E6.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2). /* Median for even sample sizes *.
-  COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE. /* Median for odd samples *.
-  COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id nmiss.
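
The final MATCH FILES step is a key-based merge: each case picks up the median stored under the same id, and the helper id is then dropped. A minimal Python sketch of that logic follows, using made-up records; note that MATCH FILES expects both files to be sorted by the key, whereas the dictionary lookup used here does not.

```python
# Active dataset: one record per case, keyed by id (illustrative values).
cases = [
    {"id": 1, "v1": 1, "v2": 1},
    {"id": 2, "v1": 2, "v2": 3},
    {"id": 3, "v1": 4, "v2": 5},
]
# Saved medians file: one record per id.
medians = [
    {"id": 1, "Medians": 3.0},
    {"id": 2, "Medians": 4.5},
    {"id": 3, "Medians": 4.0},
]

by_id = {rec["id"]: rec["Medians"] for rec in medians}
merged = []
for case in cases:
    row = dict(case)                        # copy the original variables
    row["Medians"] = by_id.get(case["id"])  # None if no median for this id
    del row["id"]                           # the key was only a helper
    merged.append(row)
print(merged[0])  # {'v1': 1, 'v2': 1, 'Medians': 3.0}
```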



Re: Computing the median value of a group of variables

Marta García-Granero
In reply to this post by Richard Ristow
Hi Richard

RR> At 11:06 AM 9/27/2006, Jennifer Thompson wrote:

>>[How to] work out the median value of a group of variables?  I have a
>>datafile with just 42 cases but there are 4 sets of 100 variables that
>>represent consecutive reaction time responses.

RR> OK if I weigh in?

Sure, make yourself at home



RR> But I might, myself, use the first approach, using "wide to long to
RR> wide" logic (my terminology) - in this case, "wide to long to
RR> AGGREGATE." (Hey, Marta - 'nasty' is in the eye of the beholder.

Novice users would rather have a "black-box" approach, like using
MATRIX code that leaves the dataset untouched, than have to tamper with
the dataset's integrity.

RR> For "wide to long", VARSTOCASES is almost always cleaner and more
RR> reliable than is FLIP. You'll probably get the MATRIX code working
RR> fine; if not (is this OK, Marta?), give us some test data and I'll give
RR> it a go with VARSTOCASES logic.

Is this closer to what you were suggesting, Richard?

* Sample dataset *.
PRESERVE.
SET ERRORS=NONE.
DATA LIST LIST/v1 TO v10 (10 F8).
BEGIN DATA
1 1 3 5 1 6 3 . 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 . 8 1 4 3 9
END DATA.
RESTORE.

COMPUTE id=$casenum.

SAVE OUTFILE 'C:\Temp\OriginalFile.sav'.

VARSTOCASES
 /MAKE data FROM v1 TO v10
 /KEEP =  id
 /NULL = KEEP.
SORT CASES BY id .
SPLIT FILE LAYERED BY id .

OMS /SELECT TABLES
 /IF COMMANDS='Frequencies'
     SUBTYPES='Statistics'
 /DESTINATION FORMAT=SAV
              OUTFILE='C:\Temp\Medians.sav'.
FREQUENCIES
  VARIABLES=data
  /FORMAT=NOTABLE
  /STATISTICS=MEDIAN.
OMSEND.

* Let's clean the output dataset *.
GET FILE 'C:\Temp\Medians.sav'
 /DROP=Command_ TO Label_.
SELECT IF Var3="".
NUMERIC id (F8).
COMPUTE id=NUMBER(Var1,'F8').
RENAME VARIABLES (Var5=Medians).
FORMAT Medians (F8.2).
EXE. /* Needed *.
DELETE VARIABLES Var1 TO Var4.

SAVE OUTFILE 'C:\Temp\Medians.sav'.

GET FILE 'C:\Temp\OriginalFile.sav'.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id .


Best regards,
Marta

Re: Computing the median value of a group of variables

Peck, Jon
In reply to this post by Marta García-Granero
With SPSS 15, you have the ability, in essence, to add your own functions to SPSS transformations via the programmability functionality.  There are helper functions that are part of the Bonus Pack for early adopters that will become generally available in November.

A problem like the casewise median of several variables can be solved very easily with this mechanism.

First, here is a little Python function that calculates a median.  Its argument is a list of values.  First it screens out missing values; then it sorts and returns the middle element, or the average of the two middle elements if the number of nonmissing values is even.

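def median(lis):
    # Drop missing values (None), then sort what is left.
    lisnomv = [item for item in lis if item is not None]
    lisnomv.sort()
    s = len(lisnomv)
    if s == 0:
        return None   # every value was missing
    # Floor division keeps the indexes integral; for odd s both indexes
    # point at the same middle element.
    return (lisnomv[(s-1)//2] + lisnomv[s//2])/2.0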

It would then be used like this, as an example.

begin program.
import spss, trans

<insert the median def here>

t = trans.Tfunction()
t.append(median, "resultvar", "f", [<your list of variables>])
<as many other functions as you like>
t.execute()
end program.

This will loop over the cases and create a new variable that is the median of the variables listed for each case.
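
As a standalone check outside SPSS, the same middle-element logic can be compared with Python's standard-library `statistics.median`. This sketch assumes Python 3 and plain lists in which `None` marks a missing value; `casewise_median` is an illustrative name, the rows are example values, and nothing here uses the `spss` or `trans` modules.

```python
from statistics import median as stdlib_median


def casewise_median(values):
    """Median of the non-missing entries, or None if all are missing."""
    valid = sorted(v for v in values if v is not None)
    s = len(valid)
    if s == 0:
        return None
    # For odd s both indexes are the same middle element.
    return (valid[(s - 1) // 2] + valid[s // 2]) / 2.0


rows = [
    [1, 1, 3, 5, 1, 6, 3, None, 9, 5],
    [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    [4, 5, 3, 6, None, 8, 1, 4, 3, 9],
]
for row in rows:
    expected = stdlib_median([v for v in row if v is not None])
    assert casewise_median(row) == expected  # both methods agree
results = [casewise_median(row) for row in rows]
print(results)  # [3.0, 4.5, 4.0]
```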

Regards,
Jon Peck
SPSS








Re: Computing the median value of a group of variables

Jennifer Thompson
In reply to this post by Marta García-Granero
Hi Marta,

I tried the modified MATRIX code (see below) and encountered the following
error:

Run MATRIX procedure:
>Error encountered in source line #   160

>Error # 12354
>Subscript is out of range.
>This command not executed.

The medians.sav file contains median values only for the first 21 cases
(I've not checked these yet).

Any ideas?  I've copied the syntax as I ran it below.  Any number above 5
would be safe to use as a missing value.

Best wishes,

Jennifer

-------------------------------------------------------------------------------------------------

PRESERVE.
* Just the avoid the annoying warning concerning those missing data *.
SET ERRORS=NONE.
RESTORE.
COMPUTE id=$casenum.

* Important step! *.
COUNT nmiss = b1_001 TO b1_100 (SYSMIS)  .

MATRIX.
* Replace by your 100 variables names *.
GET data /VAR=b1_001 TO b1_100 /MISSING=ACCEPT /SYSMIS=1E6.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2). /* Median for even sample sizes *.
-  COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE. /* Median for odd samples *.
-  COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id nmiss.

On 9/27/06, Marta García-Granero <[hidden email]> wrote:

>
> Hi again:
>
> Ok, dinner waited for 10 minutes. Try this modified code (checked by
> flipping the dataset and asking for frequencies command with median).
>
> * Sample dataset *.
> PRESERVE.
> * Just the avoid the annoying warning concerning those missing data *.
> SET ERRORS=NONE.
> DATA LIST LIST/v1 TO v10 (10 F8).
> BEGIN DATA
> 1 1 3 5 1 6 3 . 9 5
> 2 3 1 5 7 4 9 7 8 3
> 4 5 3 6 . 8 1 4 3 9
> END DATA.
> RESTORE.
> COMPUTE id=$casenum.
>
> * Important step! *.
> COUNT nmiss = v1 TO v10 (SYSMIS)  .
>
> MATRIX.
> * Replace by your 100 variables names *.
> GET data /VAR=V1 TO v10 /MISSING=ACCEPT /SYSMIS=1E6.
> GET nmiss /VAR=nmiss.
> COMPUTE n=NROW(data).
> COMPUTE k=NCOL(data).
> COMPUTE validn=k-nmiss.
> COMPUTE ranked=MAKE(n,k,0).
> COMPUTE sorted=MAKE(n,k,0).
> COMPUTE medians=MAKE(n,1,0).
> LOOP i=1 TO n.
> - COMPUTE ranked(i,:)=GRADE(data(i,:)).
> - COMPUTE sorted(i,ranked(i,:))=data(i,:).
> - DO IF TRUNC(validn(i)/2) EQ (validn(i)/2). /* Median for even sample
> sizes *.
> -  COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
> - ELSE. /* Median for odd samples *.
> -  COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
> - END IF.
> END LOOP.
> COMPUTE id={T(1:n)}.
> COMPUTE namevec={'Medians','id'}.
> SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
> PRINT /TITLE='Medians have been computed and saved to
> C:\Temp\Medians.sav'.
> END MATRIX.
>
> MATCH FILES /FILE=*
> /FILE='C:\Temp\Medians.sav'
> /BY id.
> EXE.
> DELETE VARIABLES id nmiss.
>
>
> Wednesday, September 27, 2006, 8:13:05 PM, You wrote:
>
> JT> Hi Marta,
>
> JT> Many thanks for your help - I was relieved to find there was
> JT> a way around this that didn't involve flipping the dataset.
>
> JT> I tried your code, replacing the line:
>
> JT> GET data /VAR=V1 TO v10.
>
> JT> with:
>
> JT> GET data /VAR=b1_001 TO b1_100.  (as my variables are labelled)
>
> JT> but encountered errors when I ran the code (see below).  Do I
> JT> need to further modify your code?  I'm afraid I'm a novice and
> JT> generally use the drop down menus, etc rather than coding by hand
> JT> so I'm probably missing something very obvious!
>
> JT> Best wishes,
>
> JT> Jennifer
>
>
> JT> Errors from SPSS output (first)
> JT> Run MATRIX procedure:
> >>Error encountered in source line #    43
>
> >>Error # 12555
> >>During execution of the GET statement, missing value has been
> encountered,
> >>but no MISSING subcommand is specified.
> >>This command not executed.
>
>
> JT> On 9/27/06, Marta García-Granero <[hidden email]> wrote:Hi
> Jennifer
>
> JT> No need to tamper your dataset with FLIPs AGGREGATEs and other nasty
> JT> transformations.
>
> JT> I did not remember I had answered this same question some time ago.
> JT> You can easily adapt this MATRIX code to your needs (even turning it
> JT> to a MACRO with the list of variable names and output variable name as
> JT> arguments):
>
> JT> * Sample dataset (only 10 variables instead of 100) *.
> JT> DATA LIST LIST/v1 TO v10 (10 F8).
> JT> BEGIN DATA
> JT> 1 1 3 5 1 6 3 7 9 5
> JT> 2 3 1 5 7 4 9 7 8 3
> JT> 4 5 3 6 7 8 1 4 3 9
> JT> END DATA.
>
> JT> * This variable is needed for correct matching later *.
> JT> COMPUTE id=$casenum.
>
> JT> MATRIX.
> JT> * Replace "V1 TO V10" by your 100 variables names *.
> JT> GET data /VAR=V1 TO v10.
> JT> COMPUTE n=NROW(data).
> JT> COMPUTE k=NCOL(data).
> JT> COMPUTE ranked=MAKE(n,k,0).
> JT> COMPUTE sorted=MAKE(n,k,0).
> JT> COMPUTE medians=MAKE(n,1,0).
> JT> LOOP i=1 TO n.
> JT> - COMPUTE ranked(i,:)=GRADE(data(i,:)).
> JT> - COMPUTE sorted(i,ranked(i,:))=data(i,:).
> JT> - COMPUTE medians(i)=(sorted(i,k/2)+sorted(i,(1+k/2)))/2.
> JT> END LOOP.
> JT> COMPUTE id={T(1:n)}.
> JT> COMPUTE namevec={'Medians','id'}.
> JT> SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
> JT> PRINT /TITLE='Medians have been computed and saved to
> C:\Temp\Medians.sav'.
> JT> END MATRIX.
>
> JT> MATCH FILES /FILE=*
> JT>  /FILE='C:\Temp\Medians.sav'
> JT>  /BY id.
> JT> EXE. /* This execute is needed for next command *.
> JT> DELETE VARIABLES id.
>
> JT>> Could anyone tell me if the compute function can be used to work out
> the
> JT>> median value of a group of variables?  I can't seem to find the
> correct
> JT>> command in the 'compute variables' window.  I have a datafile with
> just 42
> JT>> cases but there are 4 sets of 100 variables that represent
> consecutive
> JT>> reaction time responses.
>

Re: Computing the median value of a group of variables

Marta García-Granero
Hi Jennifer

JT> I tried the modified MATRIX code (see below) and encountered the following
JT> error:

JT> Run MATRIX procedure:
>>Error encountered in source line #   160

>>Error # 12354
>>Subscript is out of range.
>>This command not executed.


I have created a fake dataset with a layout similar to yours
(42 cases, 100 variables with scattered missing data):
* Some fake data simulating your data set *.
INPUT PROGRAM.
NUMERIC case vars (f8).
LEAVE ALL.
- LOOP case = 1 TO 42.
-  LOOP vars = 1 TO 100.
-   COMPUTE score=UNIFORM(5).
-   END CASE.
-  END LOOP.
- END LOOP.
END FILE.
END INPUT PROGRAM.
* Now adding some scattered missing data *.
RECODE score (Lowest thru 0.75=SYSMIS)  .
* Reorganize it *.
SORT CASES BY case vars .
CASESTOVARS
 /ID = case
 /INDEX = vars
 /GROUPBY = VARIABLE .
EXECUTE .
DELETE VARIABLES case.
* Fake cases are ready *.

Now, try this modified version of the code (see the MXLOOPS I forgot
to add yesterday, low brain glucose levels were responsible for sure):


* SYNTAX FOR MEDIANS BEGINS HERE *.
COMPUTE id=$casenum.
* Replace score.1 TO score.100 by your own variables *.
COUNT nmiss = score.1 TO score.100 (SYSMIS)  .
PRESERVE.
SET MXLOOPS=500.
MATRIX.
* Replace score.1 TO score.100 by your own variables *.
GET data /VAR=score.1 TO score.100 /MISSING=ACCEPT /SYSMIS=1E3.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2).
-  COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE.
-  COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.
RESTORE.
MATCH FILES /FILE=*
  /FILE='C:\Temp\Medians.sav'
  /BY id.
 EXE.
 DELETE VARIABLES id nmiss.

* END OF SYNTAX *.

I've checked it with the fake data and it works.

Marta

Re: Computing the median value of a group of variables

Jennifer Thompson
Hi again Marta,

Now I'm really confused.  Your code works perfectly as written
(although I had to shorten the dummy variable name for SPSS 11.5), but
as soon as I modify it (see below) and run it on my data I get a
similar error, and the medians are worked out for the first 21 cases
only:

Run MATRIX procedure:

>Error encountered in source line # 27
>Error # 12354
>Subscript is out of range.
>This command not executed.

Perhaps it's a problem with the way my variables are labelled, or with
this version of SPSS (11.5)?  It's clearly not your code that's the
problem anyway.  I might just have to switch to Excel to work these
scores out, much as I hate to admit defeat...

Thanks for all your time and help.
Best wishes,
Jennifer



* SYNTAX FOR MEDIANS BEGINS HERE *.
COMPUTE id=$casenum.
* Replace score.1 TO score.100 by your own variables *.
COUNT nmiss = b1_001 TO b1_100 (SYSMIS) .
PRESERVE.
SET MXLOOPS=500.
MATRIX.
* Replace score.1 TO score.100 by your own variables *.
GET data /VAR=b1_001 TO b1_100 /MISSING=ACCEPT /SYSMIS=1E3.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2).
- COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE.
- COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.
RESTORE.
MATCH FILES /FILE=*
/FILE='C:\Temp\Medians.sav'
/BY id.
EXE.
DELETE VARIABLES id nmiss.

* END OF SYNTAX *.


Re: Computing the median value of a group of variables

Richard Ristow
In reply to this post by Marta García-Granero
I worked out the "wide to long" approach, using
only pretty vanilla SPSS - no OMS, even. You
probably don't want to change to this to solve
the practical problem, unless the MATRIX approach
gets badly bogged down; but, to illustrate what
you CAN do with the transformation language:

At 03:25 PM 9/27/2006, Marta García-Granero wrote:

>>RR> I might, myself, use the first approach,
>>using "wide to long to wide" logic (my
>>terminology) - in this case, "wide to long to AGGREGATE."
>>
>>RR> For "wide to long", VARSTOCASES is almost
>>always cleaner and more reliable than is FLIP.
>>Give us some test data and I'll give it a go with VARSTOCASES logic.

Which I did. And had more trouble than I
expected. It took me a while to calculate the
medians correctly; I found that much the toughest
part to get right. If you have MEDIAN function in
AGGREGATE (SPSS 14+) it's much easier.

>Is this closer to what you were suggesting, Richard?:

[Code omitted. Logic uses FREQUENCIES to compute
group medians, OMS to capture the values.]

Below are three solutions. They use test data
with variables besides the ones whose median is
to be computed, and keep those variables. The
first solution doesn't use AGGREGATE function
MEDIAN; it's the only one that will work if you
don't have SPSS 14. The second does use that
function, and is therefore simpler.

The first two both use MATCH FILES. Rather than using a
scratch file, I've MATCHED with the original data; that
requires the original data to be in order by CASEID.

The third solution is a tour de force; it does
*not* use MATCH FILES, but it does some pretty
fancy CASESTOVARS and AGGREGATE. It uses the
MEDIAN function of AGGREGATE. It could probably
be rewritten along the lines of the first, but
the code would get truly convoluted.

Enjoy it all!  -Richard

And, this is tested code; SPSS draft output:
............................................
* ......  Verify test data   .............................. .
NEW FILE.
GET FILE=TESTDATA.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:00       |
|-----------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
   \2006-09-27 García-Granero - Computing the
median value of a group of variables.SAV

CaseID TEXT   QUANTITY  v1  v2  v3  v4  v5  v6  v7  v8  v9 v10

C.1    Alpha    12.34    1   1   3   5   1   6   3   .   9   5
C.2    Beta      5.67    2   3   1   5   7   4   9   7   8   3
C.3    Gamma    98.76    4   5   3   6   .   8   1   4   3   9

Number of cases read:  3    Number of cases listed:  3


* ......  Unroll and compute median, using   .............. .
* ......  AGGREGATE without MEDIAN function  .............. .

VARSTOCASES
  /MAKE Value FROM v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
  /KEEP =  CaseID
  /NULL =  Drop
  /COUNT = N_Value "Number of valid values in record" .


Variables to Cases

Notes
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:00       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
   \2006-09-27 García-Granero - Computing the
median value of a group of variables.SAV

Generated Variables
|-------|---------------|
|Name   |Label          |
|-------|---------------|
|N_Value|Number of valid|
|       |values in      |
|       |record         |
|-------|---------------|
|Value  |<none>         |
|-------|---------------|

Processing Statistics
|-------------|--|
|Variables In |13|
|-------------|--|
|Variables Out|3 |
|-------------|--|


SORT CASES BY CaseID Value.
*  If the number of values is odd, the median is the middle .
*  value: rank (N+1)/2                                      .
*  If the number of values is even, the median is the mean  .
*  of the two middle values: N/2, N/2 + 1.                  .
*  Or, together: the median is the mean of all values with  .
*  rank between N/2 and N/2 + 1.                            .
*  ........................................................ .
*  RANK is kept, and M_VALUE is calculated separately from  .
*  VALUE, to allow clearer listing for debugging.           .

NUMERIC RANK(F3).
NUMERIC M_VALUE(F3).

DO IF   MISSING(CASEID).
.  COMPUTE RANK = 1.
ELSE IF CASEID NE LAG(CASEID).
.  COMPUTE RANK = 1.
ELSE.
.  COMPUTE RANK = LAG(RANK) + 1.
END IF.

COMPUTE M_VALUE = VALUE.
IF  (RANK LT ( N_VALUE/2      - .001)) M_VALUE = $SYSMIS.
IF  (RANK GT ((N_VALUE/2 + 1) + .001)) M_VALUE = $SYSMIS.

.  /**/ LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:00       |
|-----------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
   \2006-09-27 García-Granero - Computing the
median value of a group of variables.SAV

CaseID N_Value Value RANK M_VALUE

C.1         9     1     1     .
C.1         9     1     2     .
C.1         9     1     3     .
C.1         9     3     4     .
C.1         9     3     5     3
C.1         9     5     6     .
C.1         9     5     7     .
C.1         9     6     8     .
C.1         9     9     9     .
C.2        10     1     1     .
C.2        10     2     2     .
C.2        10     3     3     .
C.2        10     3     4     .
C.2        10     4     5     4
C.2        10     5     6     5
C.2        10     7     7     .
C.2        10     7     8     .
C.2        10     8     9     .
C.2        10     9    10     .
C.3         9     1     1     .
C.3         9     3     2     .
C.3         9     3     3     .
C.3         9     4     4     .
C.3         9     4     5     4
C.3         9     5     6     .
C.3         9     6     7     .
C.3         9     8     8     .
C.3         9     9     9     .

Number of cases read:  28    Number of cases listed:  28


AGGREGATE
   /OUTFILE=*
   /BREAK=CaseID
   /Median 'Median of v1 TO v10' = MEAN(M_Value).

FORMATS MEDIAN (F5.1).

MATCH FILES
    /FILE=TESTDATA
    /FILE=*
    /BY   CASEID
    /KEEP = CASEID Text Quantity Median ALL.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:00       |
|-----------------------------|---------------------------|
CaseID TEXT   QUANTITY Median  v1  v2  v3  v4  v5  v6  v7  v8  v9 v10

C.1    Alpha    12.34     3.0   1   1   3   5   1   6   3   .   9   5
C.2    Beta      5.67     4.5   2   3   1   5   7   4   9   7   8   3
C.3    Gamma    98.76     4.0   4   5   3   6   .   8   1   4   3   9


Number of cases read:  3    Number of cases listed:  3


* ......  With AGGREGATE MEDIAN function - SPSS 14+  ...... .
NEW FILE.
GET FILE=TESTDATA.

VARSTOCASES
  /MAKE Value FROM v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
  /KEEP =  CaseID
  /NULL = KEEP.


Variables to Cases
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:00       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
   \2006-09-27 García-Granero - Computing the
median value of a group of variables.SAV

Generated Variables
|-----|------|
|Name |Label |
|-----|------|
|Value|<none>|
|-----|------|

Processing Statistics
|-------------|--|
|Variables In |13|
|-------------|--|
|Variables Out|2 |
|-------------|--|


AGGREGATE
   /OUTFILE=*
   /BREAK=CaseID
   /Median 'Median of v1 TO v10' = MEDIAN(Value).

FORMATS MEDIAN (F5.1).

MATCH FILES
    /FILE=TESTDATA
    /FILE=*
    /BY   CASEID
    /KEEP = CASEID Text Quantity Median ALL.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:01       |
|-----------------------------|---------------------------|
CaseID TEXT   QUANTITY Median  v1  v2  v3  v4  v5  v6  v7  v8  v9 v10

C.1    Alpha    12.34     3.0   1   1   3   5   1   6   3   .   9   5
C.2    Beta      5.67     4.5   2   3   1   5   7   4   9   7   8   3
C.3    Gamma    98.76     4.0   4   5   3   6   .   8   1   4   3   9

Number of cases read:  3    Number of cases listed:  3


* ......  With AGGREGATE MEDIAN function - SPSS 14+  ...... .
* ......  Tour de force: No MATCH FILES              ...... .
NEW FILE.
GET FILE=TESTDATA.

VARSTOCASES  /MAKE Value FROM v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
  /KEEP =  CaseID  TEXT QUANTITY
  /NULL = KEEP.


Variables to Cases
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:01       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
   \2006-09-27 García-Granero - Computing the
median value of a group of variables.SAV

Generated Variables
|-----|------|
|Name |Label |
|-----|------|
|Value|<none>|
|-----|------|

Processing Statistics
|-------------|--|
|Variables In |13|
|-------------|--|
|Variables Out|4 |
|-------------|--|


AGGREGATE
   /OUTFILE=*   MODE=ADDVARIABLES
   /BREAK=CaseID
   /Median 'Median of v1 TO v10' = MEDIAN(Value).

FORMATS MEDIAN (F5.1).


CASESTOVARS
  /ID = CaseID
  /GROUPBY   = VARIABLE
  /AUTOFIX   = YES
  /RENAME      Value=V
  /SEPARATOR = "".

Cases to Variables
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:01       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
   \2006-09-27 García-Granero - Computing the
median value of a group of variables.SAV

Generated Variables
|----------|------|
|Original  |Result|
|Variabl   |------|
|e         |Name  |
|-------|--|------|
|Value  |1 |V1    |
|       |2 |V2    |
|       |3 |V3    |
|       |4 |V4    |
|       |5 |V5    |
|       |6 |V6    |
|       |7 |V7    |
|       |8 |V8    |
|       |9 |V9    |
|       |10|V10   |
|-------|--|------|

Processing Statistics
|---------------|----|
|Cases In       |30  |
|Cases Out      |3   |
|---------------|----|
|Cases In/Cases |10.0|
|Out            |    |
|---------------|----|
|Variables In   |5   |
|Variables Out  |14  |
|---------------|----|
|Index Values   |10  |
|---------------|----|


LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:21       |
|-----------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
   \2006-09-27 García-Granero - Computing the
median value of a group of variables.SAV

CaseID TEXT   QUANTITY Median  V1  V2  V3  V4  V5  V6  V7  V8  V9 V10

C.1    Alpha    12.34     3.0   1   1   3   5   1   6   3   .   9   5
C.2    Beta      5.67     4.5   2   3   1   5   7   4   9   7   8   3
C.3    Gamma    98.76     4.0   4   5   3   6   .   8   1   4   3   9

Number of cases read:  3    Number of cases listed:  3
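
For readers on a release whose transformation language includes a MEDIAN function, the same per-case medians can probably be had from a single COMPUTE. That availability is an assumption to check against your own version's function list; per this thread, 11.5 does not even have MEDIAN in AGGREGATE. A minimal sketch, retyping the three test cases above with -9 standing in for the missing values; the variable name med_v is made up for illustration:

```spss
* Sketch only: assumes your release has a MEDIAN transformation function .
DATA LIST FREE /v1 TO v10.
BEGIN DATA
1 1 3 5 1 6 3 -9 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 -9 8 1 4 3 9
END DATA.
* -9 is only a typing convenience here; turn it into system-missing .
RECODE v1 TO v10 (-9=SYSMIS).
* med_v is a hypothetical name; the function skips missing arguments .
COMPUTE med_v = MEDIAN(v1 TO v10).
FORMATS med_v (F5.1).
LIST.
```

If the function is available, med_v should agree with the Median column in the listings above, with no restructuring and no MATCH FILES.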

Re: Computing the median value of a group of variables

Marta García-Granero
Hi Richard

RR> Which I did. And had more trouble than I
RR> expected. It took me a while to calculate the
RR> medians correctly; I found that much the toughest
RR> part to get right. If you have MEDIAN function in
RR> AGGREGATE (SPSS 14+) it's much easier.

I'm using SPSS 13 and the MEDIAN function in AGGREGATE is already
there. I think it will be the best solution for Jennifer's problem,
although perhaps the problems that make my MATRIX code crash could
make yours follow the same path...

Warmest regards,
Marta

Re: Computing the median value of a group of variables

Marta García-Granero
In reply to this post by Jennifer Thompson
Hi Jennifer

This is mystifying...

A couple of questions before you abandon SPSS and go to Excel:

1) Are the 100 variables consecutive in the dataset, with no
extraneous variables between them? The keyword "TO" requires that the
variables be truly consecutive.

2) Are your variables numeric, or are they strings with numeric
content? MATRIX has a hard time handling strings.

Perhaps if you send me privately a sample of your data, as they are
right now, as a sav file, I can run my code (or Richard's excellent
solution that avoids the MEDIAN function in AGGREGATE, which your
version 11.5 doesn't have) and see what happens with your data.

JT> Now I'm really confused.  Your code works perfectly as written
JT> (although I had to shorten the dummy variable name for SPSS 11.5),
JT> but as soon as I modify it (see below) and run it on my data I get
JT> a similar error, and the medians are worked out for the first 21
JT> cases only:

JT> Perhaps it's a problem with the way my variables are labelled, or
JT> with this version of SPSS (11.5)?  It's clearly not your code
JT> that's the problem anyway.  I might just have to switch to Excel
JT> to work these scores out, much as I hate to admit defeat...

Labelling is absolutely ignored by MATRIX.

I hope we finally find a solution to your problem.

Marta

Re: Computing the median value of a group of variables

Jennifer Thompson
Hi Marta,

I think the mystery has been solved.  One of my cases has completely
missing data for the first set of 100 variables.  Once I'd deleted
that case the matrix code worked perfectly.  So, unsurprisingly, *I*
was the problem and feel suitably foolish.  I'd already tried running
your previous version of the code without that dodgy case yesterday
and it made no difference, so I didn't immediately think to try
without it earlier.

Thanks so much for your help.  I'm also going to try and work through
Richard Ristow's WideToLongToWide method for future reference.

Very best wishes,

Jennifer
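
A case with no valid values gives validn(i)=0, so the even branch asks for sorted(i,0), which would explain the "Subscript is out of range" error. One way to keep such cases in the file is to skip them inside the loop. The following is only a sketch on made-up data (x1 TO x4 and the -999 flag are illustrative assumptions, not part of Marta's syntax), and it prints the result instead of saving it:

```spss
* Sketch only: made-up data; the third case has no valid values .
DATA LIST FREE /x1 x2 x3 x4.
BEGIN DATA
3 1 2 4
5 -9 7 6
-9 -9 -9 -9
END DATA.
RECODE x1 TO x4 (-9=SYSMIS).
COUNT nmiss = x1 TO x4 (SYSMIS).
MATRIX.
* The SYSMIS placeholder must exceed every valid value so it sorts last .
GET data /VAR=x1 TO x4 /MISSING=ACCEPT /SYSMIS=1E3.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
* -999 is a hypothetical flag meaning no valid values in this case .
COMPUTE medians=MAKE(n,1,-999).
LOOP i=1 TO n.
* Only cases with at least one valid value are ranked and indexed .
- DO IF validn(i) GT 0.
-  COMPUTE ranked(i,:)=GRADE(data(i,:)).
-  COMPUTE sorted(i,ranked(i,:))=data(i,:).
-  DO IF TRUNC(validn(i)/2) EQ (validn(i)/2).
-   COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
-  ELSE.
-   COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
-  END IF.
- END IF.
END LOOP.
PRINT medians /TITLE='Row medians (-999 flags a case with no valid values)'.
END MATRIX.
```

In the full syntax the SAVE and MATCH FILES steps would stay as Marta wrote them, followed by something like RECODE Medians (-999=SYSMIS) so the flagged cases end up missing. Note also that with reaction times, a placeholder of 1E3 only sorts last if no valid response reaches 1000.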


