Dear SPSS-ers,
Could anyone tell me if the compute function can be used to work out the median value of a group of variables? I can't seem to find the correct command in the 'compute variables' window. I have a datafile with just 42 cases but there are 4 sets of 100 variables that represent consecutive reaction time responses. Any suggestions will be gratefully received.

Many thanks,
Jennifer Thompson
Hi Jennifer
You have to FLIP your dataset, then AGGREGATE the median to a new file, flip it back (or just reopen your original file) and MATCH both files together. If you need more explanation, send a sample of your data (not the 100 reaction times, just a few) as a text file (attachments are quite restricted on this list, so you can't send them as a SAV file) and I'll work out the syntax for you. Which version of SPSS are you using?

--
Regards,
Dr. Marta García-Granero, PhD  mailto:[hidden email]
Statistician

---
"It is unwise to use a statistical procedure whose use one does not understand. SPSS syntax guide cannot supply this knowledge, and it is certainly no substitute for the basic understanding of statistics and statistical thinking that is essential for the wise choice of methods and the correct interpretation of their results".

(Adapted from WinPepi manual - I'm sure Joe Abrahmson will not mind)
In reply to this post by Jennifer Thompson
Hi Jennifer
No need to tamper with your dataset using FLIPs, AGGREGATEs and other nasty transformations.

I had forgotten that I answered this same question some time ago. You can easily adapt this MATRIX code to your needs (even turning it into a MACRO with the list of variable names and the output variable name as arguments):

* Sample dataset (only 10 variables instead of 100) *.
DATA LIST LIST/v1 TO v10 (10 F8).
BEGIN DATA
1 1 3 5 1 6 3 7 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 7 8 1 4 3 9
END DATA.

* This variable is needed for correct matching later *.
COMPUTE id=$casenum.

MATRIX.
* Replace "V1 TO V10" by your 100 variable names *.
GET data /VAR=V1 TO v10.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- COMPUTE medians(i)=(sorted(i,k/2)+sorted(i,(1+k/2)))/2.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE. /* This execute is needed for next command *.
DELETE VARIABLES id.

--
Regards,
Dr. Marta García-Granero, PhD
Statistician
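For readers who want to check the logic outside SPSS, here is a minimal sketch in plain Python of what the MATRIX loop above does for one row: rank the values, use the ranks as positions to build a sorted copy, then average the two middle elements. The helper names `grade` and `row_median_even` are made up for this illustration, and the sketch assumes (as the MATRIX code relies on) that tied values receive distinct consecutive ranks. Like the MATRIX code, it only covers an even number of values with no missing data.

```python
def grade(row):
    """Return 1-based ranks for the values in row.

    Ties get distinct consecutive ranks (broken by position), which is
    the behaviour this sketch assumes for the MATRIX GRADE function.
    """
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0] * len(row)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def row_median_even(row):
    """Median of a row with an even number of values and no missing data."""
    k = len(row)
    ranks = grade(row)
    sorted_row = [None] * k
    # Same idea as sorted(i, ranked(i,:)) = data(i,:) in the MATRIX code:
    # each value is written to the position given by its rank.
    for value, rank in zip(row, ranks):
        sorted_row[rank - 1] = value
    # 1-based positions k/2 and k/2 + 1 are 0-based k//2 - 1 and k//2.
    return (sorted_row[k // 2 - 1] + sorted_row[k // 2]) / 2


# The three sample cases from the post.
rows = [
    [1, 1, 3, 5, 1, 6, 3, 7, 9, 5],
    [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    [4, 5, 3, 6, 7, 8, 1, 4, 3, 9],
]
medians = [row_median_even(r) for r in rows]
print(medians)  # [4.0, 4.5, 4.5]
```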
Hi Marta,
Many thanks for your help - I was relieved to find there was a way around this that didn't involve flipping the dataset. I tried your code, replacing the line:

GET data /VAR=V1 TO v10.

with:

GET data /VAR=b1_001 TO b1_100.

(as my variables are labelled) but encountered errors when I ran the code (see below). Do I need to modify your code further? I'm afraid I'm a novice and generally use the drop-down menus rather than coding by hand, so I'm probably missing something very obvious!

Best wishes,
Jennifer

Errors from SPSS output (first):

Run MATRIX procedure:
>Error encountered in source line # 43
>Error # 12555
>During execution of the GET statement, missing value has been encountered,
>but no MISSING subcommand is specified.
>This command not executed.
Hi Jennifer
Ouch! After clicking "Send", I said to myself: "I HOPE she has no missing data...". Well, it looks like you have. This MATRIX code is quite simple and can't handle missing data unless I modify it quite a lot. Could you wait until tomorrow? (It's dinner time here in Spain.) I think it will be quite easy to modify the code to take those nasty missing data into account.

Just one thing I need to know: what is the highest value that can be found in your dataset? I need it because I'll replace your missing data with a very high value not reached by any real data, so that the missing entries sort to the highest positions. Then I need to reduce the sample size by the number of missing values within each row, and modify the median formula to allow for the fact that, after discounting missing data, your effective sample size could be odd (instead of 100, which is even). I believe 1E6 will be enough as the replacement value inside MATRIX.

I promise I will do it tomorrow; I'm always tempted by challenging problems.

Regards,
Marta
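The plan described above can be sketched in plain Python as follows. This is only an illustration of the idea, not the MATRIX code itself: the function name `row_median_with_missing` is made up here, missing values are represented by `None`, and the sentinel of 1e6 is the value proposed in the message (it must be larger than any real observation). The early return for a row with no valid values is an addition of this sketch.

```python
SENTINEL = 1e6  # stand-in for missing data; must exceed every real value


def row_median_with_missing(row):
    """Median of one row, where None marks a missing value."""
    n_missing = sum(1 for v in row if v is None)
    valid_n = len(row) - n_missing
    if valid_n == 0:
        return None  # nothing to summarise (a guard added in this sketch)
    # Replacing missing values by the sentinel pushes them to the end
    # of the sorted row, so the first valid_n entries are the real data.
    filled = sorted(SENTINEL if v is None else v for v in row)
    if valid_n % 2 == 0:
        # Even sample size: average the two middle values.
        return (filled[valid_n // 2 - 1] + filled[valid_n // 2]) / 2
    # Odd sample size: take the single middle value.
    return filled[(valid_n + 1) // 2 - 1]


rows = [
    [1, 1, 3, 5, 1, 6, 3, None, 9, 5],
    [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    [4, 5, 3, 6, None, 8, 1, 4, 3, 9],
]
print([row_median_with_missing(r) for r in rows])  # [3, 4.5, 4]
```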
In reply to this post by Jennifer Thompson
At 11:06 AM 9/27/2006, Jennifer Thompson wrote:
>[How to] work out the median value of a group of variables? I have a
>datafile with just 42 cases but there are 4 sets of 100 variables that
>represent consecutive reaction time responses.

OK if I weigh in? Marta suggested either

>You have to FLIP your dataset, then AGGREGATE the median to a new
>file, backflip (or just open again your original file) and MATCH both
>files together.

or

>No need to tamper with FLIPs AGGREGATEs and other nasty
>transformations. You can easily adapt this MATRIX code to your needs

I haven't looked at the MATRIX code. Marta's done wonders with MATRIX, more than just about any of us, much more than I have. But I might, myself, use the first approach, using "wide to long to wide" logic (my terminology) - in this case, "wide to long to AGGREGATE." (Hey, Marta - 'nasty' is in the eye of the beholder. You know what I can do with the transformation language and AGGREGATE: real power, and quite clean if used properly.)

For "wide to long", VARSTOCASES is almost always cleaner and more reliable than is FLIP. You'll probably get the MATRIX code working fine; if not (is this OK, Marta?), give us some test data and I'll give it a go with VARSTOCASES logic.

-Cheers, and good luck to you,
Richard
In reply to this post by Jennifer Thompson
Hi again:
OK, dinner waited for 10 minutes. Try this modified code (checked by flipping the dataset and running the FREQUENCIES command with the median).

* Sample dataset *.
PRESERVE.
* Just to avoid the annoying warning concerning those missing data *.
SET ERRORS=NONE.
DATA LIST LIST/v1 TO v10 (10 F8).
BEGIN DATA
1 1 3 5 1 6 3 . 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 . 8 1 4 3 9
END DATA.
RESTORE.
COMPUTE id=$casenum.

* Important step! *.
COUNT nmiss = v1 TO v10 (SYSMIS) .

MATRIX.
* Replace by your 100 variable names *.
GET data /VAR=V1 TO v10 /MISSING=ACCEPT /SYSMIS=1E6.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2). /* Median for even sample sizes *.
- COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE. /* Median for odd samples *.
- COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id nmiss.

--
Regards,
Dr. Marta García-Granero, PhD
Statistician
In reply to this post by Richard Ristow
Hi Richard
RR> At 11:06 AM 9/27/2006, Jennifer Thompson wrote:
>>[How to] work out the median value of a group of variables? I have a
>>datafile with just 42 cases but there are 4 sets of 100 variables that
>>represent consecutive reaction time responses.

RR> OK if I weigh in?

Sure, make yourself at home.

RR> But I might, myself, use the first approach, using "wide to long to
RR> wide" logic (my terminology) - in this case, "wide to long to
RR> AGGREGATE." (Hey, Marta - 'nasty' is in the eye of the beholder.

Novice users would rather have a "black-box" approach, like a MATRIX program that leaves the dataset untouched, than have to tamper with the dataset's integrity.

RR> For "wide to long", VARSTOCASES is almost always cleaner and more
RR> reliable than is FLIP. You'll probably get the MATRIX code working
RR> fine; if not (is this OK, Marta?), give us some test data and I'll give
RR> it a go with VARSTOCASES logic.

Is this closer to what you were suggesting, Richard?

* Sample dataset *.
PRESERVE.
SET ERRORS=NONE.
DATA LIST LIST/v1 TO v10 (10 F8).
BEGIN DATA
1 1 3 5 1 6 3 . 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 . 8 1 4 3 9
END DATA.
RESTORE.

COMPUTE id=$casenum.
SAVE OUTFILE 'C:\Temp\OriginalFile.sav'.

VARSTOCASES /MAKE data FROM v1 TO v10
 /KEEP = id
 /NULL = KEEP.

SORT CASES BY id .
SPLIT FILE LAYERED BY id .

OMS /SELECT TABLES
 /IF COMMANDS='Frequencies' SUBTYPES='Statistics'
 /DESTINATION FORMAT=SAV OUTFILE='C:\Temp\Medians.sav'.
FREQUENCIES VARIABLES=data
 /FORMAT=NOTABLE
 /STATISTICS=MEDIAN.
OMSEND.

* Let's clean the output dataset *.
GET FILE 'C:\Temp\Medians.sav' /DROP=Command_ TO Label_.
SELECT IF Var3="".
NUMERIC id (F8).
COMPUTE id=NUMBER(Var1,'F8').
RENAME VARIABLES (Var5=Medians).
FORMAT Medians (F8.2).
EXE. /* Needed *.
DELETE VARIABLES Var1 TO Var4.
SAVE OUTFILE 'C:\Temp\Medians.sav'.

GET FILE 'C:\Temp\OriginalFile.sav'.
MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id .

Best regards,
Marta
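The "wide to long to aggregate" logic discussed above can be illustrated in a few lines of plain Python. This is only a sketch of the data flow, not SPSS syntax: the dictionary `wide`, the list `long_rows` and the use of `None` for a missing value are conventions chosen for this example. As with `/NULL = KEEP`, the missing entries are carried into the long layout and only dropped when the median is taken.

```python
from statistics import median

# Wide layout: one entry per case id, one value per variable.
wide = {
    1: [1, 1, 3, 5, 1, 6, 3, None, 9, 5],
    2: [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    3: [4, 5, 3, 6, None, 8, 1, 4, 3, 9],
}

# Wide to long: one (id, value) record per original cell.
long_rows = [(case_id, value)
             for case_id, values in wide.items()
             for value in values]

# Aggregate: collect the valid values per id, then take the median.
groups = {}
for case_id, value in long_rows:
    if value is not None:
        groups.setdefault(case_id, []).append(value)
medians = {case_id: median(values) for case_id, values in groups.items()}

print(medians)  # {1: 3, 2: 4.5, 3: 4}
```

The resulting per-id medians play the role of the Medians.sav file that is matched back to the original data by id.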
In reply to this post by Marta García-Granero
With SPSS 15, you have the ability, in essence, to add your own functions to SPSS transformations via the programmability functionality. There are helper functions that are part of the Bonus Pack for early adopters that will become generally available in November.
A problem like the casewise median of several variables can be solved very easily with this mechanism. First, here is a little Python function that calculates a median. Its argument is a list of values. First it screens out missing values; then it sorts and returns the middle element, or the average of the two middle elements if the number of valid values is even.

def median(lis):
    lisnomv = [item for item in lis if not item is None]
    lisnomv.sort()
    s = len(lisnomv)
    if s == 0:
        return None
    # // keeps the list indexes integral; 2.0 avoids integer division of the sum
    return (lisnomv[(s-1)//2] + lisnomv[s//2])/2.0

It would then be used like this, as an example.

begin program.
import spss, trans
<insert the median def here>
t = trans.Tfunction()
t.append(median, "resultvar", "f", [<your list of variables>])
<as many other functions as you like>
t.execute()
end program.

This will loop over the cases and create a new variable that is the median of the variables listed for each case.

Regards,
Jon Peck
SPSS
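The single index expression in the median function above covers both odd and even counts, which is easy to miss. A small standalone check in plain Python 3 (no SPSS modules; `median_of` is a name chosen for this sketch) shows why: for an odd count s, `(s-1)//2` and `s//2` are the same position, so the "average" is just the middle value; for an even count they are the two middle positions.

```python
def median_of(values):
    """Median of the non-missing entries; None marks a missing value."""
    valid = sorted(v for v in values if v is not None)
    s = len(valid)
    if s == 0:
        return None
    # Odd s: both indexes point at the middle element.
    # Even s: they point at the two middle elements.
    return (valid[(s - 1) // 2] + valid[s // 2]) / 2


for s in (9, 10):
    print(s, (s - 1) // 2, s // 2)  # 9 -> 4 4, 10 -> 4 5

print(median_of([1, 1, 3, 5, 1, 6, 3, None, 9, 5]))  # 3.0
print(median_of([2, 3, 1, 5, 7, 4, 9, 7, 8, 3]))     # 4.5
```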
In reply to this post by Marta García-Granero
Hi Marta,
I tried the modified MATRIX code (see below) and encountered the following error:

Run MATRIX procedure:
>Error encountered in source line # 160
>Error # 12354
>Subscript is out of range.
>This command not executed.

The medians.sav file contains median values only for the first 21 cases (I've not checked these yet). Any ideas? I've copied the syntax as I ran it below. Any number above 5 would be safe to use as a missing value.

Best wishes,
Jennifer

-------------------------------------------------------------------------------------------------

PRESERVE.
* Just the avoid the annoying warning concerning those missing data *.
SET ERRORS=NONE.
RESTORE.
COMPUTE id=$casenum.

* Important step! *.
COUNT nmiss = b1_001 TO b1_100 (SYSMIS) .

MATRIX.
* Replace by your 100 variables names *.
GET data /VAR=b1_001 TO b1_100 /MISSING=ACCEPT /SYSMIS=1E6.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2). /* Median for even sample sizes *.
- COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE. /* Median for odd samples *.
- COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id nmiss.
Hi Jennifer
JT> I tried the modified MATRIX code (see below) and encountered the following
JT> error:

JT> Run MATRIX procedure:
>>Error encountered in source line # 160
>>Error # 12354
>>Subscript is out of range.
>>This command not executed.

I have created a fake dataset with a layout similar to yours (42 cases, 100 variables with scattered missing data):

* Some fake data simulating your data set *.
INPUT PROGRAM.
NUMERIC case vars (f8).
LEAVE ALL.
- LOOP case = 1 TO 42.
- LOOP vars = 1 TO 100.
- COMPUTE score=UNIFORM(5).
- END CASE.
- END LOOP.
- END LOOP.
END FILE.
END INPUT PROGRAM.
* Now adding some scattered missing data *.
RECODE score (Lowest thru 0.75=SYSMIS) .
* Reorganize it *.
SORT CASES BY case vars .
CASESTOVARS
 /ID = case
 /INDEX = vars
 /GROUPBY = VARIABLE .
EXECUTE .
DELETE VARIABLES case.
* Fake cases are ready *.

Now, try this modified version of the code (note the SET MXLOOPS that I forgot to add yesterday; low brain glucose levels were surely responsible):

* SYNTAX FOR MEDIANS BEGINS HERE *.
COMPUTE id=$casenum.
* Replace score.1 TO score.100 by your own variables *.
COUNT nmiss = score.1 TO score.100 (SYSMIS) .
PRESERVE.
SET MXLOOPS=500.
MATRIX.
* Replace score.1 TO score.100 by your own variables *.
GET data /VAR=score.1 TO score.100 /MISSING=ACCEPT /SYSMIS=1E3.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2).
- COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE.
- COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.
RESTORE.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id nmiss.
* END OF SYNTAX *.

I've checked it with the fake data and it works.

Marta
Hi again Marta,
Now I'm really confused. Your code works perfectly as written (although I had to shorten the dummy variable name for SPSS 11.5), but as soon as I modify it (see below) and run it on my data I get a similar error, and the medians are worked out for the first 21 cases only:

Run MATRIX procedure:
>Error encountered in source line # 27
>Error # 12354
>Subscript is out of range.
>This command not executed.

Perhaps it's a problem with the way my variables are labelled, or with this version of SPSS (11.5)? It's clearly not your code that's the problem anyway. I might just have to switch to Excel to work these scores out, much as I hate to admit defeat...

Thanks for all your time and help.

Best wishes, Jennifer

* SYNTAX FOR MEDIANS BEGINS HERE *.
COMPUTE id=$casenum.
* Replace score.1 TO score.100 by your own variables *.
COUNT nmiss = b1_001 TO b1_100 (SYSMIS) .
PRESERVE.
SET MXLOOPS=500.
MATRIX.
* Replace score.1 TO score.100 by your own variables *.
GET data /VAR=b1_001 TO b1_100 /MISSING=ACCEPT /SYSMIS=1E3.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2).
- COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE.
- COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.
RESTORE.
MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id nmiss.

* END OF SYNTAX *.
|
In reply to this post by Marta García-Granero
I worked out the "wide to long" approach, using only pretty vanilla SPSS - no OMS, even. You probably don't want to change to this to solve the practical problem, unless the MATRIX approach gets badly bogged down; but, to illustrate what you CAN do with the transformation language:

At 03:25 PM 9/27/2006, Marta García-Granero wrote:

>>RR> I might, myself, use the first approach, using "wide to long to wide" logic (my terminology) - in this case, "wide to long to AGGREGATE."
>>
>>RR> For "wide to long", VARSTOCASES is almost always cleaner and more reliable than is FLIP. Give us some test data and I'll give it a go with VARSTOCASES logic.

Which I did. And had more trouble than I expected. It took me a while to calculate the medians correctly; I found that much the toughest part to get right. If you have MEDIAN function in AGGREGATE (SPSS 14+) it's much easier.

>Is this closer to what you were suggesting, Richard?:

[Code omitted. Logic uses FREQUENCIES to compute group medians, OMS to capture the values.]

Below are three solutions. They use test data with variables besides the ones whose median is to be computed, and keep those variables.

The first solution doesn't use AGGREGATE function MEDIAN; it's the only one that will work if you don't have SPSS 14. The second does use that function, and is therefore simpler. They both use MATCH FILES. Rather than using a scratch file, I've MATCHED with the original data; that requires the original data be in order by CASEID.

The third solution is a tour de force; it does *not* use MATCH FILES, but it does some pretty fancy CASESTOVARS and AGGREGATE. It uses the MEDIAN function of AGGREGATE. It could probably be rewritten along the lines of the first, but the code would get truly convoluted.

Enjoy it all!
-Richard

And, this is tested code; SPSS draft output:
............................................
* ...... Verify test data .............................. .
NEW FILE.
GET FILE=TESTDATA.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:00       |
|-----------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2006-09-27 García-Granero - Computing the median value of a group of variables.SAV

CaseID TEXT  QUANTITY v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
C.1    Alpha    12.34  1  1  3  5  1  6  3  .  9   5
C.2    Beta      5.67  2  3  1  5  7  4  9  7  8   3
C.3    Gamma    98.76  4  5  3  6  .  8  1  4  3   9

Number of cases read:  3    Number of cases listed:  3

* ...... Unroll and compute median, using .............. .
* ...... AGGREGATE without MEDIAN function .............. .
VARSTOCASES
 /MAKE Value FROM v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
 /KEEP = CaseID
 /NULL = Drop
 /COUNT = N_Value "Number of valid values in record" .

Variables to Cases
Notes
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:00       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2006-09-27 García-Granero - Computing the median value of a group of variables.SAV

Generated Variables
|-------|---------------|
|Name   |Label          |
|-------|---------------|
|N_Value|Number of valid|
|       |values in      |
|       |record         |
|-------|---------------|
|Value  |<none>         |
|-------|---------------|

Processing Statistics
|-------------|--|
|Variables In |13|
|-------------|--|
|Variables Out|3 |
|-------------|--|

SORT CASES BY CaseID Value.

* If the number of values is odd, the median is the middle .
* value: rank (N+1)/2 .
* If the number of cases is even, the median is the mean .
* of the two middle values: N/2, N/2 + 1. .
* Or, together: the median is the mean of all values with .
* rank between N/2 and N/2 + 1. .
* ........................................................ .
* Variable RANK is kept, and M_VALUE calculated distinct .
* from VALUE, are to allow clearer listing for debugging. .

NUMERIC RANK(F3).
NUMERIC M_VALUE(F3).

DO IF MISSING(CASEID).
. COMPUTE RANK = 1.
ELSE IF CASEID NE LAG(CASEID).
. COMPUTE RANK = 1.
ELSE.
. COMPUTE RANK = LAG(RANK) + 1.
END IF.

COMPUTE M_VALUE = VALUE.
IF (RANK LT ( N_VALUE/2 - .001)) M_VALUE = $SYSMIS.
IF (RANK GT ((N_VALUE/2 + 1) + .001)) M_VALUE = $SYSMIS.
. /**/ LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:00       |
|-----------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2006-09-27 García-Granero - Computing the median value of a group of variables.SAV

CaseID N_Value Value RANK M_VALUE
C.1          9     1    1       .
C.1          9     1    2       .
C.1          9     1    3       .
C.1          9     3    4       .
C.1          9     3    5       3
C.1          9     5    6       .
C.1          9     5    7       .
C.1          9     6    8       .
C.1          9     9    9       .
C.2         10     1    1       .
C.2         10     2    2       .
C.2         10     3    3       .
C.2         10     3    4       .
C.2         10     4    5       4
C.2         10     5    6       5
C.2         10     7    7       .
C.2         10     7    8       .
C.2         10     8    9       .
C.2         10     9   10       .
C.3          9     1    1       .
C.3          9     3    2       .
C.3          9     3    3       .
C.3          9     4    4       .
C.3          9     4    5       4
C.3          9     5    6       .
C.3          9     6    7       .
C.3          9     8    8       .
C.3          9     9    9       .

Number of cases read:  28    Number of cases listed:  28

AGGREGATE /OUTFILE=*
 /BREAK=CaseID
 /Median 'Median of v1 TO v10' = MEAN(M_Value).
FORMATS MEDIAN (F5.1).

MATCH FILES
 /FILE=TESTDATA
 /FILE=*
 /BY CASEID
 /KEEP = CASEID Text Quantity Median ALL.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:00       |
|-----------------------------|---------------------------|
CaseID TEXT  QUANTITY Median v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
C.1    Alpha    12.34    3.0  1  1  3  5  1  6  3  .  9   5
C.2    Beta      5.67    4.5  2  3  1  5  7  4  9  7  8   3
C.3    Gamma    98.76    4.0  4  5  3  6  .  8  1  4  3   9

Number of cases read:  3    Number of cases listed:  3

* ...... With AGGREGATE MEDIAN function - SPSS 14+ ...... .
NEW FILE.
GET FILE=TESTDATA.
VARSTOCASES
 /MAKE Value FROM v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
 /KEEP = CaseID
 /NULL = KEEP.
Variables to Cases
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:00       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2006-09-27 García-Granero - Computing the median value of a group of variables.SAV

Generated Variables
|-----|------|
|Name |Label |
|-----|------|
|Value|<none>|
|-----|------|

Processing Statistics
|-------------|--|
|Variables In |13|
|-------------|--|
|Variables Out|2 |
|-------------|--|

AGGREGATE /OUTFILE=*
 /BREAK=CaseID
 /Median 'Median of v1 TO v10' = MEDIAN(Value).
FORMATS MEDIAN (F5.1).

MATCH FILES
 /FILE=TESTDATA
 /FILE=*
 /BY CASEID
 /KEEP = CASEID Text Quantity Median ALL.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:01       |
|-----------------------------|---------------------------|
CaseID TEXT  QUANTITY Median v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
C.1    Alpha    12.34    3.0  1  1  3  5  1  6  3  .  9   5
C.2    Beta      5.67    4.5  2  3  1  5  7  4  9  7  8   3
C.3    Gamma    98.76    4.0  4  5  3  6  .  8  1  4  3   9

Number of cases read:  3    Number of cases listed:  3

* ...... With AGGREGATE MEDIAN function - SPSS 14+ ...... .
* ...... Tour de force: No MATCH FILES ...... .
NEW FILE.
GET FILE=TESTDATA.
VARSTOCASES
 /MAKE Value FROM v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
 /KEEP = CaseID TEXT QUANTITY
 /NULL = KEEP.

Variables to Cases
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:01       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2006-09-27 García-Granero - Computing the median value of a group of variables.SAV

Generated Variables
|-----|------|
|Name |Label |
|-----|------|
|Value|<none>|
|-----|------|

Processing Statistics
|-------------|--|
|Variables In |13|
|-------------|--|
|Variables Out|4 |
|-------------|--|

AGGREGATE /OUTFILE=* MODE=ADDVARIABLES
 /BREAK=CaseID
 /Median 'Median of v1 TO v10' = MEDIAN(Value).
FORMATS MEDIAN (F5.1).

CASESTOVARS
 /ID = CaseID
 /GROUPBY = VARIABLE
 /AUTOFIX = YES
 /RENAME Value=V
 /SEPARATOR = "".

Cases to Variables
|--------------------------|---------------------------|
|Output Created            |28-SEP-2006 10:56:01       |
|--------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2006-09-27 García-Granero - Computing the median value of a group of variables.SAV

Generated Variables
|----------|------|
|Original  |Result|
|Variable  |------|
|          |Name  |
|-------|--|------|
|Value  |1 |V1    |
|       |2 |V2    |
|       |3 |V3    |
|       |4 |V4    |
|       |5 |V5    |
|       |6 |V6    |
|       |7 |V7    |
|       |8 |V8    |
|       |9 |V9    |
|       |10|V10   |
|-------|--|------|

Processing Statistics
|---------------|----|
|Cases In       |30  |
|Cases Out      |3   |
|---------------|----|
|Cases In/Cases |10.0|
|Out            |    |
|---------------|----|
|Variables In   |5   |
|Variables Out  |14  |
|---------------|----|
|Index Values   |10  |
|---------------|----|

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-SEP-2006 10:56:21       |
|-----------------------------|---------------------------|
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2006-09-27 García-Granero - Computing the median value of a group of variables.SAV

CaseID TEXT  QUANTITY Median V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
C.1    Alpha    12.34    3.0  1  1  3  5  1  6  3  .  9   5
C.2    Beta      5.67    4.5  2  3  1  5  7  4  9  7  8   3
C.3    Gamma    98.76    4.0  4  5  3  6  .  8  1  4  3   9

Number of cases read:  3    Number of cases listed:  3
|
Hi Richard
RR> Which I did. And had more trouble than I
RR> expected. It took me a while to calculate the
RR> medians correctly; I found that much the toughest
RR> part to get right. If you have MEDIAN function in
RR> AGGREGATE (SPSS 14+) it's much easier.

I'm using SPSS 13 and the MEDIAN function in AGGREGATE is already there. I think it will be the better solution for Jennifer's problem, although perhaps the problems that make my MATRIX code crash could make yours follow the same path...

Warmest regards, Marta |
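Apart from AGGREGATE, later SPSS releases also offer a MEDIAN function in the transformation language itself, which answers the original question directly. The lines below are only a minimal sketch, assuming a release that has the function (11.5, used in this thread, apparently does not; check the function list of your own version). The toy data and the names v1 TO v4 and med are made up for illustration.

```spss
* Toy data (hypothetical): case 2 has no valid values *.
DATA LIST LIST /v1 TO v4 (4 F8).
BEGIN DATA
3 1 2 4
. . . .
5 . 7 6
END DATA.
* MEDIAN uses only the valid values within each case *.
COMPUTE med = MEDIAN(v1 TO v4).
LIST.
```

A case with no valid values at all should simply come out with a system-missing median, so no restructuring or file matching is involved.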
In reply to this post by Jennifer Thompson
Hi Jennifer
This is mystifying...

A couple of questions before you abandon SPSS and go to Excel:

1) Are the 100 variables consecutive in the dataset, with no extraneous variables between? The keyword "TO" requires the variables to be truly consecutive.

2) Are your variables numeric, or are they strings with numeric content? MATRIX has a hard time handling strings.

Perhaps if you send me privately a sample of your data, as they are right now, as a sav file, I can run my code (or Richard's excellent solution not involving AGGREGATE's MEDIAN, since your version 11.5 doesn't have MEDIAN as a function) and see what happens with your data.

JT> Now I'm really confused. Your code works perfectly as written
JT> (although I had to shorten the dummy variable name for SPSS 11.5),
JT> but as soon as I modify it (see below) and run it on my data I get
JT> a similar error, and the medians are worked out for the first 21
JT> cases only:

JT> Perhaps it's a problem with the way my variables are labelled, or
JT> with this version of SPSS (11.5)? It's clearly not your code
JT> that's the problem anyway. I might just have to switch to Excel
JT> to work these scores out, much as I hate to admit defeat...

Labelling is absolutely ignored by MATRIX.

I hope we finally find a solution to your problem.

Marta |
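Both of Marta's questions can probably be answered from the data dictionary. This is a small sketch rather than a prescribed check; b1_001 TO b1_100 are the variable names from Jennifer's syntax and need replacing for the other three sets.

```spss
* Lists every variable that TO picks up, with its formats *.
DISPLAY DICTIONARY /VARIABLES=b1_001 TO b1_100.
```

If the listing shows more than 100 variables, something extraneous sits inside the range; a print format beginning with A (for example A8) marks a string variable.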
Hi Marta,
I think the mystery has been solved. One of my cases has completely missing data for the first set of 100 variables. Once I'd deleted that case the matrix code worked perfectly. So, unsurprisingly, *I* was the problem and feel suitably foolish. I'd already tried running your previous version of the code without that dodgy case yesterday and it made no difference, so I didn't immediately think to try without it earlier.

Thanks so much for your help. I'm also going to try and work through Richard Ristow's WideToLongToWide method for future reference.

Very best wishes, Jennifer |
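The failure Jennifer describes fits the way the MATRIX loop indexes the sorted row. With no valid values, validn(i) is 0, the even-count branch is taken, and sorted(i,validn(i)/2) asks for column 0, hence "Subscript is out of range". A guard for that case might look like the sketch below. It is not part of the code posted in the thread: the toy data and the -999 sentinel are hypothetical, and the results are printed rather than saved to keep the example short.

```spss
* Toy data (hypothetical): case 2 has no valid values *.
DATA LIST LIST /v1 TO v4 (4 F8).
BEGIN DATA
3 1 2 4
. . . .
5 . 7 6
END DATA.
COUNT nmiss = v1 TO v4 (SYSMIS).
MATRIX.
* 1E3 stands in for missing values, so it must exceed every real value *.
GET data /VAR=v1 TO v4 /MISSING=ACCEPT /SYSMIS=1E3.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
* Cases without valid values get the sentinel -999 (an arbitrary choice) *.
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF validn(i) EQ 0.
-   COMPUTE medians(i)=-999.
- ELSE IF TRUNC(validn(i)/2) EQ (validn(i)/2).
-   COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE.
-   COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
PRINT medians /TITLE='Row medians (-999 = no valid values)'.
END MATRIX.
```

In the full version from the thread, the same DO IF branch would sit inside the existing loop, and after MATCH FILES something like RECODE Medians (-999=SYSMIS) would turn the sentinel back into a missing value. The alternative Jennifer used, dropping the empty case before running MATRIX, avoids the sentinel altogether.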