Multiple Mann-Whitney tests after Kruskall Wallis and more - Two MACROS

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Multiple Mann-Whitney tests after Kruskall Wallis and more - Two MACROS

Marta Garcia-Granero
Hi everybody

Time ago I posted a macro for multiple Mann-Whitney's test  followed by
several p-values adjustment algorthims. I sent the code with Spanish
output (the macro my students use). Here is the English output version,
with a sample dataset to show how to use the macro. Since this method
uses exact p-values, it will not work for big sample sizes (over 25). I
also have a macro for asymptotic post-hoc testing (see below the first
macro for the code)

Regards,
Marta García-Granero

* WARNING: C:\TEMP\FOLDER MUST EXIST *.

* Low sample size (exact) methods *.

DEFINE MULTIMW (!POSITIONAL !TOKENS(1)/!POSITIONAL !CHAREND('(')/
 !POSITIONAL !CHAREND(',')/!POSITIONAL !CHAREND(')') ).
SET OLANG=ENGLISH.
DATASET NAME Data.
TITLE'MULTIPLE COMPARISONS BASED ON PAIRWISE MANN-WHITNEY TESTS'.
OMS /SELECT TABLES
/IF SUBTYPE='Notes'
/DESTINATION VIEWER=NO.
OMS /SELECT HEADING
/IF COMMANDS='NPar Tests' LABEL='Title'
/DESTINATION VIEWER=NO.
OMS /SELECT TABLES
/IF COMMANDS='NPar Tests' SUBTYPES='Mann Whitney Test Statistics'
/DESTINATION FORMAT=SAV
 OUTFILE='C:\Temp\MultiMWU&P.sav'.
DO IF $CASENUM=1.
- WRITE OUTFILE 'C:\Temp\multiman.sps' /"NPAR TESTS".
- NUMERIC #I #J (F2.0).
- LOOP #I=!3 to !4-1.
-  LOOP #J=#I+1 to !4.
-   WRITE OUTFILE 'C:\Temp\multiman.sps'
   /"  /M-W= "!QUOTE(!1)" BY "!QUOTE(!2)" (" #I #J ")".
-  END LOOP.
- END LOOP.
- WRITE OUTFILE 'C:\Temp\multiman.sps' /".".
END IF.
EXECUTE.
INCLUDE FILE='C:\Temp\multiman.sps'.
ERASE FILE='C:\Temp\multiman.sps'.
OMSEND.
OMS /SELECT TABLES
/IF COMMANDS='Summarize' SUBTYPES='Case Processing Summary'
/DESTINATION VIEWER=NO.
GET FILE='C:\Temp\MultiMWU&P.sav' /DROP=Command_ TO Label_.
DATASET NAME PValues.
SELECT IF Var1='Exact Sig. [2*(1-tailed Sig.)]'.
EXECUTE.
DELETE VARIABLES Var1.
RENAME VARIABLES (ALL=pvalue).
COMPUTE id = $CASENUM.
FORMAT id (F2.0).
SORT CASES BY  pvalue (A) .
COMPUTE pos = $CASENUM.
FORMAT pos (F2.0).
PRESERVE.
SET ERRORS=NONE RESULTS=NONE.
RANK pvalue /n into N /PRINT = NO.
RESTORE.
COMPUTE bonferr=MIN(pvalue*n,1).
COMPUTE sidak=1-(1-pvalue)**n.
COMPUTE holm = MIN(1,(n-pos+1)*pvalue).
IF (holm LT LAG(holm)) holm = LAG(holm).
COMPUTE downsidk = 1-(1-pvalue)**(n-pos+1).
IF (downsidk LT LAG(downsidk)) downsidk = LAG(downsidk).
COMPUTE finner = 1-(1-pvalue)**(n/pos).
IF (finner LT LAG(finner)) finner = LAG(finner).
COMPUTE cn = cn+1/pos.
LEAVE cn.
SORT CASES BY pos(D).
IF cn LT LAG(cn) cn = LAG(cn).
COMPUTE hommel = MIN(1,cn*n*pvalue/pos).
IF (hommel GT LAG(hommel)) hommel = LAG(hommel).
COMPUTE hochberg = (n-pos+1)*pvalue.
IF (hochberg GT LAG(hochberg)) hochberg = LAG(hochberg).
COMPUTE simes = n*pvalue/pos.
IF (simes GT LAG(simes)) simes = LAG(simes).
EXECUTE.
DELETE VARIABLES pos,n,cn.
FORMAT pvalue bonferr to simes (F9.4).
VARIABLE LABELS id 'Nr.' /pvalue 'Original p-value'
       /bonferr 'One-step Bonferroni' /sidak 'One-step Dunn-Sidak'
       /holm 'Step-down Holm' /downsidk 'Step-down Dunn-Sidak'
       /finner 'Step-down Finner' /hommel 'Step-up Hommel'
       /hochberg 'Step-up Hochberg' /simes 'Step-up Simes'.
SORT CASES BY id (A).
SUMMARIZE
 /TABLES = pvalue bonferr TO simes
 /FORMAT = LIST NOCASENUM TOTAL
 /TITLE = 'Original & adjusted p-values'
 /MISSING = VARIABLE
 /CELLS = NONE.
OMSEND.
TITLE' '.
DATASET ACTIVATE Data.
DATASET CLOSE PValues.
ERASE FILE='C:\Temp\MultiMWU&P.sav'.
!ENDDEFINE.

* Sample dataset (replace by your own) *.
DATA LIST FREE/group (F8.0) glucose (F8.0).
BEGIN DATA
1 51 4 56 4 58 4 60 4 62 4 63 4 65 4 68 4 72 4 73
2 60 2 65 2 66 2 68 2 68 2 69 2 73 2 75 2 78 2 80
3 69 3 73 3 74 3 78 3 79 3 79 3 82 3 85 3 87 3 88
4 70 1 75 1 76 1 77 1 79 1 80 1 82 1 86 1 88 1 89
END DATA.
VAR LABEL group 'Type of Acidosis' /glucose 'Glucose levels'.
VALUE LABELS group 1'Control' 2'Respiratory' 3'Metabolic' 4'Mixed'.

* MACRO call *.
MULTIMW glucose group(1,4).


* Big sample size (asymptotic) methods *.

* See HANDBOOK OF PARAMETRIC AND NONPARAMETRIC STATISTICAL PROCEDURES
 Test 22 (David J Sheskin, 2000; Chapman & Hall) *.

****************************************************************
* Kruskal-Wallis test & Multiple Comparisons for large samples *
* Tukey-Kramer & SNK methods available for up to 20 groups,    *
* Dunnett test for 12 groups (1 control & 11 treatment groups) *
* Critical values for Dunnett test obtained from: Dunnett, CW  *
* (1964) "New tables for multiple comparisons with a control"  *
* Biometrics 20, 482-91                                        *
* Critical Values of the Studentized Range Statistic obtained  *
* from http://cse.niaes.affrc.go.jp/miwa/probcalc/             *
****************************************************************
* Important: 'C:\Temp' folder must exist *.

DEFINE MULTIKW (!POSITIONAL !TOKENS(1)/!POSITIONAL !CHAREND('(')/
!POSITIONAL !CHAREND(',')/!POSITIONAL !CHAREND(')') ).
PRESERVE.
SET OLANG=ENGLISH.
OMS /SELECT TABLES
/IF SUBTYPES='Kruskal Wallis Ranks'
/DESTINATION FORMAT=SAV
OUTFILE='C:\Temp\Kruskal-Wallis.sav'.
OMS /SELECT TABLES
/IF SUBTYPES='Kruskal Wallis Test Statistics'
/DESTINATION FORMAT=SAV
OUTFILE='C:\Temp\Kruskal-Wallis Chi-Square.sav'.
NPAR TESTS /K-W=!1 BY !2(!3 !4).
OMSEND.
OMS /SELECT TABLES
/IF SUBTYPE='Notes'
/DESTINATION VIEWER=NO.
OMS /SELECT TABLES
/IF COMMANDS='Summarize' SUBTYPES='Case Processing Summary'
/DESTINATION VIEWER=NO.
MATRIX.
PRINT /TITLE='MULTIPLE COMPARISONS (KRUSKAL-WALLIS): ASYMPTOTIC METHODS'.
GET data /VAR=ALL /FILE='C:\Temp\Kruskal-Wallis Chi-Square.sav'.
COMPUTE chisquar=data(1,5).
COMPUTE df=      data(2,5).
COMPUTE chisig=  data(3,5).
RELEASE data.
GET varname /VAR=Var1      /FILE='C:\Temp\Kruskal-Wallis.sav'.
GET gnames  /VAR=Var2      /FILE='C:\Temp\Kruskal-Wallis.sav'.
GET nvalues /VAR=n         /FILE='C:\Temp\Kruskal-Wallis.sav'.
GET meanrank /VAR=MeanRank /FILE='C:\Temp\Kruskal-Wallis.sav'
/MISSING=OMIT.
PRINT varname(1)
/FORMAT='A8'
/TITLE='Dependent variable'.
COMPUTE k=NROW(meanrank).
COMPUTE n=nvalues(1:k).
COMPUTE nt=nvalues(k+1).
PRINT nvalues
/FORMAT='F8.0'
/RNAMES=gnames
/TITLE='Total & groups sample sizes'.
PRINT df
/FORMAT='F8.0'
/TITLE='Between groups DF'.
RELEASE nvalues.
PRINT {chisquar,chisig}
/FORMAT='F8.4'
/CLABEL='Chi²','Sig.'
/TITLE='Kruskal-Wallis test & significance (asymptotic)'.
COMPUTE ranksum=meanrank&*n.
COMPUTE VarRi=(MSUM(ranksum&**2/n)-(MSUM(ranksum))&**2/nt)/chisquar.
COMPUTE se_rankd=MAKE(k-1,1,0).
COMPUTE drankd=MAKE(k-1,1,0).
COMPUTE compard=MAKE(k-1,3,' ').
LOOP i=1 TO k-1.
- COMPUTE se_rankd(i)=SQRT((1/n(1)+1/n(i+1))*VarRi).
- COMPUTE drankd(i)=ABS(meanrank(1)-meanrank(i+1)).
- COMPUTE compard(i,:)={gnames(1),'  vs    ',gnames(i+1)}.
END LOOP.
COMPUTE dunnett={1.96,2.21,2.35,2.41,2.51,2.57,2.65,2.72,2.77,2.83,2.91;
2.58,2.79,2.92,3.00,3.06,3.11,3.19,3.25,3.29,3.35,3.42}.
COMPUTE dunnett5=dunnett(1,k-1)*se_rankd.
COMPUTE dunnett1=dunnett(2,k-1)*se_rankd.
COMPUTE sigd=MAKE(k-1,1,' ').
LOOP i=1 TO k-1.
- DO IF drankd(i) GE dunnett1(i).
-  COMPUTE sigd(i)='   <0.01'.
- ELSE IF drankd(i) GE dunnett5(i).
-  COMPUTE sigd(i)='   <0.05'.
- ELSE.
-  COMPUTE sigd(i)='      NS'.
- END IF.
END LOOP.
PRINT {compard,sigd}
/FORMAT='A8'
/TITLE='Dunnett method (1st group is taken as reference)'.
RELEASE se_rankd,drankd,compard,dunnett,dunnett5,dunnett1,sigd.
COMPUTE sranks=meanrank.
COMPUTE sranks(GRADE(meanrank))=meanrank.
COMPUTE snames=gnames(1:k).
COMPUTE snames(GRADE(meanrank))=gnames(1:k).
COMPUTE sn=n.
COMPUTE sn(GRADE(meanrank))=n.
PRINT {T(sranks)}
/FORMAT='F8.2'
/CNAME=snames
/RNAME=varname
/TITLE='Mean Ranks (sorted)'.
RELEASE gnames,n,meanrank.
COMPUTE compar=MAKE(k*(k-1)/2,3,' ').
COMPUTE drank=MAKE(k*(k-1)/2,1,0).
COMPUTE se_rank=MAKE(k*(k-1)/2,1,0).
COMPUTE pvalue=MAKE(k*(k-1)/2,1,0).
COMPUTE sig=MAKE(k*(k-1)/2,1,' ').
COMPUTE count=0.
LOOP i=1 TO k-1.
- LOOP j=i+1 TO k.
-  COMPUTE count=count+1.
-  COMPUTE compar(count,:)={snames(i),'   vs   ',snames(j)}.
-  COMPUTE drank(count)=ABS(sranks(i)-sranks(j)).
-  COMPUTE se_rank(count)=SQRT((1/sn(i)+1/sn(j))*VarRi).
-  COMPUTE pvalue(count)=2*(1-CDFNORM(drank(count)/se_rank(count))).
- END LOOP.
END LOOP.
LOOP i=1 TO count.
- DO IF pvalue(i) LE 0.01.
-  COMPUTE sig(i)='   <0.01'.
- ELSE IF pvalue(i) LE 0.05.
-  COMPUTE sig(i)='   <0.05'.
- ELSE.
-  COMPUTE sig(i)='      NS'.
- END IF.
END LOOP.
PRINT {compar,sig}
/FORMAT='A8'
/TITLE='LSD comparisons'.
COMPUTE apvalue=1-(1-pvalue)&**count.
LOOP i=1 TO count.
- DO IF apvalue(i) LE 0.01.
-  COMPUTE sig(i)='   <0.01'.
- ELSE IF apvalue(i) LE 0.05.
-  COMPUTE sig(i)='   <0.05'.
- ELSE.
-  COMPUTE sig(i)='      NS'.
- END IF.
END LOOP.
PRINT {compar,sig}
/FORMAT='A8'
/TITLE='Dunn-Sidak comparisons'.
COMPUTE
studrang={2.772,3.314,3.633,3.858,4.030,4.170,4.286,4.387,4.474,4.552,
                 4.622,4.685,4.743,4.796,4.845,4.891,4.934,4.974,5.012;

3.643,4.120,4.403,4.603,4.757,4.882,4.987,5.078,5.157,5.227,
                 5.290,5.348,5.400,5.448,5.493,5.535,5.574,5.611,5.645}.
COMPUTE ctukey5=studrang(1,k-1)*se_rank/SQRT(2).
COMPUTE ctukey1=studrang(2,k-1)*se_rank/SQRT(2).
LOOP i=1 TO count.
- DO IF drank(i) GE ctukey1(i).
-  COMPUTE sig(i)='   <0.01'.
- ELSE IF drank(i) GE ctukey5(i).
-  COMPUTE sig(i)='   <0.05'.
- ELSE.
-  COMPUTE sig(i)='      NS'.
- END IF.
END LOOP.
PRINT {compar,sig}
/FORMAT='A8'
/TITLE='Tukey comparisons'.
COMPUTE harmonic=k/CSUM(1/sn).
COMPUTE serank=SQRT((2/harmonic)*nt*(nt+1)/12).
COMPUTE snk5=MAKE(k-1,1,0).
COMPUTE snk1=MAKE(k-1,1,0).
LOOP i=1 TO k-1.
- COMPUTE snk5(i)=studrang(1,k-i)*serank/SQRT(2).
- COMPUTE snk1(i)=studrang(2,k-i)*serank/SQRT(2).
END LOOP.
COMPUTE step=MAKE(k*(k-1)/2,1,0).
COMPUTE count=0.
LOOP i=1 TO k.
- LOOP j=i+1 TO k.
-  COMPUTE count=count+1.
-  COMPUTE drank(count)=ABS(sranks(i)-sranks(j)).
-  COMPUTE step(count)=k+i-j.
- END LOOP.
END LOOP.
LOOP i=1 TO count.
- DO IF drank(i) GE snk1(step(i)).
-  COMPUTE sig(i)='   <0.01'.
- ELSE IF drank(i) GE snk5(step(i)).
-  COMPUTE sig(i)='   <0.05'.
- ELSE.
-  COMPUTE sig(i)='      NS'.
- END IF.
END LOOP.
PRINT {compar,sig}
/FORMAT='A8'
/TITLE='Student-Newman-Keuls comparisons'.
DO IF CMIN(sn) NE CMAX(sn).
- PRINT harmonic
/FORMAT='F8.2'
/TITLE='Unequal sample sizes, harmonic mean of sample sizes used (alpha
level not guaranteed)'
/RLABEL=' N(h)='.
END IF.
END MATRIX.
OMSEND.
RESTORE.
!ENDDEFINE.

* MACRO call (using same dataset as before) *.
MULTIKW glucose group(1,4).

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD