Multivariate Normality Test in SPSS


3J LEMA
How can we test multivariate normality in SPSS?
Any suggestions?

Thank you. Regards.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Multivariate Normality Test in SPSS

Bruce Weaver
Administrator
The official answer from IBM (as of April 2020) appears to be NO.  

https://www.ibm.com/support/pages/does-ibm-spss-statistics-offer-test-multivariate-normality

Why do you want to test for multivariate normality?  If you're working with
real world data, approximate normality is the best you can hope for in any
case--at least if you believe what George Box famously said about the normal
distribution.


"In applying mathematics to subjects such as physics or statistics we make
tentative assumptions about the real world which we know are false but which
we believe may be useful nonetheless. The physicist knows that particles
have mass and yet certain results, approximating what really happens, may be
derived from the assumption that they do not. Equally, the statistician
knows, for example, that in nature there never was a normal distribution,
there never was a straight line, yet with normal and linear assumptions,
known to be false, he can often derive results which match, to a useful
approximation, those found in the real world."  

http://mkweb.bcgsc.ca/pointsofsignificance/img/Boxonmaths.pdf



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Multivariate Normality Test in SPSS

Kirill Orlov
In reply to this post by 3J LEMA
Hi,
I just found this in my old downloads.
I have never tried it.

*http://www.columbia.edu/~ld208/
* Mardia's multivariate skew (b1p) and multivariate kurtosis (b2p)
* Author: Lawrence T. DeCarlo, 11/97
* Email: [hidden email]
*
* Multivariate skew is provided in a separate macro because it
* is more computationally intense, particularly for large datasets
*
* Note: increase mxloops if n>3000.
preserve.
set printback=none.
define mardia (vars=!charend('/')).
set mxloops=3000.
matrix.
get x /variables=!vars /names=varnames /missing=omit.
compute n=nrow(x).
compute p=ncol(x).
compute xbar=csum(x)/n.
compute j=make(n,1,1).
compute xdev=x-j*xbar.
release x.
compute s=sscp(xdev)/n.
compute sinv=inv(s).
compute gii=make(n,1,0).
compute gij=make(n,1,0).
compute gsum=make(n,1,0).
loop i=1 to n.
+ compute gii(i)=xdev(i,:)*sinv*t(xdev(i,:)).
+ loop j=1 to n.
+   compute gij(j)=xdev(i,:)*sinv*t(xdev(j,:)).
+ end loop.
+ compute gsum(i)=csum(gij&**3).
end loop.
compute b1p=csum(gsum)/(n*n).
compute chib1p=(n*b1p)/6.
compute sm=((p+1)*(n+1)*(n+3))/(n*((n+1)*(p+1)-6)).
compute chism=(n*b1p*sm)/6.
compute df=(p*(p+1)*(p+2))/6.
compute pb1p=1-chicdf(chib1p,df).
compute pb1psm=1-chicdf(chism,df).
print {b1p,chib1p,pb1p,chism,pb1psm}
  /title"Mardia's multivariate skew (small sample adjustment: Mardia 1974 Sankhya)"
  /clabels="b1p","Chi(b1p)","p-value","adj-Chi","p-value"/format=f10.4.
compute b2p=csum(gii&**2)/n.
compute nb2p=(b2p-p*(p+2))/sqrt(8*p*(p+2)/n).
compute pnb2p=2*(1-cdfnorm(abs(nb2p))).
print {b2p,nb2p,pnb2p}
  /title"Mardia's multivariate kurtosis"
  /clabels="b2p","N(b2p)","p-value"/format=f10.4.
end matrix.
!enddefine.
restore.
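
A minimal sketch of how the macro might be called once the definition above has been run. The data are hypothetical: three independent standard normal variables named x1 to x3, generated only so the call has something to work on. None of these names come from the original post. The variable list has to end with a slash because the argument is declared with !charend('/').

```spss
* Hypothetical test data: 200 cases, 3 independent standard normal variables.
new file.
input program.
loop #i=1 to 200.
+ compute x1=rv.normal(0,1).
+ compute x2=rv.normal(0,1).
+ compute x3=rv.normal(0,1).
+ end case.
end loop.
end file.
end input program.
execute.

* Call the macro. The variable list ends at the slash.
mardia vars=x1 x2 x3 /.
```

With n = 200 the mxloops=3000 setting inside the macro should be sufficient; per the note in the header, it would need to be raised for n > 3000.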


Re: Multivariate Normality Test in SPSS

Kirill Orlov
In reply to this post by 3J LEMA
More findings, probably the same or similar:


*Taken from http://www.columbia.edu/~ld208/
*univariate and multivariate tests of skew and kurtosis.
*(from my 1997 Psychological Methods article; see
* http://www.columbia.edu/~ld208/pub.html)
*****************************************************************************
* Univariate and multivariate tests of skew and kurtosis, a list of
* the 5 cases with the largest Mahalanobis distances, a plot of the
* squared distances, critical values for a single multivariate outlier.
*
* from: DeCarlo, L. T. (1997). On the meaning and use of kurtosis.
*         Psychological Methods, 2, 292-307.
*
* Update 2/03: plot command not supported in SPSS 11.0, graph
* command is used instead.
*
* Updated 11/97:
* This version uses a corrected two-pass algorithm to compute
* the variance, from Chan, T. F., Golub, G. H., & LeVeque, R. J.
* (1983). Algorithms for computing the sample variance: Analysis
* and recommendations. American Statistician, 37, 242-247.
* Fisher's g statistics are given.
* Mardia's p-value fixed (multiplied by 2), and the statistic is
* computed using the biased variance estimator, as in SAS & EQS
*****************************************************************************.
define !normtest(vars=!charend('/')).
matrix.
get x /variables=!vars /names=varnames /missing=omit.
compute n=nrow(x).
compute p=ncol(x).
compute s1=csum(x).
compute xbar=s1/n.
compute j=make(n,1,1).
compute xdev=x-j*xbar.
release x.
compute dev=csum(xdev).
compute devsq=(dev&*dev)/n.
compute ss=csum(xdev&*xdev).
* corrected two-pass algorithm.
compute m2=(ss-devsq)/n.
compute sdev=sqrt(m2).
compute m3=csum(xdev&**3)/n.
compute m4=csum(xdev&**4)/n.
compute sqrtb1=t(m3/(m2&*sdev)).
compute b2=t(m4/(m2&**2)).
compute g1=((sqrt(n*(n-1)))*sqrtb1)/(n-2).
compute g2=(b2-((3*(n-1))/(n+1)))*((n**2-1)/((n-2)*(n-3))).
******** quantities needed for multivariate statistics ********.
compute s=sscp(xdev)/(n-1).
compute sb=s*(n-1)/n.
compute sinv=inv(s).
compute d=diag(s).
compute dmat=make(p,p,0).
call setdiag(dmat,d).
compute sqrtdinv=inv(sqrt(dmat)).
compute corr=sqrtdinv*s*sqrtdinv.
*** principal components for Srivastava's tests ***.
call svd(s,u,q,v).
compute pc=xdev*v.
call svd(sb,aa,bb,cc).
compute pcb=(xdev*cc).
release xdev.
*** Mahalanobis distances ***.
compute sqrtqinv=inv(sqrt(q)).
compute stdpc=pc*sqrtqinv.
compute dsq=rssq(stdpc).
release stdpc.
compute sqrtbbi=inv(sqrt(bb)).
compute stdpcb=pcb*sqrtbbi.
compute dsqb=rssq(stdpcb).
release stdpcb.
**************** univariate skew and kurtosis *****************.
*** approximate Johnson's SU transformation for skew ***.
compute y=sqrtb1*sqrt((n+1)*(n+3)/(6*(n-2))).
compute beta2=3*(n**2+27*n-70)*(n+1)*(n+3)/((n-2)*(n+5)*(n+7)*(n+9)).
compute w=sqrt(-1+sqrt(2*(beta2-1))).
compute delta=1/sqrt(ln(w)).
compute alpha=sqrt(2/(w*w-1)).
compute sub1=delta*ln(y/alpha+sqrt((y/alpha)&**2+1)).
compute psub1=2*(1-cdfnorm(abs(sub1))).
print {n} /title"Number of observations:" /format=f5.
print {p} /title"Number of variables:" /format=f5.
print {g1,sqrtb1,sub1,psub1} /title "Measures and tests of skew:"
/clabels="g1","sqrt(b1)","z(b1)","p-value"
  /rnames=varnames /format=f10.4.
*** Anscombe & Glynn's transformation for kurtosis.
compute eb2=3*(n-1)/(n+1).
compute vb2=24*n*(n-2)*(n-3)/(((n+1)**2)*(n+3)*(n+5)).
compute stm3b2=(b2-eb2)/sqrt(vb2).
compute beta1=6*(n*n-5*n+2)/((n+7)*(n+9))*sqrt(6*(n+3)*(n+5)/
                 (n*(n-2)*(n-3))).
compute a=6+(8/beta1)*(2/beta1+sqrt(1+4/(beta1**2))).
compute zb2=(1-2/(9*a)-((1-2/a)/(1+stm3b2*sqrt(2/(a-4))))
             &**(1/3))/sqrt(2/(9*a)).
compute pzb2=2*(1-cdfnorm(abs(zb2))).
compute b2minus3=b2-3.
print {g2,b2minus3,zb2,pzb2} /title "Measures and tests of kurtosis:"
/clabels="g2","b2-3","z(b2)","p-value"
  /rnames=varnames /format=f10.4.
compute ksq=sub1&**2+zb2&**2.
compute pksq=1-chicdf(ksq,2).
compute lm=n*((sqrtb1&**2/6)+(b2minus3&**2/24)).
compute plm=1-chicdf(lm,2).
print /title "Omnibus tests of normality (both chisq, 2 df):".
print {ksq,pksq,lm,plm}
  /title"  D'Agostino & Pearson K sq    Jarque & Bera LM test"
  /clabels="K sq","p-value","LM","p-value" /rnames=varnames /format=f10.4.
do if p>1.
print /title "*************** Multivariate Statistics ***************".
*** Small's multivariate tests ***.
compute uinv=inv(corr&**3).
compute uinv2=inv(corr&**4).
compute q1=t(sub1)*uinv*sub1.
* note: the variant of Small's kurtosis uses Anscombe & Glynn's
* transformation in lieu of SU (A & G is simpler to program).
compute q2=t(zb2)*uinv2*zb2.
compute pq1=1-chicdf(q1,p).
compute pq2=1-chicdf(q2,p).
print /title "Tests of multivariate skew:".
print {q1,p,pq1} /title "  Small's test (chisq)"
/clabels="Q1","df","p-value" /format=f10.4.
*** Srivastava's multivariate tests ***.
compute pcs1=csum(pc).
compute pcs2=csum(pc&**2).
compute pcs3=csum(pc&**3).
compute pcs4=csum(pc&**4).
release pc.
compute mpc2=(pcs2-(pcs1&**2/n))/n.
compute mpc3=(pcs3-(3/n*pcs1&*pcs2)+(2/(n**2)*(pcs1&**3)))/n.
compute mpc4=(pcs4-(4/n*pcs1&*pcs3)+(6/(n**2)*(pcs2&*(pcs1&**2)))
             -(3/(n**3)*(pcs1&**4)))/n.
compute pcb1=mpc3/(mpc2&**1.5).
compute pcb2=mpc4/(mpc2&**2).
compute sqb1p=rsum(pcb1&**2)/p.
compute b2p=rsum(pcb2)/p.
compute chib1=sqb1p*n*p/6.
compute normb2=(b2p-3)*sqrt(n*p/24).
compute pchib1=1-chicdf(chib1,p).
compute pnormb2=2*(1-cdfnorm(abs(normb2))).
print {chib1,p,pchib1} /title "  Srivastava's test"
/clabels="chi(b1p)","df","p-value" /format=f10.4.
print /title "Tests of multivariate kurtosis:".
print {q2,p,pq2} /title "  A variant of Small's test (chisq)"
/clabels="VQ2","df","p-value" /format=f10.4.
print {b2p,normb2,pnormb2} /title "  Srivastava's test"
/clabels="b2p","N(b2p)","p-value" /format=f10.4.
*** Mardia's multivariate kurtosis ***.
compute b2pm=csum(dsqb&**2)/n.
compute nb2pm=(b2pm-p*(p+2))/sqrt(8*p*(p+2)/n).
compute pnb2pm=2*(1-cdfnorm(abs(nb2pm))).
print {b2pm,nb2pm,pnb2pm} /title "  Mardia's test"
/clabels="b2p","N(b2p)","p-value" /format=f10.4.
compute q3=q1+q2.
compute q3df=2*p.
compute pq3=1-chicdf(q3,q3df).
print /title "Omnibus test of multivariate normality:".
print {q3,q3df,pq3} /title"  (based on Small's test, chisq)"
/clabels="VQ3","df","p-value" /format=f10.4.
end if.
compute cse={1:n}.
compute case=t(cse).
compute rnk=rnkorder(dsq).
compute top=(n+1)-rnk.
compute pvar=make(n,1,p).
compute ddf=make(n,1,(n-p-1)).
compute ncase=make(n,1,n).
compute a01=make(n,1,(1-.01/n)).
compute a05=make(n,1,(1-.05/n)).
compute mahal={case,rnk,top,dsq,pvar,ddf,ncase,a01,a05}.
save mahal /outfile=temp /variables=case,rnk,top,dsq,pvar,ddf,ncase,a01,a05.
end matrix.
get file=temp.
sort cases by top (a).
do if case=1.
compute f01=idf.f(a01,pvar,ddf).
compute f05=idf.f(a05,pvar,ddf).
compute fc01=(f01*pvar*(ncase-1)**2)/(ncase*(ddf+pvar*f01)).
compute fc05=(f05*pvar*(ncase-1)**2)/(ncase*(ddf+pvar*f05)).
print space.
print /'Critical values (Bonferroni) for a single multivar. outlier:'.
print space.
print /'  critical F(.05/n) ='fc05 (f5.2)'  df ='pvar (f3)','ddf (f4).
print /'  critical F(.01/n) ='fc01 (f5.2)'  df ='pvar (f3)','ddf (f4).
print space.
print /'5 observations with largest Mahalanobis distances:'.
end if.
execute.
do if top < 6.
print /'  rank ='top (f2)'  case# ='case (f4)'  Mahal D sq ='dsq (f10.2).
end if.
execute.
compute chisq=idf.chisq((rnk-.5)/ncase,pvar).
graph /title="Plot of ordered squared distances"
/scatterplot(overlay)=dsq with chisq.
execute.
!enddefine.
*-----------------------------.

!normtest vars=x y z /.






Andy Wheeler once edited the above macro a bit:
* Some edits to this old syntax file. For the original, see
* http://www.columbia.edu/~ld208/normtest.sps.
* For discussion, see this Nabble thread:
* http://spssx-discussion.1045642.n5.nabble.com/errors-in-DeCarlo-s-macro-for-multivariate-normality-td5727898.html#a5727904.


*****************************************************************************
* Univariate and multivariate tests of skew and kurtosis, a list of the
* 5 cases with the largest Mahalanobis distances, a plot of the
* squared distances, critical values for a single multivariate outlier.
*
* from: DeCarlo, L. T. (1997). On the meaning and use of kurtosis.
*         Psychological Methods, 2, 292-307.
*
* To use the macro, one needs two lines, one to include the macro
* in the program, and the other to execute it. Open the data file, then
* type the commands in a syntax window as follows:
*
* include 'c:\spsswin\normtest.sps'.
* \* normtest vars=x1,x2,x3,x4 /. *\
*
* The first line includes the macro, which in this case is named
* normtest.sps and is located in the spsswin directory, and the
* second line invokes the macro for variables x1 to x4, for example.
* (variable names can be separated by spaces or commas)
*
* Updated 2002: the plot command of SPSS is replaced by graph
*
* Updated 11/97:
* This version uses a corrected two-pass algorithm to compute
* the variance, from Chan, T. F., Golub, G. H., & LeVeque, R. J.
* (1983). Algorithms for computing the sample variance: Analysis
* and recommendations. American Statistician, 37, 242-247.
* Fisher's g statistics are given.
* Mardia's p-value fixed (multiplied by 2), and the statistic is
* computed using the biased variance estimator, as in SAS & EQS
*****************************************************************************.
define !normtest (vars=!charend('/')).
matrix.
get x /variables=!vars /names=varnames /missing=omit.
compute n=nrow(x).
compute p=ncol(x).
compute s1=csum(x).
compute xbar=s1/n.
compute j=make(n,1,1).
compute xdev=x-j*xbar.
release x.
compute dev=csum(xdev).
compute devsq=(dev&*dev)/n.
compute ss=csum(xdev&*xdev).
* corrected two-pass algorithm.
compute m2=(ss-devsq)/n.
compute sdev=sqrt(m2).
compute m3=csum(xdev&**3)/n.
compute m4=csum(xdev&**4)/n.
compute sqrtb1=t(m3/(m2&*sdev)).
compute b2=t(m4/(m2&**2)).
compute g1=((sqrt(n*(n-1)))*sqrtb1)/(n-2).
compute g2=(b2-((3*(n-1))/(n+1)))*((n**2-1)/((n-2)*(n-3))).
******** quantities needed for multivariate statistics ********.
compute s=sscp(xdev)/(n-1).
compute sb=s*(n-1)/n.
compute sinv=inv(s).
compute d=diag(s).
compute dmat=make(p,p,0).
call setdiag(dmat,d).
compute sqrtdinv=inv(sqrt(dmat)).
compute corr=sqrtdinv*s*sqrtdinv.
*** principal components for Srivastava's tests ***.
call svd(s,u,q,v).
compute pc=xdev*v.
call svd(sb,aa,bb,cc).
compute pcb=(xdev*cc).
release xdev.
*** Mahalanobis distances ***.
compute sqrtqinv=inv(sqrt(q)).
compute stdpc=pc*sqrtqinv.
compute dsq=rssq(stdpc).
release stdpc.
compute sqrtbbi=inv(sqrt(bb)).
compute stdpcb=pcb*sqrtbbi.
compute dsqb=rssq(stdpcb).
release stdpcb.
**************** univariate skew and kurtosis *****************.
*** approximate Johnson's SU transformation for skew ***.
compute y=sqrtb1*sqrt((n+1)*(n+3)/(6*(n-2))).
compute beta2=3*(n**2+27*n-70)*(n+1)*(n+3)/((n-2)*(n+5)*(n+7)*
                 (n+9)).
compute w=sqrt(-1+sqrt(2*(beta2-1))).
compute delta=1/sqrt(ln(w)).
compute alpha=sqrt(2/(w*w-1)).
compute sub1=delta*ln(y/alpha+sqrt((y/alpha)&**2+1)).
compute psub1=2*(1-cdfnorm(abs(sub1))).
print {n}/title"Number of observations:" /format=f5.
print {p}/title"Number of variables:" /format=f5.
print {g1,sqrtb1,sub1,psub1}
  /title"Measures and tests of skew:"
  /clabels="g1","sqrt(b1)","z(b1)","p-value"
  /rnames=varnames /format=f10.4.
*** Anscombe & Glynn's transformation for kurtosis.
compute eb2=3*(n-1)/(n+1).
compute vb2=24*n*(n-2)*(n-3)/(((n+1)**2)*(n+3)*(n+5)).
compute stm3b2=(b2-eb2)/sqrt(vb2).
compute beta1=6*(n*n-5*n+2)/((n+7)*(n+9))*sqrt(6*(n+3)*(n+5)/
                 (n*(n-2)*(n-3))).
compute a=6+(8/beta1)*(2/beta1+sqrt(1+4/(beta1**2))).
compute zb2=(1-2/(9*a)-((1-2/a)/(1+stm3b2*sqrt(2/(a-4))))
             &**(1/3))/sqrt(2/(9*a)).
compute pzb2=2*(1-cdfnorm(abs(zb2))).
compute b2minus3=b2-3.
print {g2,b2minus3,zb2,pzb2}
  /title"Measures and tests of kurtosis:"
  /clabels="g2","b2-3","z(b2)","p-value"
  /rnames=varnames /format=f10.4.
compute ksq=sub1&**2+zb2&**2.
compute pksq=1-chicdf(ksq,2).
compute lm=n*((sqrtb1&**2/6)+(b2minus3&**2/24)).
compute plm=1-chicdf(lm,2).
print
  /title"Omnibus tests of normality (both chisq, 2 df):".
print {ksq,pksq,lm,plm}
  /title"  D'Agostino & Pearson K sq    Jarque & Bera LM test"
  /clabels="K sq","p-value","LM","p-value"
  /rnames=varnames /format=f10.4.
do if p>1.
print
  /title"*************** Multivariate Statistics ***************".
*** Small's multivariate tests ***.
compute uinv=inv(corr&**3).
compute uinv2=inv(corr&**4).
compute q1=t(sub1)*uinv*sub1.
* note: the variant of Small's kurtosis uses Anscombe & Glynn's.
* transformation in lieu of SU (A & G is simpler to program).
compute q2=t(zb2)*uinv2*zb2.
compute pq1=1-chicdf(q1,p).
compute pq2=1-chicdf(q2,p).
print /title"Tests of multivariate skew:".
print {q1,p,pq1}/title"  Small's test (chisq)"
  /clabels="Q1","df","p-value"/format=f10.4.
*** Srivastava's multivariate tests ***.
compute pcs1=csum(pc).
compute pcs2=csum(pc&**2).
compute pcs3=csum(pc&**3).
compute pcs4=csum(pc&**4).
release pc.
compute mpc2=(pcs2-(pcs1&**2/n))/n.
compute mpc3=(pcs3-(3/n*pcs1&*pcs2)+(2/(n**2)*(pcs1&**3)))/n.
compute mpc4=(pcs4-(4/n*pcs1&*pcs3)+(6/(n**2)*(pcs2&*(pcs1&**2)))
             -(3/(n**3)*(pcs1&**4)))/n.
compute pcb1=mpc3/(mpc2&**1.5).
compute pcb2=mpc4/(mpc2&**2).
compute sqb1p=rsum(pcb1&**2)/p.
compute b2p=rsum(pcb2)/p.
compute chib1=sqb1p*n*p/6.
compute normb2=(b2p-3)*sqrt(n*p/24).
compute pchib1=1-chicdf(chib1,p).
compute pnormb2=2*(1-cdfnorm(abs(normb2))).
print {chib1,p,pchib1}
  /title"  Srivastava's test"
  /clabels="chi(b1p)","df","p-value"/format=f10.4.
print /title"Tests of multivariate kurtosis:".
print {q2,p,pq2}
  /title"  A variant of Small's test (chisq)"
  /clabels="VQ2","df","p-value"/format=f10.4.
print {b2p,normb2,pnormb2}
  /title"  Srivastava's test"
  /clabels="b2p","N(b2p)","p-value"/format=f10.4.
*** Mardia's multivariate kurtosis ***.
compute b2pm=csum(dsqb&**2)/n.
compute nb2pm=(b2pm-p*(p+2))/sqrt(8*p*(p+2)/n).
compute pnb2pm=2*(1-cdfnorm(abs(nb2pm))).
print {b2pm,nb2pm,pnb2pm}
  /title"  Mardia's test"
  /clabels="b2p","N(b2p)","p-value"/format=f10.4.
compute q3=q1+q2.
compute q3df=2*p.
compute pq3=1-chicdf(q3,q3df).
print /title"Omnibus test of multivariate normality:".
print {q3,q3df,pq3}
  /title"  (based on Small's test, chisq)"
  /clabels="VQ3","df","p-value"/format=f10.4.
end if.
compute cse={1:n}.
compute case=t(cse).
compute rnk=rnkorder(dsq).
compute top=(n+1)-rnk.
compute pvar=make(n,1,p).
compute ddf=make(n,1,(n-p-1)).
compute ncase=make(n,1,n).
compute a01=make(n,1,(1-.01/n)).
compute a05=make(n,1,(1-.05/n)).
compute mahal={case,rnk,top,dsq,pvar,ddf,ncase,a01,a05}.
save mahal /outfile=*
  /variables=case,rnk,top,dsq,pvar,ddf,ncase,a01,a05.
end matrix.
dataset name temp.
dataset activate temp.
sort cases by top (a).
do if case=1.
compute f01=idf.f(a01,pvar,ddf).
compute f05=idf.f(a05,pvar,ddf).
compute fc01=(f01*pvar*(ncase-1)**2)/(ncase*(ddf+pvar*f01)).
compute fc05=(f05*pvar*(ncase-1)**2)/(ncase*(ddf+pvar*f05)).
print space.
print
  /'Critical values (Bonferroni) for a single multivar. outlier:'.
print space.
print
  /'  critical F(.05/n) ='fc05 (f5.2)'  df ='pvar (f3)','ddf (f4).
print
  /'  critical F(.01/n) ='fc01 (f5.2)'  df ='pvar (f3)','ddf (f4).
print space.
print /'5 observations with largest Mahalanobis distances:'.
end if.
execute.
do if top < 6.
print
  /'  rank ='top (f2)'  case# ='case (f4)'  Mahal D sq ='dsq (f10.2).
end if.
execute.
compute chisq=idf.chisq((rnk-.5)/ncase,pvar).
graph
  /title="Plot of ordered squared distances"
  /scatterplot (overlay)=dsq with chisq.
execute.
!enddefine.


*Example use.
dataset close all.
output close all.
MATRIX.
SAVE {UNIFORM(30,4)} /OUTFILE = * /VARS = X1 TO X4.
END MATRIX.
dataset name data.
preserve.
set mprint on.
*uncomment to run.
/* !normtest vars=X1,X2,X3,X4 /.
restore.


Re: Multivariate Normality Test in SPSS

Jeff A
In reply to this post by Bruce Weaver
Interesting quote by Box - I have that one saved. It's interesting to me how
many folks mistakenly believe that IVs and DVs have to be normally
distributed for OLS regression, that missing data have to be MAR, and I
don't know how many other similar things that are misleading at best and
often simply wrong. I recently questioned a grad student of mine who had
applied a complex, hard-to-follow transformation to a variable, only to get
results nearly identical to simple OLS with the variables in their original
metric. The discussion of the findings merely noted whether something was
positively or negatively related to the outcome.
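
For what it's worth, the usual normality assumption in OLS concerns the errors rather than the raw IVs or DVs, so the residuals are the thing to look at. A minimal sketch, assuming a hypothetical outcome y and predictors x1 and x2 (placeholder names, not from this thread):

```spss
* Hypothetical variables: y is the outcome, x1 and x2 are predictors.
regression
  /dependent=y
  /method=enter x1 x2
  /residuals=histogram(zresid) normprob(zresid)
  /scatterplot=(*zresid,*zpred).
```

The histogram and normal probability plot of the standardized residuals give a rough visual check of the error distribution, and the residual-versus-predicted scatterplot helps with judging linearity and constant variance.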

Jeff



Re: Multivariate Normality Test in SPSS

bdates
Bruce might remember, but the paragraph is related to Box's famous quote, "All models are wrong, but some are useful." This is clearly a case of a useful model, even though it's incorrect.

Brian


Re: Multivariate Normality Test in SPSS

Bruce Weaver
Administrator
I did not remember seeing that particular version of "all models are wrong"
in the Science and Statistics article, but I decided to check just now to be
sure.  "all models are wrong" appears twice.  

--- start of excerpt ---

2.3 Parsimony

Since all models are wrong the scientist cannot obtain a "correct" one by
excessive elaboration. On the contrary following William of Occam he should
seek an economical description of natural phenomena. Just as the ability to
devise simple but evocative models is the signature of the great scientist
so overelaboration and overparameterization is often the mark of
mediocrity.

2.4 Worrying Selectively

Since all models are wrong the scientist must be alert to what is
importantly wrong. It is inappropriate to be concerned about mice when there
are tigers abroad.

--- end of excerpt ---

I had forgotten that one about "mice when there are tigers abroad".  It
reminds me of what Box said about testing for homogeneity of variance prior
to ANOVA or a t-test:


“To make the preliminary test on variances [before running a t-test or
ANOVA] is rather like putting to sea in a rowing boat to find out whether
conditions are sufficiently calm for an ocean liner to leave port!”

    George Box, Biometrika 1953;40:318–35.
 

I love that, but I would add that the more variable the sample sizes are,
the more likely I would be to use a test that is robust to heterogeneity of
variances (e.g., Satterthwaite t-test, Welch or Brown-Forsythe F-test if
using ONEWAY, etc.).  
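
As a rough illustration of the robust alternative mentioned above, here is a
minimal Python sketch of the unequal-variance (Welch) t statistic with
Satterthwaite's approximate degrees of freedom. The function name `welch_t`
and the sample data are made up for illustration; this is not SPSS syntax.
SPSS reports the equivalent result in the "Equal variances not assumed" row
of its T-TEST output.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    # Test statistic: no pooled variance, so unequal spreads are allowed.
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical data: unequal group sizes and clearly unequal spreads.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12, 14]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The approximate degrees of freedom always fall between the smaller group's
n - 1 and the pooled n1 + n2 - 2, so the correction matters most when both
the sample sizes and the variances differ.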



bdates wrote
> Bruce might remember, but the paragraph is related to Box's famous quote,
> "All models are wrong, but some are useful." This is clearly a case of a
> useful model, even though it's incorrect.
>
> Brian





--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Multivariate Normality Test in SPSS

Art Kendall
All models are reasoning by analogy. In other words, they are imperfect and often
useful.  (Models are like the shadows in Plato's cave.)
Early models treated planetary orbits as circles, which was right *to some
degree*.

For the early Egyptians, treating the delta land as flat (a plane) resulted in
good enough restoration of property boundaries.

"Taller people are heavier than shorter people" is true as an overall
description.

Ockham's razor is commonly useful.


Aristotle pointed out the fallacy of precision.



-----
Art Kendall
Social Research Consultants

Re: Multivariate Normality Test in SPSS

Art Kendall
Committing the invidious median split is an instance of an overly simple
measurement model.




Re: Multivariate Normality Test in SPSS

Alejandro González Heras
Dear community,

What an interesting topic!! You have all made so many interesting points. I have noted down some of the quotes/readings you mention so that I can look into them further.

I have been working with regressions lately, and since I "need" the "model to be valid" I have been dealing with the assumptions. After your comments, I may look at the expression "goodness of fit" with different eyes.

I have been reading Andy Field (who is quite funny) for general reading and, after that, Andrew Gelman for regression (who makes very good points about causality and prediction, by the way).

So, having said that, the purpose of this message is, if I may, to draw on your knowledge: what other readings/articles would you recommend for multivariate analysis (in social science)? Of course, there are too many techniques to fit in one book. I am especially interested in regression (logistic and multilevel) and factor/cluster analysis.


PS: I just noticed that today is Saint George's Day. I don't know about you, but here in Spain people give roses to their loved ones and also give them a book. I don't know if I am breaking tradition by 'requesting' a reading. Apologies to St George for that. As for all of you, let me send you a rose (link to imgbb.com): https://ibb.co/CtdgJ1H

All the best,
Alejandro




Re: Multivariate Normality Test in SPSS

Bruce Weaver
Administrator
Hola Alejandro, y feliz Diada de Sant Jordi.  (I hope I have expressed it
correctly!)  

This article by David Freedman might interest you:

Freedman, D. A. (1991). Statistical models and shoe leather. Sociological
Methodology, 21, 291-313.

If you have institutional access to JSTOR, you can get it here:  

https://www.jstor.org/stable/270939?seq=1#metadata_info_tab_contents

Cheers,
Bruce




Re: Multivariate Normality Test in SPSS

Alejandro González Heras
I do have access.

Thank you very much Bruce!! :)


