SPSSINC SUMMARY TTEST appears to use critical z when computing CI

Bruce Weaver
Administrator
Hello folks.  While working on a tutorial to have students compute an unpaired t-test "by hand" using SPSS, I decided to tack on at the end code for SPSSINC SUMMARY TTEST.  The 95% CI for the mean difference from my own "hand" calculations matched the CI from T-TEST carried out on the raw data; but the CI from SPSSINC SUMMARY TTEST did not match.  That made me wonder if SPSSINC SUMMARY TTEST might be using a critical z instead of a critical t when computing the CI.  When I did my own "hand" calculation using a critical z, I got a CI matching very closely that from SPSSINC SUMMARY TTEST.  Full details are provided below in the syntax.  Perhaps whoever maintains SPSSINC SUMMARY TTEST could take a look at the underlying code.  (Or if this has already been fixed, perhaps some of us need to reinstall SPSSINC SUMMARY TTEST?)  

Cheers,
Bruce

* ================================================================ .
*  File:    Test_SPSSINC_SUMMARY_TTEST.sps
*  Date:    10-Aug-2022
*  Author:  Bruce Weaver, bweaver@lakeheadu.ca
* ================================================================ .

* Note:  This code was developed using SPSS 28.0.1.1(14) for Windows.

NEW FILE.
DATASET CLOSE ALL.

* Let's use the hotdog example from this document:
* https://www.statstutor.ac.uk/resources/uploaded/unpaired-t-test.pdf.

DATA LIST FREE / Calories (F5.0).
BEGIN DATA
186, 181, 176, 149, 184, 190, 158, 139, 175, 148, 152, 111, 141, 153, 190, 157, 131, 149, 135, 132,
129, 132, 102, 106, 94, 102, 87, 99, 170, 113, 135, 142, 86, 143, 152, 146, 144
END DATA.
DATASET NAME raw.
COMPUTE Group = ($CASENUM GT 20) + 1.
FORMATS Group(F1).
VALUE LABELS Group 1 "Beef" 2 "Poultry".
MEANS Calories BY Group.

* Use OMS to direct results from the MEANS command  
* to a new dataset called Means.

* OMS.
DATASET DECLARE Means.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Means'] SUBTYPES=['Report']
  /DESTINATION FORMAT=SAV NUMBERED=Table
     OUTFILE='Means' VIEWER=YES.

* Use a Means command after your OMS command to generate
* the mean and SD for calories by group.
* Follow that command with OMSEND to turn off OMS.

MEANS calories BY Group.
OMSEND.

* Activate the Means dataset & compute the t-test "by hand".

DATASET ACTIVATE Means.
LIST.
* Rename variable Std.Deviation to SD.
RENAME VARIABLES (Std.Deviation = SD).
* Drop the Total row.
SELECT IF Var1 NE "Total".
LIST.
* Compute a new variable to flag the last row (where the Poultry mean is shown).
COMPUTE lastrow = Var1 EQ "Poultry".
FORMATS lastrow (F1).
* Let variable SS = the sum of squares.
* HINT:  Variance = SS/(n-1), and SD = SQRT(Variance).
COMPUTE SS = (n-1)*SD**2.
LIST.

* Use a DO IF - END IF structure to create the following variables
* on the last row.
DO IF lastrow.
 * Let MeanDiff = the Beef mean minus the Poultry mean.
 COMPUTE MeanDiff = LAG(Mean) - Mean.
 * Let SSwg = SS1 + SS2.
 COMPUTE SSwg = LAG(SS) + SS.
 * Let n1 = n for the Beef group & n2 = n for the Poultry group.
 COMPUTE n1 = LAG(N).
 COMPUTE n2 = N.
 * Let df = n1 + n2 - 2.
 COMPUTE df = n1 + n2 - 2.
END IF.

COMPUTE SDpooled = SQRT(SSwg/df).
COMPUTE SE = SDpooled*SQRT(1/n1 + 1/n2).
COMPUTE delta0 = 0.
COMPUTE tobs = (MeanDiff-delta0)/SE.
COMPUTE pval = 2*CDF.T(-ABS(tobs),df).
COMPUTE tcrit = IDF.T(.975,df).
COMPUTE LL95 = MeanDiff - SE*tcrit.
COMPUTE UL95 = MeanDiff + SE*tcrit.
FORMATS MeanDiff SE tobs pval LL95 UL95 tcrit (F8.4) / df (F5.0).
LIST N Mean SD MeanDiff SE tobs df pval.
LIST tcrit MeanDiff LL95 UL95.

* Now use the T-TEST command with the raw data.
DATASET ACTIVATE raw.
T-TEST GROUPS=Group(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=calories
  /ES DISPLAY(FALSE)
  /CRITERIA=CI(.95).

* Finally, use the SPSSINC SUMMARY TTEST module.

DATASET ACTIVATE Means.
FORMATS Mean SD (F8.5).
LIST Var1 N Mean SD.
 
SPSSINC SUMMARY TTEST
 N1=20 MEAN1=156.8500 SD1=22.64201 LABEL1="Beef"
 N2=17 MEAN2=122.4706 SD2=25.48313 LABEL2="Poultry" CI=95.

* NOTE that the T-TEST command using raw data and my "by hand"
* calculations both show the 95% CI as 18.318 to 50.441.
* But SPSSINC SUMMARY TTEST shows it as 18.873 to 49.885.

* Q. Does SPSSINC SUMMARY TTEST use a critical z instead of a critical t?.
COMPUTE zcrit = PROBIT(0.975).
COMPUTE LL95z = MeanDiff - SE*zcrit.
COMPUTE UL95z = MeanDiff + SE*zcrit.
FORMATS zcrit LL95z UL95z (F8.3).
LIST zcrit LL95z UL95z.
* A. Yes, it does seem to be using a critical value of z
* rather than a critical value of t.
* I have not checked what the unequal variances version does,
* nor have I looked at the so-called exact CIs.

* ================================================================ .
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: SPSSINC SUMMARY TTEST appears to use critical z when computing CI

jkpeck
Using your data, I ran the standard t test procedure and SUMMARY TTEST.  The results were exactly the same up to 5 significant figures.

Note that SUMMARY TTEST shows two sets of CI numbers.  The first set, labelled Asymptotic, is based on the normality assumption, but the second set, labelled Exact, matches the standard t test.  Both are displayed for didactic reasons.
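
To see the two intervals side by side without the extension, here is a minimal sketch that works from the summary statistics alone. It assumes the Asymptotic interval uses a standard normal critical value with the pooled standard error (which is what the figures in the first post suggest, not something confirmed from the extension's code) and that the Exact interval uses a critical t with n1 + n2 - 2 degrees of freedom. The dataset name and variable names are illustrative only.

```spss
* Sketch: asymptotic (z) and exact (t) 95% CIs from summary statistics.
* Equal variances are assumed; all names below are illustrative.
DATA LIST FREE / n1 mean1 sd1 n2 mean2 sd2.
BEGIN DATA
20 156.8500 22.64201 17 122.4706 25.48313
END DATA.
DATASET NAME SummaryStats.

* Pooled SD and the standard error of the mean difference.
COMPUTE df = n1 + n2 - 2.
COMPUTE SDpooled = SQRT(((n1-1)*sd1**2 + (n2-1)*sd2**2)/df).
COMPUTE SE = SDpooled*SQRT(1/n1 + 1/n2).
COMPUTE MeanDiff = mean1 - mean2.

* Critical values: standard normal (assumed for Asymptotic) and t (Exact).
COMPUTE zcrit = IDF.NORMAL(.975,0,1).
COMPUTE tcrit = IDF.T(.975,df).

COMPUTE LLasym = MeanDiff - SE*zcrit.
COMPUTE ULasym = MeanDiff + SE*zcrit.
COMPUTE LLexact = MeanDiff - SE*tcrit.
COMPUTE ULexact = MeanDiff + SE*tcrit.
FORMATS zcrit tcrit LLasym ULasym LLexact ULexact (F8.3).
LIST zcrit tcrit LLasym ULasym LLexact ULexact.
```

If those assumptions hold, the asymptotic limits should come out close to the 18.873 to 49.885 interval reported in the first post, and the exact limits close to the 18.318 to 50.441 interval from T-TEST.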

Re: SPSSINC SUMMARY TTEST appears to use critical z when computing CI

Bruce Weaver
Administrator
Ah, I see.  I just did not understand what was intended by "asymptotic" and "exact".  And if I had looked more carefully at the "exact" CI, I would have noticed that it matched the ordinary (equal variances) CI.  Thanks Jon.  


