|
Hello Listers, Is there a way in SPSS to do a t-test for independent samples when I only have the mean, SD and sample size of the two groups (not the actual data)? If so, how? Thanks in advance.
Jean Hanson |
|
|
In reply to this post by Jean Hanson
There is certainly some way to get SPSS to do such an analysis, but why?
The simplest approach is to get the formula from any handy undergrad stat book and use a hand calculator.
Michael
**************************************************** Michael Granaas [hidden email] Assoc. Prof. Phone: 605 677 5295 Dept. of Psychology FAX: 605 677 3195 University of South Dakota 414 E. Clark St. Vermillion, SD 57069 *****************************************************
From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Jean Hanson [[hidden email]] Sent: Monday, January 25, 2010 3:14 PM To: [hidden email] Subject: t-test with summary data |
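For readers who want to see the textbook calculation spelled out, here is a minimal sketch of the pooled-variance t statistic computed from means, SDs and Ns. It is written in Python rather than SPSS syntax, the function name `pooled_t` is hypothetical, and the example numbers are borrowed from the first data row posted later in this thread. It assumes equal population variances; the p-value would still have to be looked up in a t table or obtained from other software.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t from summary data (equal variances assumed)."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their df
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, se

# Illustrative input: first row of the summary data posted later in the thread
t, df, se = pooled_t(187.643, 38.098, 14, 235.929, 54.286, 14)
print(f"t = {t:.3f}, df = {df}, SE of difference = {se:.3f}")
```

The same three lines of arithmetic are what a hand calculator would do; the function only saves retyping when several pairs of groups have to be checked.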
|
Administrator
|
In reply to this post by Dale Glaser
But also note Marta's comment at the top of that file:
* I send you some syntax to perform a T test for independent samples with summary data. Although you already have a solution for that (with matrix data input in ANOVA), this method is far more complete:

Here's the less complete, but possibly easier method (i.e., one-way ANOVA with matrix input):
http://www.spsstools.net/Syntax/T-Test/DoT-TestWithOnlyMeansSDandNs.txt

If you need to report it as a t-test rather than an F-test, t = SQRT(F). This holds because there are only two groups, so F has 1 numerator df; SQRT(F) gives the absolute value of t, and the sign follows the direction of the difference between the means.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
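The t = SQRT(F) relationship mentioned above can be checked numerically. The following is a small Python sketch (not SPSS syntax, and not the code at the linked page); the function name `anova_f_two_groups` is hypothetical and the input numbers are taken from the first data row posted later in this thread. It rebuilds the one-way ANOVA sums of squares from the summary statistics and takes the square root of F.

```python
import math

def anova_f_two_groups(mean1, sd1, n1, mean2, sd2, n2):
    """One-way ANOVA F for two groups, computed from summary data."""
    n_total = n1 + n2
    grand_mean = (n1 * mean1 + n2 * mean2) / n_total
    # Between-groups sum of squares (1 df when there are two groups)
    ss_between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
    # Within-groups sum of squares, rebuilt from the two SDs
    ss_within = (n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
    df_within = n_total - 2
    f = (ss_between / 1) / (ss_within / df_within)
    return f, df_within

f, df_within = anova_f_two_groups(187.643, 38.098, 14, 235.929, 54.286, 14)
# SQRT(F) is |t|; the sign is taken from the direction of mean1 - mean2
t_from_f = math.copysign(math.sqrt(f), 187.643 - 235.929)
print(f"F(1, {df_within}) = {f:.3f}, t({df_within}) = {t_from_f:.3f}")
```

The result matches the pooled-variance (equal variances assumed) t-test only; the ANOVA route does not give the unequal-variances version.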
|
In reply to this post by Granaas, Michael
Dale Glaser provided a link on the spsstools.net website which has some code written, I believe, by Marta GG. My reaction to all of that code was two-fold:

(1) couldn't the t-test itself be computed with less code (I admit that the available code does provide more info than the t-test result alone)

and

(2) it reminds me of NASA's attempt to come up with a ball point pen that would write under conditions of zero gravity; they spent some ridiculous amount of money (over a million $ or so) while the Russian space program solved the problem by providing cosmonauts with pencils.

Doing a t-test using summary information is one of the problems I require undergraduates to do with a hand calculator in my statistics classes and I imagine anyone with any statistical training should be able to do the same. I cannot imagine why one would use SPSS to do it unless one has a larger number of cases (i.e., pairs of means, SDs, and Ns) or would have to do so repetitively over time, in which case the spsstools code would be quite useful. Still, for a single case, it probably took longer to write the code than to solve the problem by hand.

-Mike Palij
New York University
|
|
In reply to this post by Granaas, Michael
I agree that under many circumstances it is easier to do this by hand.
I too show and have students use hand methods to develop understanding.
On the other hand, if one's work is part of an audit, evaluation, or forensic study, the quality assurance "referencer's" (reviewer's) job is a lot easier if standard software such as SPSS is used.

So that they are exposed to good work habits, I usually have each student develop a set of syntax and review one or two other students' approaches. I encourage users at all levels to use the QA approach of having someone else go over the syntax and output, even for homework.

It also makes the analyst's job a lot easier when (s)he inevitably goes back and refines or corrects part of the whole job approach. With today's machines it is not usually very time consuming to revisit the whole analysis so that there is a single listing where all/most of the warnings, etc., on earlier runs are corrected or the situations are specifically addressed in the final syntax stream (e.g., if system-missing values are created on a run, later runs have as much of the missing data as possible be user-missing, and warnings are commented on right in the syntax).

On 1/25/2010 4:23 PM, Granaas, Michael wrote:
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants |
|
I'd like to make the following points concerning the use of SPSS for "problems like a t-test result from summary data".

(1) The syntax programming by Marta is remarkably complete and comes close to duplicating the results from the SPSS t-test procedure. As a test of a student's ability to write SPSS syntax, I think that this is a good test of how knowledgeable a student is about SPSS syntax and conventions -- asking a student to re-create this code would be a worthy exercise. A better exercise would be to extend this to allow analysis of multiple "cases" (i.e., pairs of means, SDs, and Ns).

(2) I agree that it is important to document one's work, especially SPSS code, to allow inspection by "independent parties" as well as checking for problems in analysis. I always wondered how many people went back and double-checked their statistical analyses after Leland Wilkinson and Gerry Dallal (1977) published their famous paper on underflow/overflow errors in the calculations by mainframe statistical packages that would lead to erroneous means, variances, correlations, etc. (SPSS was one of the packages that failed their test; BMDP, a favorite of mine at the time, passed because it read through the data twice so its calculations even in single precision were correct -- double-precision for calculations and other modifications became necessary after that point). For people with www.jstor.org access, the stable URL for the article is:
http://www.jstor.org/stable/2682964

How many dissertations and research reports unknowingly had incorrect results? (This might have been avoided if people had followed the old advice of running any serious analysis through two different statistical packages to see if they produced the same results.)

By the way, the issue of accuracy of statistical software has not gone away (just look at the issues associated with Excel), as documented in part by McCullough; again, for those with JSTOR access, see:
http://www.jstor.org/stable/2685736
and for McCullough on Excel, see:
www.forecastingprinciples.com/files/McCullough.pdf

(3) One of the big problems with Marta's syntax, and one which limits its usefulness, is that it can only be applied to a single "case", that is, one pair of means, SDs, and sample sizes. One really has to ask why such a situation has come up. I can understand writing this syntax if one is dealing with a bunch of "cases", say, data extracted from printed sources that only provide summary data, and one wants to get t-test results across these "cases". Such an activity may occur in the context of a quantitative summary of results in an area (though one might then ask why one isn't doing a proper meta-analysis after a systematic review of the lit). For a single "case", if one is lazy or incurious, surely someone has written a java applet that's available on some website that can produce the same result?

My concern is that people with limited knowledge of statistics will actually think this is a good use of SPSS instead of realizing that "hey, I can do this by hand". Again, the question is what do you want to use in outer space under zero gravity: a ballpoint pen or a pencil?

-Mike Palij
New York University
|
|
Hi everybody:

I just came back from my classes and had a funny time witnessing all the fuss about that code I wrote years ago. Some comments on it:

1) First of all, it was written as a simple exercise of programming, the challenge of being able to mimic SPSS T test output with summary data. I enjoyed it. Period.

2) Of course a hand calculator, Excel, or a cute little freeware program called Simcalc (I recommend it to my students) can be used for the task, but this code was designed as a very simple tool for some people at the University Clinic of Navarra, who, while working as reviewers of scientific papers, sometimes wanted to check the result of a t test WITHOUT having to use a calculator (they are simply not in the mood for playing with numbers by hand).

3) Although computing the t statistic is fairly easy (agreed), remembering to also compute a homogeneity of variances (HOV) test, and adjusting the degrees of freedom by the Brown & Forsythe formula if the HOV condition is not met, is NOT as easy (I don't remember the formula by heart, for instance).

4) I always tell my students that 95%CI are as important as, or even MORE important than, simple p-values. If people are a bit lazy, the 95%CI will not be computed by hand.

5) The code can be very easily modified to work with several rows of data (see below). Perhaps my mistake when I first posted the code was assuming that such a simple task was no challenge for any syntax writer, even with no MACRO programming knowledge.

Modified code:

- First, save everything from MATRIX to END MATRIX to disc (I have used 'C:/Temp/MATRIX T Test.sps' as destination). Just before the last line, I have added this line:

PRINT /TITLE='-------------------------------------'.

It has cosmetic purposes (separating one analysis from the next).

- Second, modify slightly the dataset (add a new variable called 'TestNr', and split the file by it):

data list list /TestNr(F8) mean1(f8.3) sd1(F8.3) n1(F8.0) mean2(f8.3) sd2(F8.3) n2(F8.0).
begin data
1 187.643 38.098 14 235.929 54.286 14
2 187.643 38.098 20 235.929 54.286 20
3 187.643 38.098 10 235.929 54.286 10
4 187.643 38.098 45 235.929 54.286 45
end data.
SPLIT FILE SEPARATE BY TestNr.

- Now, include C:/Temp/MATRIX T Test.sps:

INCLUDE FILE='C:/Temp/Matrix T Test.sps'.

- Finally, the last portion of the code (the report) is run, after un-splitting the file:

* Computation of exact (non asymptotic) 95%CI for diff *.
SPLIT FILE OFF.
COMPUTE low1 = diff -eedif1* IDF.T(0.975,df1) .
COMPUTE upp1 = diff +eedif1* IDF.T(0.975,df1) .
COMPUTE low2 = diff -eedif2* IDF.T(0.975,df2) .
COMPUTE upp2 = diff +eedif2* IDF.T(0.975,df2) .
FORMAT TestNr(F8).
REPORT FORMAT=LIST AUTOMATIC ALIGN(CENTER)
 /VARIABLES=TestNr low1 upp1
 /TITLE "95%CI for diff assuming equal variances".
REPORT FORMAT=LIST AUTOMATIC ALIGN(CENTER)
 /VARIABLES=TestNr low2 upp2
 /TITLE "95%CI for diff not assuming equal variances".

Now the code can be used for as many rows of data as you want.

HTH,
Marta GG

--
For miscellaneous SPSS related statistical stuff, visit: http://gjyp.nl/marta/ |
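As a rough cross-check on what the unequal-variances part of such an analysis involves, here is a minimal Python sketch (not SPSS syntax, and not Marta's MATRIX code) that loops over the four data rows above. Two things in it are assumptions: the degrees of freedom use the Welch-Satterthwaite approximation, which may differ from the adjustment used in the original syntax, and the critical value 2.056 for df = 26 is typed in from a printed t table rather than computed the way IDF.T does it. The names `welch_t` and `ci_for_diff` are hypothetical.

```python
import math

# (TestNr, mean1, sd1, n1, mean2, sd2, n2), copied from the data rows above
ROWS = [
    (1, 187.643, 38.098, 14, 235.929, 54.286, 14),
    (2, 187.643, 38.098, 20, 235.929, 54.286, 20),
    (3, 187.643, 38.098, 10, 235.929, 54.286, 10),
    (4, 187.643, 38.098, 45, 235.929, 54.286, 45),
]

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t, SE and approximate df when equal variances are NOT assumed."""
    v1 = sd1**2 / n1  # squared standard error of mean 1
    v2 = sd2**2 / n2  # squared standard error of mean 2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed here, see the note above)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, se, df

def ci_for_diff(diff, se, t_crit):
    """Confidence interval for the difference, given a critical t value."""
    return diff - t_crit * se, diff + t_crit * se

results = {}
for test_nr, *summary in ROWS:
    t, se, df = welch_t(*summary)
    results[test_nr] = (t, se, df)
    print(f"Test {test_nr}: t = {t:.3f}, SE = {se:.3f}, df = {df:.2f}")

# 95%CI for row 1 assuming equal variances (df = 14 + 14 - 2 = 26).
# With equal group sizes the pooled SE equals the SE computed above.
T_CRIT_DF26 = 2.056  # two-sided 5% critical value from a t table (assumption)
low, upp = ci_for_diff(187.643 - 235.929, results[1][1], T_CRIT_DF26)
print(f"Test 1: 95%CI for diff = ({low:.2f}, {upp:.2f})")
```

Because the adjusted df are not whole numbers, the critical values for the unequal-variances intervals cannot simply be read from a table, which is one practical argument for letting software compute them.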
