t-test on a table


t-test on a table

John F Hall

Is there a way in SPSS of running t-test on, or generating a data set from, an actual table?

 

Fictitious data on cancer survival rates in mean days:

 

                N     Mean days    sd
Treatment A     25    186          11
Treatment B     25    203          13

 

Associated question:

In a population with a certain type of cancer, the failure rate of treatment X is 15%.  Researchers wish to draw 2 equal size samples; one will receive treatment X the other treatment Y.  How large do the two samples need to be to reduce the failure rate to 12% at 0.05 sig and 0.95 confidence?

 

A sample of 1000 receives treatment Z, with a failure rate reduced to 10%: is this significant?

 

John F Hall MA (Cantab) Dip Ed (Dunelm)

IBM-SPSS Academic Author 9900074

 

Email: [hidden email]

Website: Journeys in Survey Research

Course: Survey Analysis Workshop (SPSS)

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: t-test on a table

Jon Peck
CTABLES compare column means test could be used here.  Or, if you just have the aggregates, you could use the SPSSINC SUMMARY TTEST extension command, which can be installed from the Extensions > Extension Hub menu.  Here is the syntax for your example.

SPSSINC SUMMARY TTEST N1=25 MEAN1=186 SD1=11 LABEL1="Treatment A" N2=25 MEAN2=203 SD2=13
    LABEL2="Treatment B" CI=95.

It does the test with and without the equal variance assumption and provides asymptotic and exact confidence intervals.

Spoiler alert: in this example the difference is highly significant.

On Sat, May 2, 2020 at 2:09 AM John F Hall <[hidden email]> wrote:



--
Jon K Peck
[hidden email]


Re: t-test on a table

Bruce Weaver
Administrator
In reply to this post by John F Hall
Re the first question, use ONEWAY with matrix input and compute the square
root of F if you need to report it as a t-test.

* Oneway ANOVA using summary data.
DATA LIST LIST / ROWTYPE_ (a8) grp (f5.0) VARNAME_ (a15) Y (f8.2) .
BEGIN DATA
"MEAN"     1  "Treatment A"   186
"STDDEV"   1  "Treatment A"   11
"N"        1  "Treatment A"   25
"MEAN"     2  "Treatment B"   203
"STDDEV"   2  "Treatment B"   13
"N"        2  "Treatment B"   25
END DATA.

ONEWAY Y BY grp
  /MATRIX = IN(*)
  /STATISTICS = DESCRIPTIVES WELCH BROWNFORSYTHE.

* If you need to report it as a t-test, t = SQRT(F).
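
As a cross-check outside SPSS, the same comparison can be computed directly from N, mean, and SD. What follows is a minimal Python sketch, not part of the original post; the function name `summary_t_test` and its return keys are made up for illustration. It computes the pooled-variance t, the Welch t with its Satterthwaite df, and F = t squared, which is what a two-group one-way ANOVA reports.

```python
import math


def summary_t_test(n1, m1, s1, n2, m2, s2):
    """Two-sample t statistics from summary data (N, mean, SD) only."""
    diff = m1 - m2

    # Equal variances assumed: pool the two sample variances.
    df_pooled = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df_pooled
    t_pooled = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))

    # Equal variances not assumed: Welch t with Satterthwaite df.
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t_welch = diff / math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    return {
        "t_pooled": t_pooled,
        "df_pooled": df_pooled,
        "F": t_pooled ** 2,  # two-group ANOVA F equals pooled t squared
        "t_welch": t_welch,
        "df_welch": df_welch,
    }


# Summary data from the table in the original question.
res = summary_t_test(25, 186, 11, 25, 203, 13)
for name, value in res.items():
    print(f"{name}: {value:.4f}")
```

For the table in the question this gives t of about -4.99 on 48 df (F of about 24.9), well beyond the two-sided 5% critical value of roughly 2.01. Because the two groups have equal N, the Welch t equals the pooled t and only the df differ.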


I'm not sure I understand the "associated question" you posted.  Are you
asking what sample size is needed to detect the difference between 15% and
12% (assuming equal sample sizes in the two groups)?  If so, I don't have
access to the SPSS module for sample size estimation.  But using Stata,
here's what I get.  Change to a fixed font to make the table line up
properly.  

. power twoproportions 0.15 0.12, test(chi2) power(0.8 0.9 0.95)

Performing iteration ...

Estimated sample sizes for a two-sample proportions test
Pearson's chi-squared test
Ho: p2 = p1  versus  Ha: p2 != p1

  +-----------------------------------------------------------------+
  |   alpha   power       N      N1      N2   delta      p1      p2 |
  |-----------------------------------------------------------------|
  |     .05      .8   4,072   2,036   2,036    -.03     .15     .12 |
  |     .05      .9   5,450   2,725   2,725    -.03     .15     .12 |
  |     .05     .95   6,740   3,370   3,370    -.03     .15     .12 |
  +-----------------------------------------------------------------+
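
For anyone without Stata, the per-group sizes in the table can be approximated with the usual normal-approximation formula for comparing two independent proportions. This is a sketch under that assumption, not Stata's actual code; the function name `n_per_group` is hypothetical, and the formula uses the pooled proportion under the null and the separate proportions under the alternative.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group N for a two-sided test of two proportions (equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    # Round up: a fractional subject is not possible.
    return math.ceil(numerator / (p1 - p2) ** 2)


# Failure rates from the question: 15% versus 12%.
sizes = {pw: n_per_group(0.15, 0.12, power=pw) for pw in (0.80, 0.90, 0.95)}
print(sizes)
```

Worked by hand, this gives 2,036, 2,725 and 3,370 per group for power 0.80, 0.90 and 0.95, which agrees with the N1 and N2 columns above.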


The final question about treatment Z is also unclear.  What do you want to
compare to the 10% of 1000?  

Cheers,
Bruce



-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: t-test on a table

John F Hall

Bruce’s solution is much better than mine.

 

I did a pretty clumsy workaround.

 

Generated a data set with 50 cases, 25 with sample =1 and 25 with sample = 2.

 

Couldn’t get exact Ns and SDs, but after about 40 attempts got pretty close.

 

if sample = 1 days = TRUNC(RV.UNIFORM(160, 201)).
if sample = 2 days = TRUNC(RV.UNIFORM(209, 253)).
means days by sample.

 

 

Report
days

sample   Mean     N    Std. Deviation
1        179.76   25   11.181
2        230.60   25   13.435
Total    205.18   50   28.443

 

then:

 

t-test groups sample (1,2)
  /MISSING=LISTWISE
  /VARIABLES=days
  /CRITERIA=CI(.90).

 

 

From: [hidden email] <[hidden email]>
Sent: 02 May 2020 13:01
To: 'Maddison hughes' <[hidden email]>
Subject: t-test again

 

Maddy

As close as I can get:

 


 

 

Group Statistics

        sample   N    Mean     Std. Deviation   Std. Error Mean
days    1        25   179.76   11.181           2.236
        2        25   230.60   13.435           2.687

 

0.90

Independent Samples Test (days)

                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F           Sig.        t         df       Sig. (2-tailed)  Mean Difference  Std. Error Difference  90% CI Lower  90% CI Upper
Equal variances assumed       1.674       .202        -14.543   48       .000             -50.840          3.496                  -56.703       -44.977
Equal variances not assumed                           -14.543   46.468   .000             -50.840          3.496                  -56.707       -44.973

 

0.95

Independent Samples Test (days)

                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F           Sig.        t         df       Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper
Equal variances assumed       1.674       .202        -14.543   48       .000             -50.840          3.496                  -57.869       -43.811
Equal variances not assumed                           -14.543   46.468   .000             -50.840          3.496                  -57.875       -43.805

 

 

 

 


 

 

 


Re: t-test on a table

Rich Ulrich
In reply to this post by John F Hall
As to generating variables with specific means and SDs -
Generate the variables with the shape of distribution that
you want.  For survival times, negative exponential might
be apt.

If the generating function doesn't allow you to specify mean and SD...
Then use Descriptives to z-score by group;
multiply, for each group separately, by its desired SD; and
add on the desired Mean.
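
The z-score-and-rescale recipe can be sketched in a few lines of Python (an illustration only, not SPSS syntax; the helper name `rescale`, the seed, and the unit-rate exponential draw are assumptions made for the example). Each group is standardised with its own sample mean and SD, then multiplied by the target SD and shifted to the target mean, so the sample statistics match the table exactly while the skewed shape is kept.

```python
import random
import statistics


def rescale(values, target_mean, target_sd):
    """Z-score the values, then rescale to the requested sample mean and SD."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [target_mean + target_sd * (v - m) / s for v in values]


rng = random.Random(2020)  # fixed seed so the example is repeatable

# Positively skewed raw draws, 25 per group (negative exponential shape).
raw_a = [rng.expovariate(1.0) for _ in range(25)]
raw_b = [rng.expovariate(1.0) for _ in range(25)]

days_a = rescale(raw_a, 186, 11)  # Treatment A targets
days_b = rescale(raw_b, 203, 13)  # Treatment B targets

for label, days in (("A", days_a), ("B", days_b)):
    print(label, len(days), round(statistics.mean(days), 3),
          round(statistics.stdev(days), 3))
```

Unlike repeated random draws, this hits the target mean and SD on the first attempt, because the rescaling is exact rather than approximate.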

Bruce handled the (obscurely worded) question about power.

--
Rich Ulrich


Re: t-test on a table

Bruce Weaver
Administrator
I think I read too quickly the first time, and didn't notice that the context
is survival times.  Rich is right on the money when he says the
distributions of survival times will be positively skewed.  If you did have
the raw data, you'd almost certainly want to use some kind of survival (or
time-to-event) analysis.  I cannot think of any way to do that from summary
statistics.

I suppose that if it was a situation where all subjects experienced the
"event", I might consider using quantile regression.  SPSS 26 has a new
QUANTILE REGRESSION command.  If you don't have v26, you can use an
R-extension command IIRC.  

HTH.



Re: t-test on a table

Rich Ulrich
This might be irrelevant, but, anyway, speaking as a suspicious
data analyst:

Pushing the "survival data" aspect a little more - John did say
that these were fictitious data, but the reported SDs are far too
small to be realistic if these were based on any proper survivorships.

A difference of more than 1.3 SD does not take a large N to detect.
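
To put a rough number on that, here is a small Python sketch (names such as `cohens_d` and `approx_n_per_group` are hypothetical). It computes the standardised difference using the pooled SD, then applies the common normal-approximation formula n = 2(z_alpha + z_beta)^2 / d^2 per group. The 80% power and two-sided 5% alpha are assumptions for the example, and the approximation runs slightly low when n is this small.

```python
import math
from statistics import NormalDist


def cohens_d(m1, s1, m2, s2):
    """Standardised mean difference using the pooled SD (equal group sizes)."""
    pooled_sd = math.sqrt((s1 ** 2 + s2 ** 2) / 2)
    return abs(m1 - m2) / pooled_sd


def approx_n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


d = cohens_d(186, 11, 203, 13)
n = approx_n_per_group(d)
print(f"d = {d:.2f}, approximate n per group = {n}")
```

With the pooled SD the difference is about 1.41 SD, and the approximation asks for only about 8 subjects per group.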


=== from John's post

Fictitious data on cancer survival rates in mean days:

 

                N     Mean days    sd
Treatment A     25    186          11
Treatment B     25    203          13

=== end

I do remember, once upon a time, seeing tiny, unrealistic SDs in a
paper submitted for publication.  Looking really closely at the text,
I found out that they were Standard Errors. Multiplied by the square
root of N, they became plausible as SDs.
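
The conversion is simply SD = SE x sqrt(N). As a purely hypothetical illustration in Python, suppose the 11 and 13 in John's fictitious table had really been standard errors (nothing in the thread says they were); the helper name `sd_from_se` is made up for the example.

```python
import math


def sd_from_se(se, n):
    """Recover the standard deviation from a standard error of the mean."""
    return se * math.sqrt(n)


# Hypothetical reading of the table: treat 11 and 13 as SEs with N = 25 each.
sd_a = sd_from_se(11, 25)
sd_b = sd_from_se(13, 25)

# Standardised difference under that reading, using the pooled SD.
pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
d_if_se = (203 - 186) / pooled_sd
print(sd_a, sd_b, round(d_if_se, 2))
```

Under that reading the SDs would be 55 and 65 days, and the same 17-day difference would shrink to about 0.28 SD, which is a very different picture.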

 --
Rich Ulrich
