SPSSX Discussion - Re: Power, estimated sample size and conundrum!

Re: Power, estimated sample size and conundrum!

Posted by Bruce Weaver on Jul 25, 2014; 7:34pm
URL: http://spssx-discussion.165.s1.nabble.com/Power-estimated-sample-size-and-conundrum-tp5726815p5726827.html

Here's a simulation I cobbled together. When I plug in 600 per group, I get power of around 80% for each of the methods CROSSTABS offers (as expected, given my sample size estimate obtained via Stata). When I plug in 300 per group, I get power of around 50%.

HTH.

* Simulate power analysis for comparing two
* independent proportions when the true
* proportions are .085 and .045.
* Use macros to set the N per group
* and the number of iterations.

* Sample sizes of 600 per group were computed via
* Stata-- power twoproportions .085 .045 .

******************************** .
DEFINE !NperGrp () 600 !ENDDEFINE.
DEFINE !Iterations () 1000 !ENDDEFINE.
******************************** .

NEW FILE.
DATASET CLOSE all.

SET RNG=MT MTINDEX=RANDOM.

INPUT PROGRAM.
LOOP Table = 1 to !Iterations.
LEAVE Table.
LOOP row = 1 to 2.
LEAVE row.
LOOP col = 1 to 2.
END CASE.
END LOOP.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

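* Generate the data. For each row (group), the col 1 cell
* gets a binomial count out of N per group, and the col 2
* cell gets the remainder (N per group minus that count).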

DO IF row EQ 1.
- IF col EQ 1 kount = RV.BINOM(!NperGrp, .085).
ELSE.
- IF col EQ 1 kount = RV.BINOM(!NperGrp, .045).
END IF.
IF MISSING(kount) kount = !NperGrp - LAG(kount).
EXECUTE.
FORMATS Table kount (F5.0) / row col (F1).

* OMS: send the Crosstabulation tables to dataset Xtab.
DATASET DECLARE Xtab.
OMS
/SELECT TABLES
/IF COMMANDS=['Crosstabs'] SUBTYPES=['Crosstabulation']
/DESTINATION FORMAT=SAV NUMBERED=TableNumber_
OUTFILE='Xtab' VIEWER=NO.
* OMS: send the Chi-Square Tests tables to dataset ChiSq.
DATASET DECLARE ChiSq.
OMS
/SELECT TABLES
/IF COMMANDS=['Crosstabs'] SUBTYPES=['Chi Square Tests']
/DESTINATION FORMAT=SAV NUMBERED=TableNumber_
OUTFILE='ChiSq' VIEWER=NO.

WEIGHT by kount.
CROSSTABS row by col by Table / Stat = chisqr.
OMSEND.

DATASET ACTIVATE Chisq.
SELECT if VAR1 NE "Total".
EXECUTE.
COMPUTE Significant = Asymp.Sig.2sided LE .05.
IF MISSING(Significant) Significant = ExactSig.2sided LE .05.
FORMATS Significant (f1).

* Use MEANS procedure to show the proportion of
* tests that achieve statistical significance.
* Remember, the mean of a 1-0 variable is the
* proportion of cases equal to 1.

MEANS Significant by Var2 /cells = mean count.

* The Means show the power.
* The Linear-by-linear result is equivalent to the N-1 chi-square
* (see http://www.iancampbell.co.uk/twobytwo/twobytwo.htm).
* NOTE: Ignore the Total row.

* When I set N per group = 600, I get power around .80.
* When I set N per group = 300, I get power around .50.
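
The simulated power figures above can be cross-checked analytically. What follows is only a rough sketch in Python (not part of the SPSS job), assuming the usual normal approximation for the two-sided pooled test of two independent proportions with equal group sizes; the function name approx_power is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p1, p2, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided test of two
    independent proportions with equal group sizes.

    Assumption: pooled standard error under H0 and unpooled standard
    error under H1. This is a textbook approximation, not the exact
    calculation any particular package performs.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    delta = abs(p1 - p2)
    # Probability of landing in either rejection region under H1.
    upper = 1 - std_normal.cdf((z_crit * se_null - delta) / se_alt)
    lower = std_normal.cdf((-z_crit * se_null - delta) / se_alt)
    return upper + lower


for n in (300, 600):
    print(n, round(approx_power(0.085, 0.045, n), 3))
```

Under these assumptions the approximation comes out close to .80 at 600 per group and close to .50 at 300 per group, in line with the simulation.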

Dale Glaser wrote

Thank you Bruce....yes, that is exactly what I got with Stata when I initially ran the power analysis two years ago (i.e., n = 596 per group). However, the discussion I've been having is when one runs a z-test for two proportions (whether using the SPSS macro or Stata option) n = 300 per group will suffice to obtain significance for 4.5% vs. 8.5% (z = 1.99, p = .047).

I do understand that power, being based in part on the noncentral distribution, in conjunction with the rigor of the desired power (e.g., .8), may make sample size estimates larger than what one needs to obtain significance via simulation (as I did with the SPSS macro for the z-test of proportions). However, there seems to be a nontrivial difference between n = 300 per group sufficing to give p < .05 (per the z-test) and n = 596 per group per the power analysis (with desired power of .8, a two-tailed test, and alpha = .05).

So though I understand the statistical difference between conducting power and running the actual z-test, I'm having difficulties reconciling the large sample size differences when discussing this with the PI (and making the ultimate recommendation). Any feedback how you all broach the subject re: difference of sample size estimate via power analysis as opposed to simulation obtaining the actual test statistic (and p-value) would be much appreciated.

Thank you....Dale

Dale Glaser, Ph.D.
Principal--Glaser Consulting
Lecturer/Adjunct Faculty--SDSU/USD/Alliant
3115 4th Avenue
San Diego, CA 92103
phone: 619-220-0602
fax: 619-220-0412
email: [hidden email]
website: www.glaserconsult.com
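
The z-test figures quoted in the message above (z = 1.99, p = .047 at n = 300 per group) can be reproduced with a few lines of Python. This is a sketch only, assuming the standard pooled two-proportion z-test; the function name two_proportion_z_test is hypothetical and this is not the SPSS macro or Stata's prtesti itself.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(p1_hat, p2_hat, n1, n2):
    """Pooled two-sample z-test for proportions (two-sided).

    Takes observed proportions rather than counts, as an immediate
    (summary-statistics) test would. Returns (z, two-sided p-value).
    """
    pooled = (p1_hat * n1 + p2_hat * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Observed proportions set exactly equal to the planning values.
z, p = two_proportion_z_test(0.085, 0.045, 300, 300)
print(round(z, 2), round(p, 3))
```

Note what this calculation assumes: the sample proportions come out exactly at .085 and .045. In a real sample they vary around those values, so a result that is only just significant at the expected proportions will be significant in roughly half of all samples, which matches the power of around 50% seen at 300 per group.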

________________________________
From: Bruce Weaver <[hidden email]>
To: [hidden email]
Sent: Thursday, July 24, 2014 6:38 PM
Subject: Re: Power, estimated sample size and conundrum!

Speaking of Stata, here's what I get for your original question (i.e., sample
size for proportions of .085 & .045):

. power twoproportions .085 .045

Performing iteration ...

Estimated sample sizes for a two-sample proportions test
Pearson's chi-squared test
Ho: p2 = p1 versus Ha: p2 != p1

Study parameters:

alpha = 0.0500
power = 0.8000
delta = -0.0400 (difference)
p1 = 0.0850
p2 = 0.0450

Estimated sample sizes:

N = 1192
N per group = 596

I found this site helpful in working out how to do that:

http://www.stata-press.com/manuals/power-sample-size-reference-manual/
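
For anyone without Stata, the per-group figure can be approximated by hand. The sketch below uses the common closed-form sample size formula for comparing two proportions (pooled variance under H0, unpooled under H1, no continuity correction). That formula is an assumption on my part: the Stata output reports an iterative solution, so agreement is not guaranteed in general. The function name n_per_group is made up for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Closed-form approximate sample size per group for a two-sided
    test of two independent proportions, equal allocation, without
    continuity correction."""
    z = NormalDist().inv_cdf
    p_bar = (p1 + p2) / 2
    root_null = sqrt(2 * p_bar * (1 - p_bar))
    root_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    numerator = (z(1 - alpha / 2) * root_null + z(power) * root_alt) ** 2
    # Round up to the next whole subject.
    return ceil(numerator / (p1 - p2) ** 2)


print(n_per_group(0.085, 0.045))
```

With .085 vs. .045, alpha = .05 and power = .80, this works out to 596 per group, the same as the Stata result.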

HTH.

Dale Glaser wrote

> Greetings all....I have a question/situation that on the surface seems very transparent but I am having difficulties navigating. So any feedback will be much appreciated.
>
> For a study two years ago I conducted a power analysis comparing two proportions (p1 = .085 vs. p2 = .045). Whether I used G*Power, Stata, or Power and Precision they all gave me sample size estimates ranging from 600 to 640 (contingent on whether continuity correction was incorporated) per group for alpha = .05 and power of .80.
>
> However, when I ran the SPSS macro for z test of proportions for two groups comparing .045 vs. .085, I found that n = 300 per group was sufficient to obtain significance: z = -1.99, p = .047. I also confirmed this in Stata using the following syntax: prtesti 300 .045 300 .085.
>
> Hence, can I assume that the difference (i.e., n = 600 per group per power analysis vs. n = 300 per group being sufficient to obtain significance) is a function of the desired power insofar as, with the conventional power of .80, you are setting the bar high enough so as to find the optimal nexus of Type I and Type II error?
>
> The conundrum is such that our study ended up with p1 = .027 vs. p2 = .047 and p = .194 with the recommended sample size being n = 300 per group given the simulation I ran in SPSS and Stata. Note that post hoc power analysis indicates I would have needed n = 1496 per group (alpha = .05, power = .80) to obtain significance for delta of 2%, though when I run .027 vs. .047 in Stata and SPSS with n = 685 per group z = 1.96, p = .05.
>
> Anyway, this has become an interesting/challenging discussion with PI and reviewers alike. We went with the sample size (n = 300 per group) since that was sufficient for obtaining significance, but they are indicating we should have gone with the larger sample size based on the power estimate (i.e., n = 600 per group).
>
> Has anyone encountered such a dilemma and how did you deal with it?
>
> Thank you….Dale
>
> Dale Glaser, Ph.D.
> Principal--Glaser Consulting
> Lecturer/Adjunct Faculty--SDSU/USD/Alliant
> 3115 4th Avenue
> San Diego, CA 92103
> phone: 619-220-0602
> fax: 619-220-0412
> email: glaserconsult@
> website: www.glaserconsult.com
>
> ________________________________
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> LISTSERV@.UGA (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD

-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Power-estimated-sample-size-and-conundrum-tp5726815p5726821.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).