Posted by Swank, Paul R on Mar 07, 2007; 5:11pm
URL: http://spssx-discussion.165.s1.nabble.com/Ratio-of-Cases-to-Regression-Variables-tp1074357p1074358.html
Such "ratios" are generally not very helpful. The issue is power, which
depends on the effect size as well as the sample size. So the number of
subjects per variable needed will depend on the R squared for the full
model: the lower it is, the more subjects per variable will be needed.
The effect size in regression, f squared, is defined as the change in R
squared attributable to a variable divided by 1 minus the R squared for
the full model. In the table below, the change in R squared is set at
.075. The R squareds for the full models range from .10 to .50. This
leads to f squareds that range from small/medium to medium (Cohen,
1988). The n's range from 40 to 120. With 4 IVs, this is 10 to 30
subjects per IV. As you can see, 10 subjects per IV is not enough for
any full model R squared in the table. However, an R squared for the
full model of .7 would give an f squared of .25 and a power of .84. With
20 subjects per IV, a full model R squared of .3 is sufficient to give a
power > .8, and with 30 per IV, the power is about .87 even if the full
model R squared is only .10, and > .9 once it reaches .20.
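As a rough illustration of the arithmetic described above, here is a minimal Python sketch that computes f squared and the noncentrality parameter lambda for the same design (4 IVs, R squared change of .075). The function names are made up for this example, and the lambda formula, f squared times (u + v + 1), is an assumption inferred from the table values; it matches the form given in Cohen (1988).

```python
# Sketch only: f squared and lambda for one predictor's R squared change.
# Function names are illustrative and not taken from any package.

def f_squared(rsq_change, rsq_full):
    """Effect size: R squared change divided by (1 - full-model R squared)."""
    return rsq_change / (1.0 - rsq_full)

def noncentrality(f2, u, v):
    """Assumed form: lambda = f squared * (u + v + 1)."""
    return f2 * (u + v + 1)

n_predictors = 4   # IVs in the full model
u = 1              # numerator df: a single predictor is being tested
rsq_change = 0.075

for n_total in (40, 80, 120):
    v = n_total - n_predictors - 1   # denominator (error) df
    for rsq_full in (0.10, 0.20, 0.30, 0.40, 0.50):
        f2 = f_squared(rsq_change, rsq_full)
        lam = noncentrality(f2, u, v)
        print(f"{n_total:4d} {rsq_full:.2f} {v:4d} {f2:.5f} {lam:8.4f}")
```

The printed f squared and lambda values can be checked against the corresponding columns of the table.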
Power analyses for regression model
Four predictors, RSQ_change = .075

Obs  n_total  alpha  u   df  f_square   lambda    power
  1       40   0.05  1   35   0.00000   0.0000  0.05000
  2       40   0.05  1   35   0.08333   3.0833  0.40057
  3       40   0.05  1   35   0.09375   3.4688  0.44099
  4       40   0.05  1   35   0.10714   3.9643  0.49061
  5       40   0.05  1   35   0.12500   4.6250  0.55230
  6       40   0.05  1   35   0.15000   5.5500  0.62965
  7       80   0.05  1   75   0.08333   6.4167  0.70561
  8       80   0.05  1   75   0.09375   7.2188  0.75562
  9       80   0.05  1   75   0.10714   8.2500  0.80931
 10       80   0.05  1   75   0.12500   9.6250  0.86488
 11       80   0.05  1   75   0.15000  11.5500  0.91845
 12      120   0.05  1  115   0.08333   9.7500  0.87210
 13      120   0.05  1  115   0.09375  10.9688  0.90728
 14      120   0.05  1  115   0.10714  12.5357  0.93954
 15      120   0.05  1  115   0.12500  14.6250  0.96654
 16      120   0.05  1  115   0.15000  17.5500  0.98589
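
For anyone who wants to check the power column without a stats package, here is a hedged Python sketch using only the standard library. It is not how the table was produced; it is a numerical-integration shortcut that only works when the numerator df is 1 (as in every row here) and the error df is greater than 2. It relies on the fact that a noncentral F with 1 and v df can be written as X^2 / (W / v), with X normal with mean sqrt(lambda) and W chi-square with v df. All function names are made up for this example, and lambda = f squared * (v + 2) is the same assumed form that reproduces the lambda column.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF

def _chi2_pdf(w, v):
    """Chi-square density with v df (sketch assumes v > 2, so pdf(0) = 0)."""
    if w <= 0.0:
        return 0.0
    half = v / 2.0
    return math.exp((half - 1.0) * math.log(w) - w / 2.0
                    - half * math.log(2.0) - math.lgamma(half))

def upper_tail_f1(c, v, lam, steps=2000):
    """P(F > c) for a noncentral F with 1 and v df, noncentrality lam.

    Integrates P(|X| > sqrt(c * w / v)) over the chi-square density of W
    using Simpson's rule (steps must be even).
    """
    delta = math.sqrt(lam)
    upper = v + 15.0 * math.sqrt(2.0 * v)  # far into the chi-square tail
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        w = i * h
        a = math.sqrt(c * w / v)
        tail = 1.0 - _PHI(a - delta) + _PHI(-a - delta)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * tail * _chi2_pdf(w, v)
    return total * h / 3.0

def power_one_df(f2, v, alpha=0.05):
    """Power of the 1-df F test for effect size f2 with v error df."""
    lo, hi = 0.0, 100.0
    for _ in range(50):  # bisection for the central-F critical value
        mid = (lo + hi) / 2.0
        if upper_tail_f1(mid, v, 0.0) > alpha:
            lo = mid
        else:
            hi = mid
    crit = (lo + hi) / 2.0
    lam = f2 * (1 + v + 1)  # assumed form: f squared * (u + v + 1), u = 1
    return upper_tail_f1(crit, v, lam)

# Example: 20 subjects per IV (n = 80, df = 75), full-model R squared of .3
print(round(power_one_df(0.075 / (1 - 0.30), 75), 3))
```

The result for each row can be compared against the power column of the table; small differences in the last digits are expected because this is a numerical approximation.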
Paul R. Swank, Ph.D.
Professor, Developmental Pediatrics
Director of Research,
University of Texas Health Science Center at Houston
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Will Bailey [Statman]
Sent: Wednesday, March 07, 2007 10:14 AM
To: [hidden email]
Subject: Ratio of Cases to Regression Variables
I know this has come up, but I can't find the reference or consensus
statement:
There is a "generally accepted" ratio of cases to the number of IVs in a
regression; I believe it was something like 30 to 1, but I'm not sure.
Does anyone recall, or can anyone offer insight?
Tks,
W