Re: Ratio of Cases to Regression Variables

Posted by Swank, Paul R on Mar 07, 2007; 5:11pm
URL: http://spssx-discussion.165.s1.nabble.com/Ratio-of-Cases-to-Regression-Variables-tp1074357p1074358.html

Such "ratios" are generally not very helpful. The issue is power, and
that depends on the effect size as well as the sample size. So the
number of subjects needed per variable depends on the R squared for the
full model: for a given change in R squared, the lower it is, the more
subjects per variable are needed. The effect size in regression, f
squared, is defined as the change in R squared attributable to a
variable divided by 1 minus the R squared for the full model. In the
table below, the change in R squared is set at .075. The R squareds for
the full models range from .10 to .50 (the first row, with f squared =
0, is the null case, where power equals alpha). This leads to f squareds
that range from small/medium to medium (Cohen, 1988). The n's range from
40 to 120. With 4 IVs, this is 10 to
30 subjects per IV. As you can see, 10 subjects per IV is not enough
for any full model R squared in the table (power tops out at .63). It
would take an R squared for the full model of .7, which gives an f
squared of .25 and a power of .84. With 20 subjects per IV, a full model
R squared of .3 is sufficient to give a power > .8, and with 30 per IV,
the power is > .85 even if the full model R squared is only .10, rising
above .9 once it reaches .20.

              power analyses for regression model
              four predictors - RSQ_change = .075

 Obs   n_total   alpha    u     df   f_square     lambda      power

   1      40     0.05     1     35   0.00000      0.0000    0.05000
   2      40     0.05     1     35   0.08333      3.0833    0.40057
   3      40     0.05     1     35   0.09375      3.4688    0.44099
   4      40     0.05     1     35   0.10714      3.9643    0.49061
   5      40     0.05     1     35   0.12500      4.6250    0.55230
   6      40     0.05     1     35   0.15000      5.5500    0.62965
   7      80     0.05     1     75   0.08333      6.4167    0.70561
   8      80     0.05     1     75   0.09375      7.2188    0.75562
   9      80     0.05     1     75   0.10714      8.2500    0.80931
  10      80     0.05     1     75   0.12500      9.6250    0.86488
  11      80     0.05     1     75   0.15000     11.5500    0.91845
  12     120     0.05     1    115   0.08333      9.7500    0.87210
  13     120     0.05     1    115   0.09375     10.9688    0.90728
  14     120     0.05     1    115   0.10714     12.5357    0.93954
  15     120     0.05     1    115   0.12500     14.6250    0.96654
  16     120     0.05     1    115   0.15000     17.5500    0.98589
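
For anyone who wants to rerun the table with their own numbers, here is a
rough Python sketch. It is not the program that produced the output
above. It assumes lambda = f_square * (u + df + 1), which is consistent
with the lambda column, and it gets power from the noncentral F
distribution written as a Poisson mixture of incomplete beta functions,
using only the standard library. The names regression_power, _betainc
and _betacf are made up for this illustration.

```python
from math import exp, lgamma, log


def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step, then odd step, of the continued fraction.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def regression_power(n_total, n_predictors, rsq_change, rsq_full,
                     u=1, alpha=0.05):
    """Return (f_square, lambda, power) for the F test of an R-squared change."""
    df = n_total - n_predictors - 1            # error df of the full model
    f_square = rsq_change / (1.0 - rsq_full)   # Cohen's f squared
    lam = f_square * (u + df + 1)              # assumed noncentrality formula
    a, b = u / 2.0, df / 2.0

    # Critical value on the beta scale y = u*F / (u*F + df), by bisection.
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if _betainc(a, b, mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    y_crit = (lo + hi) / 2.0

    # Noncentral F CDF at the critical value: Poisson(lam/2) mixture of
    # central beta CDFs with the first shape parameter shifted by j.
    cdf, weight, j = 0.0, exp(-lam / 2.0), 0
    while j < 1000:
        cdf += weight * _betainc(a + j, b, y_crit)
        j += 1
        weight *= (lam / 2.0) / j
        if j > lam / 2.0 and weight < 1e-16:
            break
    return f_square, lam, 1.0 - cdf


if __name__ == "__main__":
    # Same grid as the table: 4 predictors, R-squared change of .075.
    for n in (40, 80, 120):
        for rsq_full in (0.10, 0.20, 0.30, 0.40, 0.50):
            f2, lam, power = regression_power(n, 4, 0.075, rsq_full)
            print(f"{n:4d}  {rsq_full:.2f}  {f2:.5f}  {lam:8.4f}  {power:.5f}")
```

If the sketch is right, the printed f_square, lambda and power values
should agree closely with the table. Changing rsq_change, n_predictors
or the n grid shows how quickly any fixed "subjects per IV" rule breaks
down as the effect size moves.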


Paul R. Swank, Ph.D.
Professor, Developmental Pediatrics
Director of Research,


University of Texas Health Science Center at Houston

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Will Bailey [Statman]
Sent: Wednesday, March 07, 2007 10:14 AM
To: [hidden email]
Subject: Ratio of Cases to Regression Variables

I know this has come up but can't find the reference or consensus
statement:

There is a "generally accepted" ratio of cases to the number of IVs in a
regression; I believe it was something like 30 to 1, but I'm not sure.

Does anyone recall it, or can anyone offer insight?

Tks,
W