Running a Stepwise Regression Model


Donna Daniels
I am having difficulty running a stepwise regression model in my SPSS
14 package.  My dissertation research has seven independent variables
that could be related to each other, causing multicollinearity issues.
To avoid multicollinearity, I chose the stepwise model.  Could someone
send me the command I need to use to make this work?

Thank You,

Donna Daniels
Doctoral Candidate


Re: Running a Stepwise Regression Model

SR Millis-3
Donna,

I would suggest that you avoid using stepwise variable
selection.  The stepwise method has severe problems in the
presence of collinearity: the degree of correlation
between the predictor variables affects the frequency
with which authentic predictor variables find their
way into the final model.

To assess collinearity in SPSS linear regression,
request "Collinearity diagnostics" in the Statistics
dialog box.  Then examine the condition indices and see
whether any exceed 30.  For dimensions with condition
indices above 30, examine the variance-decomposition
proportions and look for any that exceed .50.  This will
allow you to identify the variables that have high
collinearity.
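
In syntax, the same diagnostics can be requested with the COLLIN and
TOL keywords on the /STATISTICS subcommand of REGRESSION.  A minimal
sketch, assuming a dependent variable named dv and seven predictors
named iv1 to iv7 (placeholder names; substitute your own):

* Forced-entry regression with collinearity diagnostics.
* COLLIN prints condition indices and variance proportions; TOL prints tolerance and VIF.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL
  /DEPENDENT dv
  /METHOD=ENTER iv1 iv2 iv3 iv4 iv5 iv6 iv7.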

But there are several additional reasons to avoid
stepwise variable selection:

--It yields R-squared values that are badly biased
high.

--The F and chi-squared tests do not have the claimed
distribution.

--The method yields confidence intervals for effects
and predicted values that are falsely narrow (Altman &
Andersen, 1989).

--It yields P-values that do not have the proper
meaning, and the proper correction for them is a
difficult problem.

--It gives biased regression coefficients that need
shrinkage: the coefficients for remaining variables
are too large (Tibshirani, 1996).

--It is based on methods (e.g., F tests for nested
models) that were intended to be used to test
prespecified hypotheses.

--Increasing the sample size doesn’t help very much
(Derksen & Keselman, 1992).

--The number of candidate predictor variables affects
the number of noise variables that gain entry to the
model.

Rather than using stepwise variable selection, you
need to do the thinking in model development yourself
and not let the computer do the thinking for you:

--Theory and knowledge of the content area should
first guide you in the initial selection of the
variables.

--Select variables that have wide score distributions.
Variables with narrow ranges will have limited
variance and an attenuated capacity to detect
differences or associations.

--Consider eliminating variables that have, or are
likely to have, high levels of missing data.

--Leave statistically nonsignificant predictor
variables in the model.  Taking out the nonsignificant
predictors and then refitting the model with only
the significant predictors produces a biased model.
Harrell (2001) noted that "Leaving insignificant
predictors in the model increases the likelihood that
the confidence interval for the effect of interest has
the stated coverage" (p. 82).

--More advanced methods like Bayesian model averaging
can be helpful in model fitting and development, but I
don't think that BMA can be done in SPSS. I use R to
do BMA.

Scott Millis

References

Altman, D. G., & Andersen, P. K. (1989). Bootstrap
investigation of the stability of a Cox regression
model. Statistics in Medicine, 8, 771-783.

Copas, J. B. (1983). Regression, prediction and
shrinkage (with discussion). Journal of the Royal
Statistical Society, Series B, 45, 311-354.

Derksen, S., & Keselman, H. J. (1992). Backward, forward
and stepwise automated subset selection algorithms:
Frequency of obtaining authentic and noise variables.
British Journal of Mathematical and Statistical
Psychology, 45, 265-282.

Hurvich, C. M., & Tsai, C. L. (1990). The impact of model
selection on inference in linear regression. American
Statistician, 44, 214-217.

Roecker, E. B. (1991). Prediction error and its
estimation for subset-selected models. Technometrics,
33, 459-468.

Tibshirani, R. (1996). Regression shrinkage and
selection via the lasso. Journal of the Royal
Statistical Society, Series B, 58, 267-288.





Scott R Millis, PhD, MEd, ABPP (CN,CL,RP), CStat
Professor & Director of Research
Dept of Physical Medicine & Rehabilitation
Wayne State University School of Medicine
261 Mack Blvd
Detroit, MI 48201
Email:  [hidden email]
Tel: 313-993-8085
Fax: 313-966-7682


Re: Running a Stepwise Regression Model

Art Kendall-2
In reply to this post by Donna Daniels
A stepwise regression is not usually a good idea.
Some ideas to consider:
A. Try using theory to reduce the number of predictors.

B. Use some form of factor analysis (PCA or PAF) to get a smaller
number of predictors (a FACTOR syntax sketch follows after this list).

C. Consider a stepped/hierarchical approach, as sketched below.
For each variable, show:
1) the zero-order correlation with the DV, i.e., one predictor at a
time (maximum fit);
2) the b and beta when all 7 variables are in the equation, i.e., 7
predictors (fit when all are in at the same time);
3) the change in fit and in b and beta when 6 variables are ENTERed in
one step and the 7th is ENTERed on a second step (its unique
contribution to fit).
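
In REGRESSION syntax, steps 2 and 3 of this outline can be run as two
blocks with the CHANGE statistic requested.  A minimal sketch, assuming
placeholder names dv and iv1 to iv7:

* Blocked (hierarchical) entry: six predictors first, the seventh on a second step.
* CHANGE reports the R-square change when iv7 is added.
* ZPP adds the zero-order, part, and partial correlations used in step 1 above.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA CHANGE ZPP
  /DEPENDENT dv
  /METHOD=ENTER iv1 iv2 iv3 iv4 iv5 iv6
  /METHOD=ENTER iv7.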

For syntax, case studies, and tutorials, see the Help menu.
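
For option B above, a minimal sketch of a principal-components run on
the seven predictors that saves regression factor scores for later use
as predictors (placeholder variable names again; substitute
/EXTRACTION PAF for principal axis factoring):

* Principal components on the seven predictors (eigenvalue > 1 retained).
* Regression factor scores are saved as new variables (FAC1_1, FAC2_1, and so on).
FACTOR
  /VARIABLES iv1 iv2 iv3 iv4 iv5 iv6 iv7
  /MISSING LISTWISE
  /PRINT INITIAL EXTRACTION ROTATION
  /CRITERIA MINEIGEN(1)
  /EXTRACTION PC
  /ROTATION VARIMAX
  /SAVE REG(ALL).

The saved factor scores can then replace the raw predictors on the
/METHOD=ENTER line of a REGRESSION command.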

Art Kendall
Social Research Consultants

