An objection to the screening approach below is that there
might exist suppressor variables. These have little or no bivariate
association with the dependent variable, and will therefore be excluded
by the screening procedure. However, if included alongside the other
predictors, they improve prediction and sharpen the estimated effects
of the variables that are directly related to the target. There is a
literature on this; back issues of The American Statistician have
some good expository articles.
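
To make the point concrete, here is a minimal simulated sketch of a suppressor. Everything in it is an assumption made for illustration: the data are synthetic, and the names (signal, nuisance, x1, x2, r_squared) are hypothetical and do not come from any of the articles mentioned. x2 is built to be unrelated to y on its own, so a univariate screen would drop it, yet it removes the irrelevant part of x1 when both are in the model.

```python
import numpy as np

# Synthetic data, for illustration only (all names are hypothetical).
rng = np.random.default_rng(0)
n = 2000
signal = rng.normal(size=n)    # the part of x1 that actually drives y
nuisance = rng.normal(size=n)  # irrelevant variation that contaminates x1

x1 = signal + nuisance                  # predictor with a direct relation to y
x2 = nuisance                           # suppressor: unrelated to y by construction
y = signal + 0.5 * rng.normal(size=n)   # response


def r_squared(predictors, target):
    """R-squared of an ordinary least squares fit with an intercept."""
    X = np.column_stack([np.ones(len(target))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return 1.0 - resid.var() / target.var()


# Univariate screening step: correlation of each candidate with y.
r_x1 = np.corrcoef(x1, y)[0, 1]
r_x2 = np.corrcoef(x2, y)[0, 1]

# Model fit with and without the suppressor.
r2_x1_only = r_squared([x1], y)
r2_both = r_squared([x1, x2], y)

print(f"corr(x1, y) = {r_x1:.3f}, corr(x2, y) = {r_x2:.3f}")
print(f"R^2 with x1 only = {r2_x1_only:.3f}, with x1 and x2 = {r2_both:.3f}")
```

In this setup the screen sees a correlation near zero for x2 and would discard it, while the two-predictor model fits noticeably better than x1 alone. Real data will rarely be this clean, so treat it as a sketch of the mechanism rather than a typical effect size.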
Tony Babinec
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Antoon Smulders
Sent: Tuesday, February 01, 2011 3:12 AM
To: [hidden email]
Subject: overfitting in explorative studies
Hello list,
I quote Michael A. Babyak: “One very common way of selecting variables for regression models is to look at the univariate relation between each variable and the response, and then to cull only those variables significant for entry into the subsequent regression analysis”. Babyak objects to such a procedure, but his article mainly seems to address hypothesis-testing studies.
However, I would like to ask the experts on this list whether it is acceptable and/or useful to do this in an exploratory context. Your opinions are welcome.
Antoon Smulders