One objection to the screening approach described below is that
suppressor variables might exist. A suppressor has no direct
association with the dependent variable, so the screening procedure
will exclude it. If included, however, it leads to improved
prediction and a sharper understanding of those variables that do
have direct effects on the target. There is a literature on
suppressor effects; back issues of The American Statistician have
some good expository articles.
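
A minimal simulated sketch of a suppressor effect, in Python with NumPy. The data-generating setup and every name in it (signal, nuisance, x1, x2, r_squared) are assumptions made up for illustration, not taken from any of the articles mentioned. Here x2 is unrelated to y on its own, so a univariate screen would drop it, yet it removes the nuisance part of x1 when both are in the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical setup: y depends only on `signal`.
signal = rng.normal(size=n)
nuisance = rng.normal(size=n)
y = signal + rng.normal(scale=0.5, size=n)

x1 = signal + nuisance  # predictor contaminated by nuisance variance
x2 = nuisance           # suppressor: unrelated to y, related to x1


def r_squared(predictors, response):
    """In-sample R^2 of an ordinary least squares fit with an intercept."""
    X = np.column_stack([np.ones(len(response))] + list(predictors))
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    resid = response - X @ coef
    return 1.0 - resid.var() / response.var()


r_y_x2 = np.corrcoef(y, x2)[0, 1]   # what a univariate screen would look at
r2_x1_only = r_squared([x1], y)     # model after x2 has been screened out
r2_both = r_squared([x1, x2], y)    # model that keeps the suppressor

print(f"corr(y, x2)       = {r_y_x2:.3f}")
print(f"R^2 with x1 only  = {r2_x1_only:.3f}")
print(f"R^2 with x1 + x2  = {r2_both:.3f}")
```

Under these assumptions the correlation between y and x2 should be close to zero, while the model with both predictors should fit noticeably better than the one with x1 alone.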
Tony Babinec
tbabinec@sbcglobal.net
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Antoon Smulders
Sent: Tuesday, February 01, 2011 3:12 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: overfitting in explorative studies
Hello list,
I cite from Michael A. Babyak
(http://www.psychosomaticmedicine.org/cgi/reprint/66/3/411): "One
very common way of selecting variables for regression models is to look at
the univariate relation between each variable and the response, and then to
cull only those variables significant for entry into the subsequent
regression analysis". Babyak objects to such a procedure, but his article
mainly seems to address hypothesis-testing studies.
However, I would like to ask the experts on this list whether it is
acceptable and/or useful to do this in an exploratory context. Your
opinions are welcome.
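
To make the procedure in question concrete, here is a minimal sketch in Python with NumPy of univariate screening followed by a regression on the survivors. Everything in it is an assumption for illustration: the data are pure noise, the sample sizes and names (T_CRIT, selected, design, r_squared) are invented, and the critical value is an approximation. It is not Babyak's own analysis and does not settle whether the procedure is acceptable in exploratory work; it only shows what the screen does when there is nothing to find.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_holdout, n_predictors = 100, 1000, 200

# Pure noise: by construction no predictor is related to the response.
X_train = rng.normal(size=(n_train, n_predictors))
y_train = rng.normal(size=n_train)
X_holdout = rng.normal(size=(n_holdout, n_predictors))
y_holdout = rng.normal(size=n_holdout)

# Univariate screen: Pearson r of each column with y, as a t statistic.
Xc = X_train - X_train.mean(axis=0)
yc = y_train - y_train.mean()
r = (Xc.T @ yc) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
t = r * np.sqrt((n_train - 2) / (1.0 - r ** 2))
T_CRIT = 1.984  # approximate two-sided 5% critical value for 98 df
selected = np.flatnonzero(np.abs(t) > T_CRIT)


def design(X):
    """Intercept plus only the columns that survived the screen."""
    return np.column_stack([np.ones(X.shape[0]), X[:, selected]])


# Subsequent regression on the screened predictors only.
coef, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)


def r_squared(X, y):
    """R^2 of the fitted model on the given data."""
    resid = y - design(X) @ coef
    return 1.0 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()


r2_train = r_squared(X_train, y_train)
r2_holdout = r_squared(X_holdout, y_holdout)

print(f"predictors passing the screen: {selected.size} of {n_predictors}")
print(f"R^2 on the screening sample:   {r2_train:.3f}")
print(f"R^2 on fresh data:             {r2_holdout:.3f}")
```

In this sketch some noise predictors should pass the screen by chance, and the fit on the screening sample should look better than the fit on fresh data.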
Antoon Smulders