overfitting in explorative studies

overfitting in explorative studies

Antoon Smulders
Hello list,
 
From Michael A. Babyak I cite: “One very common way of selecting variables for regression models is to look at the univariate relation between each variable and the response, and then to cull only those variables significant for entry into the subsequent regression analysis”. Babyak objects to such a procedure, but his article mainly seems to address hypothesis testing studies.
However, I would like to ask the experts on this list whether it is acceptable and/or useful to do this in an exploratory context. Your opinions are welcome.
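
To make the screening procedure concrete, here is a minimal sketch in Python (standard library only). It is not taken from Babyak's article. It assumes a hypothetical data set of 100 cases and 200 candidate predictors that are pure noise, and it uses an approximate two-sided 5% critical t value for 98 degrees of freedom. Even though no predictor is related to the response, the univariate screen passes some of them on to the next step.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)

random.seed(2011)          # arbitrary seed, only for reproducibility
n_cases = 100              # hypothetical sample size
n_predictors = 200         # hypothetical number of candidate predictors

# Approximate two-sided 5% critical t for df = n_cases - 2 = 98 (assumed value),
# converted to the equivalent critical correlation.
T_CRIT = 1.984
r_crit = T_CRIT / math.sqrt(n_cases - 2 + T_CRIT ** 2)

# The response and every predictor are independent noise.
y = [random.gauss(0, 1) for _ in range(n_cases)]

selected = []
for j in range(n_predictors):
    x = [random.gauss(0, 1) for _ in range(n_cases)]
    r = pearson(x, y)
    if abs(r) > r_crit:    # "significant" in the univariate screen
        selected.append((j, r))

print(f"critical |r| = {r_crit:.3f}")
print(f"{len(selected)} of {n_predictors} noise predictors passed the screen")
```

Under these assumptions roughly 5% of the noise predictors are expected to pass, and a regression fitted only to the survivors will tend to look better than it would in new data.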
 
Antoon Smulders
 

Re: overfitting in explorative studies

Anthony Babinec

 

An objection to the screening approach below is that there might exist suppressor variables. These have no direct association with the dependent variable, and will therefore be excluded by the screening procedure. However, if included, they lead to improved prediction and sharper understanding of those variables that have direct effects on the target. There is a literature on this. Back issues of The American Statistician have some good expository articles.
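
A small simulation can illustrate the point. The sketch below (Python, standard library only) is one assumed setup for classical suppression, not an example from the articles mentioned: x1 is a signal contaminated by noise, x2 is that noise alone, and y depends only on the signal. The variable names and variances are hypothetical. The two-predictor R squared is computed from the three pairwise correlations with the standard formula.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)

random.seed(1)             # arbitrary seed, only for reproducibility
n = 5000                   # hypothetical sample size

signal = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]

y = [s + random.gauss(0, 0.5) for s in signal]   # response depends on the signal only
x1 = [s + e for s, e in zip(signal, noise)]      # signal contaminated by noise
x2 = noise                                       # suppressor: the contamination itself

r_y1 = pearson(y, x1)
r_y2 = pearson(y, x2)      # close to zero, so a univariate screen would drop x2
r_12 = pearson(x1, x2)

# R squared with x1 alone, and with x1 and x2 together
# (two-predictor formula based on the pairwise correlations).
r2_x1_only = r_y1 ** 2
r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

print(f"r(y, x2)           = {r_y2:.3f}")
print(f"R^2 with x1 only   = {r2_x1_only:.3f}")
print(f"R^2 with x1 and x2 = {r2_both:.3f}")
```

In this setup x2 is essentially uncorrelated with y, yet adding it removes the noise component from x1 and raises R squared substantially.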

 

Tony Babinec

[hidden email]

 



Re: overfitting in explorative studies

Art Kendall
In reply to this post by Antoon Smulders
Offhand I cannot think of a situation where that would be advisable.  As to what else to do, a lot depends on the nature of the variables and the nature of the phenomena under consideration.  For example, in some subject matter areas, such as when the variables are attitude items, it is more conventional to explore whether variables can be used as items in a summative score.
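
As a rough illustration of the summative-score idea, the sketch below (Python, standard library only) sums a set of items per respondent and computes Cronbach's alpha as one common internal-consistency check. The choice of alpha and the small response matrix are assumptions made for the example, not something specified in this thread.

```python
from statistics import variance

# Hypothetical responses: one row per respondent, one column per attitude item.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
]

# Summative score: the sum of the items for each respondent.
totals = [sum(row) for row in responses]

# Cronbach's alpha: k / (k - 1) * (1 - sum of item variances / variance of totals).
k = len(responses[0])
item_variances = [variance(column) for column in zip(*responses)]
alpha = k / (k - 1) * (1 - sum(item_variances) / variance(totals))

print("summative scores:", totals)
print(f"Cronbach's alpha: {alpha:.3f}")
```

If the items hang together well, the summative score can enter the regression as a single predictor instead of several separately screened items.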

It might be possible for members of this list to make more specific suggestions if you were to describe the situation.
What is the substantive nature of the research?
What is the nature of a case?  How were they selected?
How are your independent and dependent variables measured?
How and why was the data gathered?
What levels of measurement are there?
Are there subsets of the independent variables that might be grouped with regard to semantics (attitudes, values, symptoms, etc.)?
Are there substantively different subsets of cases?
Where does the current effort stand in the research agenda?
How complex might plausible models be?  Is the existence of suppressors, moderators, interactions plausible?


Art Kendall
Social Research Consultants




Re: overfitting in explorative studies

Bruce Weaver
Administrator
In reply to this post by Anthony Babinec
Good point about suppression.  Dave Howell has some notes that the OP might find useful.  Scroll down past the sections on moderation and mediation.

   http://www.uvm.edu/~dhowell/gradstat/psych341/lectures/MultipleRegression/multreg3.html

HTH.


Anthony Babinec wrote

An objection to the screening approach below is that there might exist suppressor variables. These have no direct association with the dependent variable, and will therefore be excluded by the screening procedure. However, if included, they lead to improved prediction and sharper understanding of those variables that have direct effects on the target. There is a literature on this. Back issues of The American Statistician have some good expository articles.

Tony Babinec
tbabinec@sbcglobal.net

From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Antoon Smulders
Sent: Tuesday, February 01, 2011 3:12 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: overfitting in explorative studies

Hello list,

From Michael A. Babyak <http://www.psychosomaticmedicine.org/cgi/reprint/66/3/411> I cite: "One very common way of selecting variables for regression models is to look at the univariate relation between each variable and the response, and then to cull only those variables significant for entry into the subsequent regression analysis". Babyak objects to such a procedure, but his article mainly seems to address hypothesis testing studies.

However I would like to ask the experts on this list, if it is allowed and/or useful to do this in an explorative context. Your opinions are welcome.

Antoon Smulders


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).