Multicollinearity

9 messages

Multicollinearity

Jims More
I need your expert opinion.
 
I am building a predictive model.  My DV is dichotomous (enrolled/not enrolled), while my IVs are categorical and continuous.  The categorical IVs have between 2 and 10 categories, and there are 8 continuous variables.  When I modeled the DV with only the 8 continuous variables, the results showed multicollinearity: many B coefficients are negative, tolerances are below .10, VIFs are greater than 50, and the condition index is more than 100.  What is your advice?
 
Jims
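[For readers who want to reproduce these diagnostics outside SPSS, here is a minimal sketch in Python/numpy. The toy data and the helper name `vif` are my own illustrative choices, not from the thread; in SPSS these numbers come from REGRESSION's collinearity diagnostics.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Two nearly collinear predictors plus one independent one (toy data).
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # almost a copy of x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X):
    """Tolerance/VIF: regress each predictor on all the others."""
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))        # VIF = 1 / tolerance
    return np.array(out)

print("VIF:", vif(X))                       # x1, x2 huge; x3 near 1

# Condition indices: sqrt(largest eigenvalue / each eigenvalue) of the
# cross-product matrix after scaling columns to unit length.
Z = X / np.sqrt((X ** 2).sum(axis=0))
eig = np.linalg.eigvalsh(Z.T @ Z)
print("condition indices:", np.sqrt(eig.max() / eig))
```

Note that Belsley-style condition indices are usually computed with the constant column included; the sketch omits it for brevity.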



Re: Multicollinearity

Michael Pearmain-2
You should attempt to adjust for multicollinearity, since your regression coefficients are unstable and you typically interpret these coefficients. Some approaches:
 
  • Increase the sample size - this can be useful if there are many variables and relatively few cases. (In practice this is rarely feasible.)
  • Combine multiple indicators in some fashion. (How? By averaging, or with optimal weights from factor or principal components analysis.)
  • Exclude redundant variables (in other words, use one as a proxy for the rest).
  • Estimate multiple-indicator models using specialty programs (SPSS's AMOS, for example).
  • Use prior knowledge to specify the value of a coefficient, or the ratio of two coefficients.
  • Use principal components regression - run principal components analysis (in FACTOR) to reduce your highly correlated variables to a few "component" variables, and use these in the regression analysis.
  • Use a biased estimation technique, such as ridge regression (an SPSS macro that runs ridge regression via matrix transformations ships with SPSS).
HtH

Mike
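[The principal-components-regression bullet above can be sketched outside SPSS. A minimal Python/numpy illustration with toy collinear data; the 1%-of-variance cutoff for dropping components is an arbitrary choice of mine, not an SPSS default.]

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)      # highly correlated with x1
X = np.column_stack([x1, x2])
y = 2 * x1 + rng.normal(size=n)

# 1. Standardize, then get principal components via SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = Z @ Vt.T                              # component scores

# 2. Keep only components carrying non-negligible variance.
keep = s ** 2 / n > 0.01
T = scores[:, keep]

# 3. Regress y on the retained components -- these are orthogonal,
#    so the coefficients are stable.
A = np.column_stack([np.ones(n), T])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print("retained components:", keep.sum(), "coefficients:", coef)
```

Here the two collinear predictors collapse into one retained component, which then carries essentially all the predictive information.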






--
Michael Pearmain
Senior Analytics Research Specialist

Statistician: A man who believes figures don't lie, but admits that under analysis some of them won't stand up either.


Google UK Ltd
Belgrave House
76 Buckingham Palace Road
London SW1W 9TQ
United Kingdom
t +44 (0) 2032191684  
[hidden email]

If you received this communication by mistake, please don't forward it to anyone else (it may contain confidential or privileged information), please erase all copies of it, including all attachments, and please let the sender know it went to the wrong person. Thanks.


Re: Multicollinearity

Jims More
In reply to this post by Jims More
Thank you, Mike.
 
It would be easier to handle if I were only interested in whether or not the IVs are statistically significant.  It is more difficult, however, because I will use the model to predict values of the DV for new sets of IVs in the future.
 
Jims


Re: Multicollinearity

Kooij, A.J. van der
In reply to this post by Jims More

If you have the SPSS Categories add-on module, you could use CATREG (Analyze menu > Regression > Optimal Scaling), a procedure for regression with categorical variables that can handle continuous variables as well. For continuous predictors, choose the discretization option "multiply" and the numeric optimal scaling level; for the categorical predictors, choose the nominal or ordinal optimal scaling level.

If you want to explore whether continuous predictors are nonlinearly related to the DV, choose one of the spline optimal scaling levels.

Since version 17.0, CATREG offers options for regularized regression: ridge regression, the lasso, and the elastic net. Resampling options (cross-validation and the .632 bootstrap) are also available to select the optimal regularized model.
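[For intuition about what the ridge option does, here is the closed-form ridge estimator on standardized predictors, sketched in Python/numpy. The toy data and the penalty value `lam` are illustrative assumptions of mine; CATREG's regularization paths and optimal-scaling transformations are more involved than this.]

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge coefficients b = (Z'Z + lam*I)^(-1) Z'y on standardized predictors."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ yc)

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # nearly a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

print("lam = 0 (OLS):", ridge(X, y, 0.0))
print("lam = 10     :", ridge(X, y, 10.0))  # shrunk, more stable coefficients
```

As lam grows, the coefficients shrink toward zero and their sampling variance drops; the price is bias, which is why lam is normally chosen by cross-validation, as in the resampling options mentioned above.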

 

Regards,
Anita van der Kooij
Data Theory Group
Leiden University




**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.
**********************************************************************


Re: Multicollinearity

aysemel
In reply to this post by Jims More
Hi,

I have a very urgent situation and I would be very glad if someone could help me.  I need to do a ridge regression analysis for continuous variables in SPSS 20, since my VIF values are greater than 5.  How can I do that, and how do I interpret the results?  Could you please explain the method and the steps I should follow?  I am confused about whether to select spline nominal/ordinal or nominal, ordinal, or numeric.  How can I compare the results from linear regression and ridge regression?  Can I get unstandardized coefficients for ridge regression?  Is there a way to do this other than writing syntax in SPSS - can I do it from the CATREG menu?  And how can I show that, in such a multicollinearity case, ridge regression gives better results than ordinary linear regression?

Automatic reply: Multicollinearity

Mikki Haegle

I am off enjoying some vacation time.

I will return on June 13th and will respond to messages once I am back in the Lab.

~M

Mikki Haegle
Psychology Lab Coordinator
700 E 7th Street, NM-L202
St Paul, MN 55106
651.793.1354


Re: Multicollinearity

aysemel
In reply to this post by Kooij, A.J. van der

Thank you so much - this is very useful information for me. Could you please help me with how to compare the results from linear regression and ridge regression (on the same data set, of course) and show that, when there is multicollinearity, ridge regression performs better?
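[Outside SPSS, one common way to make this comparison is by holdout prediction error: fit ordinary least squares and ridge on a small, strongly collinear training set, then compare mean squared error on fresh test data. A Python/numpy simulation sketch follows; the data-generating process, the penalty value, and the replication count are all my own illustrative choices.]

```python
import numpy as np

rng = np.random.default_rng(3)

def fit(X, y, lam):
    """Least squares (lam = 0) or ridge (lam > 0); intercept unpenalized."""
    Xc = np.column_stack([np.ones(len(y)), X])
    pen = lam * np.eye(Xc.shape[1])
    pen[0, 0] = 0.0
    return np.linalg.solve(Xc.T @ Xc + pen, Xc.T @ y)

def one_run():
    # Five predictors that all track one common factor -> strong collinearity.
    def draw(n):
        f = rng.normal(size=(n, 1))
        return f + rng.normal(scale=0.1, size=(n, 5))
    Xtr, Xte = draw(30), draw(200)
    beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
    ytr = Xtr @ beta + rng.normal(size=30)
    yte = Xte @ beta + rng.normal(size=200)
    mse = []
    for lam in (0.0, 5.0):                      # OLS first, then ridge
        b = fit(Xtr, ytr, lam)
        pred = np.column_stack([np.ones(200), Xte]) @ b
        mse.append(((yte - pred) ** 2).mean())
    return mse

runs = np.array([one_run() for _ in range(100)])
print("mean test MSE  OLS: %.3f   ridge: %.3f" % tuple(runs.mean(axis=0)))
```

Averaged over the replications, ridge's test error should come out below OLS's here, because the tiny eigenvalues created by the collinear predictors inflate OLS's coefficient variance; with well-conditioned data or a badly chosen lam the ranking can reverse.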

Re: Multicollinearity

SR Millis-3
Is this a homework assignment for a class?

SR Millis







Re: Multicollinearity

aysemel

It's a project that I need to solve for my master's degree.