I'm a little confused. Multicollinearity is a problem that can affect our regression results when the independent variables are correlated with each other. But I often see regression models like this:

y = B0 + B1*Factor1 + B2*(Factor1)^2

Wouldn't Factor1 and (Factor1)^2 be highly correlated, resulting in a big collinearity problem? Any ideas why it's OK here? Thanks.
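For concreteness, here is a minimal sketch of how a model of this form is typically set up in SPSS; the variable names y and factor1 are hypothetical placeholders, not taken from the post.

COMPUTE factor1sq = factor1**2.
EXECUTE.
REGRESSION
  /STATISTICS COEFF R ANOVA
  /DEPENDENT y
  /METHOD=ENTER factor1 factor1sq.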
I believe that it is the combination of the linear and squared variables that together gives you the curvilinear effect. You are neither interested in nor able to look only at the linear effect when the quadratic is in the equation; you can only evaluate the squared effect.

matt

Matthew Pirritano, Ph.D.
Research Analyst IV
County of Orange
Medical Services Initiative (MSI)
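A common way to act on this advice, sketched here with the same hypothetical variable names as above, is hierarchical entry: enter the linear term in one block and the squared term in a second, then judge the quadratic by the change in R squared (the CHA keyword):

REGRESSION
  /STATISTICS COEFF R ANOVA CHA
  /DEPENDENT y
  /METHOD=ENTER factor1
  /METHOD=ENTER factor1sq.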
In reply to this post by jimjohn
The problem, Jim John, arises not exactly when the independent variables are correlated, but when they are (1) linearly correlated and (2) the correlation is nearly 1. Between a variable and its square there is no exact linear correlation, except perhaps an approximately linear relationship over small ranges of variation.

The real problem, to be more precise, is that no independent variable may be a perfect linear function of the rest of the independent variables. Imagine, for instance, having one variable called TODAY, another called DATEOFBIRTH, and a third called AGETODAY. One of them is redundant, and the covariance matrix would be singular (i.e. it would have a zero determinant). Since computing the regression coefficients involves dividing by that determinant, it would mean dividing by zero, and no solution would exist. When the determinant is NEARLY zero, such as 0.000000001, a small change in any of the variables may cause large changes in the estimated coefficients, leading to unstable solutions. Moderate (or even relatively high) correlations among independent variables do not have this effect and can be tolerated.

The TOLERANCE criterion in the REGRESSION command (used by the STEPWISE method, for instance) decides whether or not to accept a new variable into the equation. A candidate variable's tolerance is 1 minus the squared multiple correlation between it and the variables already in the equation; when the tolerance falls below the criterion, the variable is not entered, because it would cause practical multicollinearity, i.e. a very unstable solution.

Hope this clarifies the issue.

Hector
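These diagnostics can be requested directly in SPSS. A sketch with hypothetical variable names: the TOL keyword prints each predictor's tolerance and VIF, and the TOLERANCE keyword on the CRITERIA subcommand sets the entry threshold Hector describes.

REGRESSION
  /STATISTICS COEFF R ANOVA TOL
  /CRITERIA=PIN(.05) POUT(.10) TOLERANCE(.0001)
  /DEPENDENT y
  /METHOD=STEPWISE x1 x2 x3.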
Actually, the correlation between x and x squared is around .97 for values of x between 0 and 10. This can get worse as you add x cubed and higher powers. Thus, we often suggest centering such variables before powering them for use in these analyses.

Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center - Houston
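A sketch of the centering Paul describes, for a hypothetical variable x scored roughly 0 to 10. The mean of 5 below is illustrative; in practice take it from the DESCRIPTIVES output. When the distribution of x is roughly symmetric, the centered variable and its square are nearly uncorrelated.

* Inspect the mean of x, then build raw and centered squares.
DESCRIPTIVES VARIABLES=x.
COMPUTE xsq = x**2.
COMPUTE xc = x - 5.
COMPUTE xcsq = xc**2.
EXECUTE.
* x with xsq should be very high; xc with xcsq should be near zero.
CORRELATIONS /VARIABLES=x xsq xc xcsq.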
In reply to this post by mpirritano
Matt,
Can you recommend a reference on the interpretation of nonlinear effects, particularly quadratic and cubic? Thanks.

Johnny Amora
I've found the Sage publication "Interaction Effects in Multiple Regression" (Jaccard & Turrisi, 2003) helpful in understanding how to interpret both linear interactions and interactions involving a variable raised to a power.

A quick perusal of Tabachnick and Fidell points to the following source for more on understanding nonlinear effects in regression: Aiken & West (1991), Multiple Regression: Testing and Interpreting Interactions.

Matt

Matthew Pirritano, Ph.D.
Research Analyst IV
County of Orange
Medical Services Initiative (MSI)
Thanks for all the answers and advice, guys! Just one more question. I have a model with the following three variables: VIRM, VIRM^2, and BARate. On its own, VIRM has an inverse effect on my dependent variable, and so does VIRM^2. But when all these variables are in the model together, VIRM^2 has a positive coefficient. Is this OK, or does it indicate some kind of problem with the model? Thanks!
Jim John,
I do not think it represents a problem. The negative univariate effect of VIRM^2 probably just reflects the linear effect of VIRM itself. Once the linear effect of VIRM is taken into account, the quadratic effect turns out to be positive (perhaps because over the relevant range of VIRM the curve declines at a decreasing rate; that is, the quadratic component attenuates the linear effect).

Hector
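One way to see why a negative linear coefficient can coexist with a positive quadratic one: the fitted curve y = B0 + B1*VIRM + B2*VIRM^2 has slope B1 + 2*B2*VIRM, which is zero at VIRM = -B1/(2*B2); with B1 < 0 and B2 > 0 the curve falls up to that point and flattens or rises beyond it. A quick hand check in SPSS, plugging in purely illustrative coefficients (not from any real output):

* Illustrative values B0 = 2, B1 = -0.8, B2 = 0.05 give a turning point at VIRM = 8.
COMPUTE yhat = 2 - 0.8*virm + 0.05*virm**2.
EXECUTE.
GRAPH /SCATTERPLOT(BIVAR)=virm WITH yhat.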
