I am confused about the linearity assumption in Multiple Linear Regression, and I don't know which of these two interpretations is correct. 1) Does linearity mean that EACH independent variable in the model should have a linear relationship with the dependent variable? And, therefore, that the linear relationship between each independent variable and the dependent variable should be tested statistically? 2) Or does linearity mean that the PREDICTED DEPENDENT and the OBSERVED DEPENDENT should have a linear relationship? And, therefore, that to test linearity
one need only use the R-square and check whether the regression ANOVA is significant? Any comment is welcome.
Unfortunately, there is not a lot of consistency in how authors talk about "linearity" in linear regression. Some describe a regression model as linear only if the functional relationship between X and Y is linear. For these folks, a model including both X and X-squared as predictors (to account for a U-shaped functional relationship) would likely be described as a polynomial regression.
But others emphasize that OLS linear regression models are "linear in the coefficients". So even if the functional relationship is not linear--e.g., X and X-squared as predictors--it is still properly described as a linear regression model (provided it is linear in the coefficients). The following web page provides an interesting example: https://onlinecourses.science.psu.edu/stat501/node/235

The page title is "Polynomial Regression", but notice what the author says in the second paragraph: "As for a bit of semantics, it was noted at the beginning of the previous course how nonlinear regression (which we discuss later) refers to the nonlinear behavior of the coefficients, which are linear in polynomial regression. Thus, polynomial regression is still considered linear regression!"

Having said all that, I think what you're concerned about is the possibility of non-linear functional relationships. One good way to check for that is by looking at residual plots. When you run your model, save the residuals. (Several types of residuals are available, and you may want to look at more than just the raw residuals; see the Help for details.) Then make scatter-plots with Y = residual and X = fitted value of Y, or perhaps X = an explanatory variable of particular interest. Do a Google search on <residual plots> or <analysis of residuals> etc. to find more info. See also this page by the same author cited above: https://onlinecourses.science.psu.edu/stat501/node/157

HTH.
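To make the residual check concrete, here is a minimal sketch in Python/numpy (not SPSS, and with simulated data chosen for illustration): a straight-line fit to data with a hidden quadratic component leaves a U-shaped pattern in the residuals, and adding X-squared as a predictor--still a model that is linear in the coefficients--removes it.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
# Simulated data: a genuinely curved relationship plus noise (SD = 0.5).
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(scale=0.5, size=x.size)

# Fit the mis-specified straight-line model y = b0 + b1*x by least squares.
X_lin = np.column_stack([np.ones_like(x), x])
beta_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
resid_lin = y - X_lin @ beta_lin

# In a residual-vs-fitted plot these residuals would show a clear U shape:
# systematically negative in the middle of the x range, positive at the ends.
centre = resid_lin[(x > -1) & (x < 1)]
print(round(centre.mean(), 2))  # clearly negative

# Adding x**2 as a predictor (still a *linear* model in the coefficients)
# removes the pattern and shrinks the residual SD toward the true noise SD.
X_quad = np.column_stack([np.ones_like(x), x, x**2])
beta_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)
resid_quad = y - X_quad @ beta_quad
print(round(resid_quad.std(), 2))  # close to 0.5
```

In SPSS the same check is done graphically: save the residuals and fitted values from REGRESSION and scatter-plot one against the other.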
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
I quite agree with Bruce on this.
That a linear model is linear is due to the fact that it is linear in each and every coefficient/model parameter. In other words, if you take the first partial derivative of the DV with respect to each coefficient/model parameter, you obtain a quantity that is constant with respect to that coefficient/model parameter. This holds even when the parameter belongs to a quadratic/cubic term that models a nonlinear relationship between the DV and that predictor. Simply put, a linear model can model a nonlinear relationship between the DV and a predictor while the linearity assumption still holds.

Hongwei Yang, University of Kentucky
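A small numeric illustration of "linear in the coefficients" (Python/numpy, with made-up data): for yhat = b0 + b1*x + b2*x**2, the partial derivative of yhat with respect to each coefficient (namely 1, x, and x**2) does not involve the coefficients themselves, so OLS reduces to solving the linear normal equations in one step, even though the x-y relationship is curved.

```python
import numpy as np

# Hypothetical data, roughly following y = 1 + 2*x**2 with small noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 9.2, 19.1, 32.8])

# Design matrix: one column per coefficient; column j is d(yhat)/d(b_j),
# which is constant in the coefficients -- the defining property of a
# linear model.
X = np.column_stack([np.ones_like(x), x, x**2])

# Because the model is linear in b, the normal equations X'X b = X'y give
# the exact least-squares solution directly -- no iteration, unlike true
# nonlinear regression.
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b.round(2))  # close to [1, 0, 2]
```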
In reply to this post by E. Bernardo
I guess that I think of "linearity" in *3* different contexts in Multiple Linear Regression. There is the definitional part in the name, MLR; there is the underlying assumption, which is mis-used in your point (2); and there is the "generally good idea" which is mentioned in your point (1).

As to the name: Bruce does a good job in summarizing. MLR includes polynomial regression.

As to point (2): The first thing to say is that, yes, the overall test is a test for whether there is some linear relationship. However, the discussion of the "linearity assumption" is usually concerned with the absence of *nonlinearity* and the choice of the correct model, after assuming or knowing that there is some relationship to start with.

More on point (2): I would probably derive or discuss this assumption of linearity under the topic of "homogeneity of variance of the regression residuals". Looking at residuals is useful, as Bruce also prescribes. If the observed values do not have a symmetric distribution around the regression line, with similar variances along the whole line, it is at least true that the F-tests are not necessarily good, since the "effective degrees of freedom" is not as large as the test d.f. (You might see this, say, with two log-normal variables when you didn't take their logs: a linear fit with unequal residuals.) If the general fit is not linear, that implies that the residuals do not everywhere have a zero mean. Again, the test is bad; this time, there is sometimes a "fix" available through a simple transformation. Contrary to the "high R-squared" aspect of point (2), nonlinearity of fit is often easiest to spot when there is a very good fit with a very high R-square.

As to point (1): If you are not using polynomial regression with multiple transformations of some independent variables X, then, yes, it is generally nicer (because it will give results that are more robust) if you use a scaling of the X's such that each is linear with the dependent Y. But that can be less important than keeping the model sensible, so someone might insist on keeping terms as "dollars" (for example) instead of transforming.

--
Rich Ulrich
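The log-normal example above can be sketched numerically (Python/numpy, simulated data chosen to match the description, not from any real study): regress one log-normal variable on another without taking logs and the residual spread grows with the fitted value; after a log transform the relationship is linear with roughly constant spread.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two log-normal variables: linear on the log scale, curved and
# heteroscedastic on the raw scale.
log_x = rng.normal(size=1000)
log_y = 0.8 * log_x + rng.normal(scale=0.3, size=1000)
x, y = np.exp(log_x), np.exp(log_y)

def resid_spread_ratio(u, v):
    """Fit v = b0 + b1*u by OLS, then compare the residual SD in the top
    half vs the bottom half of the fitted values (a ratio near 1 suggests
    homogeneous variance)."""
    U = np.column_stack([np.ones_like(u), u])
    b, *_ = np.linalg.lstsq(U, v, rcond=None)
    fitted, resid = U @ b, v - U @ b
    hi = resid[fitted > np.median(fitted)]
    lo = resid[fitted <= np.median(fitted)]
    return hi.std() / lo.std()

print(round(resid_spread_ratio(x, y), 2))          # well above 1: unequal residuals
print(round(resid_spread_ratio(log_x, log_y), 2))  # near 1 after the log "fix"
```

This is the kind of pattern the F-test caveat refers to: on the raw scale the residual variance is far from homogeneous, and the simple log transformation repairs it.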