I would be grateful for advice on the issues around testing for normal distribution of residuals, in both simple linear regression and multiple linear regression (in case there should be a difference).
So I understand that, contrary to a common misconception, one should not really be testing whether the DV is normally distributed in a regression, but rather whether the residuals (observed minus predicted values) are normally distributed. If that is the case, then what is the most appropriate way to test for that? The Laerd Statistics website (which I rely on a fair amount, not being a statistician) advises "using either a histogram (with a superimposed normal curve) or a Normal P-P Plot": https://statistics.laerd.com/spss-tutorials/multiple-regression-using-spss-statistics.php

I think I understand how to generate those, but I wonder why one would not simply run a normality test on the residuals (e.g. KS or SW)?

Also, supposing the residuals are not normally distributed, how does one deal with that? Can the usual transformations (sqrt, log, log10, with reflection if negative skew) be used on the residuals to try to get a more normal distribution? Which residuals should one be looking at (unstandardised, standardised, studentised, etc.), or does it not matter?

Thanks for any help received. Please bear in mind that I'm not a professional statistician, just a researcher who does a fair amount of quantitative research, mainly in the social sciences. Thanks.
Administrator
The major problem with testing for normality is that the test of normality is under-powered when n is low (and normality of residuals is most important), but quickly becomes over-powered as n increases (and normality of residuals becomes less important). That is why many authors advise looking at various types of plots rather than using statistical tests of normality.
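To make that trade-off concrete, here is a minimal simulation sketch in Python (it is not part of the original post). It fits a simple linear regression to data whose errors are only mildly skewed, tests the residuals for normality, and reports how often the test rejects at a small and at a large n. To keep it dependency-free, it uses the Jarque-Bera test rather than KS or SW, and its chi-square p-value is only a rough approximation when n is small. The function names, the gamma(shape = 8) error distribution, the regression coefficients, and the sample sizes are all illustrative assumptions.

```python
import math
import random


def ols_residuals(x, y):
    """Residuals (observed minus fitted) from a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def jarque_bera_p(values):
    """Approximate p-value of the Jarque-Bera normality test.

    Stand-in for KS/SW (an assumption of this sketch). The statistic is
    referred to a chi-square distribution with 2 df, whose upper-tail
    probability is exp(-x / 2).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return math.exp(-jb / 2.0)


def rejection_rate(n, n_sims=200, alpha=0.05, seed=1):
    """Share of simulated regressions whose residuals 'fail' the normality test."""
    rng = random.Random(seed)
    shape = 8.0  # hypothetical choice: gamma(8, 1) errors are only mildly skewed
    rejections = 0
    for _ in range(n_sims):
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        # Centre the gamma errors so they have mean zero.
        y = [1.0 + 0.5 * xi + (rng.gammavariate(shape, 1.0) - shape) for xi in x]
        if jarque_bera_p(ols_residuals(x, y)) < alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    for n in (20, 2000):
        print(f"n = {n}: normality rejected in {rejection_rate(n):.0%} of simulations")
```

The error distribution is the same in both runs; only n changes. Comparing the two rejection rates shows how the same mild departure from normality tends to go undetected in small samples and to be flagged almost every time in large ones.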
I haven't had time to read it carefully yet, but this Minitab blog post looks to be on topic: http://blog.minitab.com/blog/adventures-in-statistics/how-important-are-normal-residuals-in-regression-analysis Notice the links to 2 "white papers" in the final paragraph--they supposedly give more details about the simulation studies. HTH.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Thanks Bruce - that is very useful to know, regarding why people would not use the normality tests. I will check out the link provided.
Regards
Mike

From: Bruce Weaver [via SPSSX Discussion]
Sent: 12 December 2016 21:12
To: researcher
Subject: Re: normal distribution of residuals - appropriate tests for