I had a dataset with 7 IVs and one DV and wanted to use multiple linear regression. The DV was very far from normal (according to the usual tests), with a very strong negative skew, so I used reflection and a SQRT transformation and (to my pleasant surprise) the DV then passed normality tests (p > .05). This was good, but I noticed that the direction of the relationship between one of the IVs and the DV had changed, e.g. what had been a strongly positive beta (0.9) had become approx. -0.9.
So I am wondering: is it 'usual' to also transform the IVs when using this sort of transformation on the DV? Some other assumptions of MLR were met: no issues with multicollinearity or heteroscedasticity. BTW I realise logistic regression would have been an option, but there is a small N (45) and larger samples are generally recommended with logistic. Initial experiments with this gave a very large Exp(B) for one IV, yet it was still not significant... I am not a statistician. I am a mixed methods researcher with some experience in quantitative research, so highly technical answers may well go over my head... Any help appreciated though. Thanks |
Or possibly one just allows for the reversal in interpretation, i.e. if you did reflection + SQRT transformation of the DV, and the IV now shows a negative relationship with the DV (e.g. beta -.07), then one would interpret that as meaning that "in reality" the relationship is positive (beta .07)?
|
In reply to this post by researcher
The normality assumption applies to the error terms, not the dependent variable. So while transformations of the dependent variable may be appropriate, that does not follow from its nonnormality. Second, you say that logistic regression would have been appropriate, but logistic is for a dichotomous variable, which would, of course, have a nonnormal distribution. If that is what you have, then the transformations described do not make sense, and avoiding logistic or other similar methods is probably the wrong strategy. On Mon, Nov 14, 2016 at 6:46 AM, researcher <[hidden email]> wrote: I had a dataset with 7 IVs and the DV : wanted to use linear multiple linear |
Hi Jon
Thanks for that. On logistic: yes, I am aware of the need for a dichotomous DV and recoded the DV appropriately to get that when I was exploring the option of logistic regression. However, what I am interested in is how, in multiple linear regression, to interpret the beta of IVs when the DV (continuous) has been transformed through reflection and SQRT. Thank you |
Administrator
|
In reply to this post by Jon Peck
Re the normality assumption applying to the errors, HEAR, HEAR! ;-)
Perhaps the OP meant ordinal logistic regression? To the OP: Please tell us what the variables are. And please note that with n = 45 and 7 explanatory variables, you have a recipe for severe over-fitting, regardless of what type of model it is. See Mike Babyak's nice article on over-fitting for more info about that. https://people.duke.edu/~mababyak/papers/babyakregression.pdf HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
No, I really didn't mean ordinal logistic regression. And I am not sure that it matters, for the purposes of this question, what the variables were? My question is about how, in multiple linear regression, to interpret the beta values of IVs when the DV has been transformed using reflection and SQRT.
|
In reply to this post by researcher
Ok. What I am suggesting is that you go back to the original formulation and assess the normality issue by looking at residuals, not at the dependent variable. The residual histogram and the other plots available from the regression, particularly residuals vs fitted values, would be the place to start. On Mon, Nov 14, 2016 at 7:30 AM, researcher <[hidden email]> wrote: Hi Jon |
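For readers who want to see the point about residuals in numbers, here is a minimal sketch in plain Python (not SPSS syntax). Everything in it is an assumption made for illustration: the data are simulated, there is a single predictor rather than seven, and the sample is larger than the N = 45 in this thread so the pattern is easy to see. It shows a DV that is strongly skewed while the residuals from the regression are roughly symmetric, which is the quantity the normality assumption is about.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def skewness(v):
    """Sample skewness (third standardized moment, population form)."""
    m = sum(v) / len(v)
    m2 = sum((a - m) ** 2 for a in v) / len(v)
    m3 = sum((a - m) ** 3 for a in v) / len(v)
    return m3 / m2 ** 1.5

random.seed(0)
n = 500  # hypothetical; larger than the thread's N = 45 for a clearer picture

# Hypothetical predictor with a strong negative skew
x = [10 - random.expovariate(1.0) for _ in range(n)]
# DV is linear in x with symmetric (normal) errors, so it inherits x's skew
y = [2 + 3 * xi + random.gauss(0, 1) for xi in x]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"skewness of DV:        {skewness(y):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

In SPSS the equivalent check would be the residual histogram and the residuals vs fitted values plot mentioned above; the skewness statistic here is only a stand-in for looking at those plots.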
In reply to this post by researcher
So you originally had (say) something like a 0-100 score like a class grade, where a high score meant knowledge. After you reversed it, a high score meant "error-score". You can either talk about (1) a high correlation between knowledge and a predictor; or (2) a high negative correlation between errors and the predictor, if you insist on not omitting the sign of the prediction in your statement.
One simple solution might be to take one more step in your transformation: reverse the scoring of the transformed variable. "A better-distributed test score than the usual score was created as follows. To avoid confusion, it runs from 0 to 10 instead of 0-100. This 0-to-10 score was computed by subtracting the square root of the error score from 10, preserving the original notion that a high score is good."
Most people would probably not bother with that. They would toss in a comment, "You see negative r's with the test score because the transformed version runs in the opposite direction."
-- Rich Ulrich From: SPSSX(r) Discussion <[hidden email]> on behalf of researcher <[hidden email]>
Sent: Monday, November 14, 2016 9:48 AM To: [hidden email] Subject: Re: TRANSFORMATION OF IVS and DVs with reflection - multiple linear regression |
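The 0-to-10 rescoring described in Rich's post can be sketched in a few lines of Python. This is an illustration only: the function name `rescore` and the 0-100 scale are taken from his hypothetical class-grade example, not from the original poster's data.

```python
import math

def rescore(score, max_score=100):
    """Reflect, take the square root, then reflect again (0 to 10 scale)."""
    error_score = max_score - score   # reflection: high now means many errors
    # Subtracting from sqrt(max_score) = 10 makes a high value "good" again
    return math.sqrt(max_score) - math.sqrt(error_score)

for s in (0, 19, 64, 75, 100):
    print(s, rescore(s))
# 0 0.0
# 19 1.0
# 64 4.0
# 75 5.0
# 100 10.0
```

Because the rescored variable rises whenever the original score rises, a predictor that was positively related to the original score stays positively related to the rescored one.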
In reply to this post by Jon Peck
Thanks Jon.
I am afraid I do not have time to do that in this case. The cruder answer to this problem, as I have found through further reading, is either a) interpret in reverse (i.e. a negative relationship between IV and DV, where the DV has been reflected, should actually be understood as a positive relationship) or b) re-reflection. Thanks for your help |
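Both options, (a) and (b), can be checked on simulated data. The sketch below is plain Python rather than SPSS syntax and rests on assumptions chosen for illustration: one hypothetical predictor `x`, a made-up negatively skewed DV with a ceiling at 100, and the common "largest value + 1" reflection constant. It shows the slope changing sign after reflection + SQRT, and changing back after re-reflection.

```python
import math
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (single predictor)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

random.seed(1)
n = 45
x = [random.uniform(0, 10) for _ in range(n)]
# Hypothetical negatively skewed DV (ceiling at 100) that rises with x
y = [100 - (9 - 0.7 * xi + random.gauss(0, 0.8)) ** 2 for xi in x]

k = max(y) + 1                           # reflection constant: largest value + 1
y_t = [math.sqrt(k - yi) for yi in y]    # reflect, then square root
y_rr = [max(y_t) + 1 - t for t in y_t]   # option (b): re-reflect the result

b_raw = ols_slope(x, y)      # original DV
b_t = ols_slope(x, y_t)      # reflected + SQRT DV: sign is reversed
b_rr = ols_slope(x, y_rr)    # re-reflected DV: sign is restored

print(f"original DV:          {b_raw:+.3f}")
print(f"reflected + SQRT DV:  {b_t:+.3f}")
print(f"re-reflected DV:      {b_rr:+.3f}")
```

Note that re-reflection only flips the sign: the slope for the re-reflected DV is exactly the negative of the slope for the reflected + SQRT DV, and both are in square-root units rather than the units of the original DV.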
In reply to this post by Rich Ulrich
Thanks Rich - I didn't see your message till now. Yes, I came to that conclusion in the end too, but the re-reflection might be worth it in this case rather than trying to explain to the audience that they have to think of it the other way round... and it should only take a few minutes to do in SPSS.
Thanks for the post - useful to know I was on the right track |
Administrator
|
In reply to this post by researcher
Okay. Did you mean binary logistic regression? I ask, because you haven't actually said so. You've only said it was not ordinal logistic regression. It could be you meant multinomial logistic regression. My point is that you're being rather stingy with information about the problem.
I asked what the variables actually are because knowing what they are might be helpful to someone who is trying to help you. E.g., some types of variables suggest certain transformations. Depending on what the variables are, someone might have suggested something other than taking square roots and reflecting. HTH.
--
Bruce Weaver |
Bruce I appreciate your help but if you read my posts it is clear that I am asking about multiple linear regression and only mentioned logistic in passing (as something that I had considered but rejected). I agree I could have said more about the variables etc but time was short, on a deadline, and I did not think it was that relevant for the problem. I do appreciate the expertise available in this group but in this case was just after a 'quick and dirty' solution for want of a better term.
|
As a side note, consider that transformation might not be necessary. Many people jump to non-parametric tests or transformations right away, but regression is fairly robust to violations of normality, and even if normality is violated the regression coefficients remain unbiased (although the significance tests may not be trustworthy). The issue with normality can also be overcome with bootstrapping. There are additional options like robust regression (through the R plug-in).
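To make the bootstrapping suggestion concrete, here is a rough sketch of a percentile bootstrap confidence interval for a regression slope, written in plain Python rather than SPSS syntax. The data, the single predictor, the number of resamples `B` and the choice of case resampling with a percentile interval are all assumptions made for illustration; other bootstrap variants exist.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (single predictor)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

random.seed(2)
n = 45
x = [random.uniform(0, 10) for _ in range(n)]
# Hypothetical DV with skewed (non-normal) errors centred on zero
y = [5 + 2 * xi + 3 * (random.expovariate(1.0) - 1) for xi in x]

B = 2000  # number of bootstrap resamples (an arbitrary illustrative choice)
boot = []
for _ in range(B):
    # Resample whole cases (x, y pairs) with replacement
    idx = [random.randrange(n) for _ in range(n)]
    boot.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))

boot.sort()
ci_low = boot[int(0.025 * B)]        # approx. 2.5th percentile
ci_high = boot[int(0.975 * B) - 1]   # approx. 97.5th percentile

print(f"slope estimate: {ols_slope(x, y):.3f}")
print(f"95% percentile bootstrap CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

The interval comes from the spread of the resampled slopes, so it does not lean on the errors being normally distributed.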
As far as interpretation of coefficients is concerned, you can back-transform your coefficients after most types of transformations. The caveat is that you can't back-transform SEs, only CIs. This is a helpful paper about assumptions of regression: Williams MN, Grajales CAG, Kurkiewicz D (2013) Assumptions of multiple regression: correcting two misconceptions. Practical Assessment, Research & Evaluation 18(11). http://pareonline.net/getvn.asp?v=18&n=11 |