Hi all,
Could anyone provide me with a "best practices" approach to analyzing non-normal and/or non-linear data with regression? Specifically, I know that some use data transformations (log, square root, etc.), but before I take this or any other approach, I'd like to make sure that I've got a good grasp of the pros and cons of each. Any information or good outside sources would be greatly appreciated.

Thanks much!
April

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
|
see chapter 1 at
https://openaccess.leidenuniv.nl/handle/1887/12096

Regards,
Anita van der Kooij
Data Theory Group
Leiden University

________________________________
From: SPSSX(r) Discussion on behalf of A Seifert
Sent: Mon 03/03/2008 20:40
To: [hidden email]
Subject: Non-normal/linear data & regression
|
In reply to this post by A Seifert-2
Quoting A Seifert <[hidden email]>:
> Could anyone provide me with a "best practices" approach to analyzing
> non-normal and/or non-linear data with regression? Specifically, I
> know that some use data transformations (log, square root, etc.), but
> before I take this, or any other approach, I'd like to make sure that
> I've got a good grasp on the pros/cons of each approach. Any
> information or good outside sources would be greatly appreciated.

First of all, NONE of the variables, dependent or independent, needs to be normally distributed. What IS required, at least for the usual significance tests and confidence intervals, is that once you have fitted a regression, the errors of estimation, i.e. the "residuals", are approximately normally distributed.

The first thing that you have to sort out is the LINEARITY of the regression. Sometimes the key to this is plotting the original variables, but if you are doing multiple regression, i.e. with more than one independent variable, it is difficult to produce plots which show how the combinations of variables interact. In this case you plot the residuals, against the fitted values and against each X variable, and look for curvature or any other systematic pattern. If you find one, the plots may give you some idea of what to try, and you transform some or all of the X variables (that is, the independent ones). Transformations such as square roots reduce the smaller values to some extent, but reduce the larger values much more. Other transformations, such as reciprocals, have the biggest effect on the smaller values.

The other thing that you want to achieve is residuals of about the same size all the way along the regression line. If, for example, the largest residuals occur at the largest Y values, you will need to transform the Y variable. You may need to tinker with transformations of both the X variables and the Y variable before you get both linearity and a constant expected size for the residuals. Again, KEEP PLOTTING residuals and LOOKING at the plots.
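As a rough illustration of the residual check described above, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for the example: the data are made up (y is generated from sqrt(x) plus a small deterministic wiggle), and the helper names fit_line, residuals and mean_by_thirds are hypothetical. It summarises the residuals by thirds of the x range as a crude stand-in for a plot; in practice you would draw the actual residual plot in SPSS or any graphics tool.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def residuals(xs, ys):
    """Observed y minus fitted y for a straight-line fit of ys on xs."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def mean_by_thirds(res):
    """Mean residual in the lower, middle and upper third (data sorted by x)."""
    k = len(res) // 3
    parts = [res[:k], res[k:2 * k], res[2 * k:]]
    return [sum(p) / len(p) for p in parts]

# Made-up data (assumption): the true relationship is linear in sqrt(x), not in x.
xs = list(range(1, 51))
ys = [2 + 3 * math.sqrt(x) + 0.1 * math.sin(7 * x) for x in xs]

raw = residuals(xs, ys)                                  # regress y on x
transformed = residuals([math.sqrt(x) for x in xs], ys)  # regress y on sqrt(x)

print("mean residual by thirds, raw x:  ", [round(m, 2) for m in mean_by_thirds(raw)])
print("mean residual by thirds, sqrt(x):", [round(m, 2) for m in mean_by_thirds(transformed)])
print("largest |residual|:", round(max(map(abs, raw)), 2),
      "vs", round(max(map(abs, transformed)), 2))
```

With the untransformed x, the residuals change sign systematically along the x range, which is the numerical shadow of the curved band you would see in a plot. After the square-root transformation that pattern should largely disappear.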
Beginners often spend too much time worrying about the numerical results, which at best only tell you that something is wrong, while the plots both show what is wrong and give you ideas about what to do to put things right.

Now, just to tidy things up, we need to distinguish between regressions which are "linear in the parameters" and those which are inherently non-linear. Some theoretical models, e.g. y = a + b * sqrt(x) or y = c + d * x*x, are linear in the parameters: transform the variable (to sqrt(x) or x*x) and ordinary linear regression will fit them. Others, such as y = a + b^x (b to the power of x), cannot be handled in this way and need non-linear regression methods. These are not easy for the beginner, and often not easy for anyone.

David Hitchin
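The difference between the two kinds of model can be sketched in a few lines of Python (standard library only). This is an illustration under stated assumptions, not how SPSS or any other package fits non-linear models: the data are noise-free and made up, the helper names fit_line, profile_sse and golden_min are hypothetical, and the one-dimensional golden-section search over an assumed bracket of 1.01 to 2.0 for b is a simple stand-in for the iterative methods that real non-linear regression routines typically use.

```python
import math

def fit_line(zs, ys):
    """Ordinary least squares for y = a + b*z; returns (a, b)."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    szz = sum((z - mz) ** 2 for z in zs)
    szy = sum((z - mz) * (y - my) for z, y in zip(zs, ys))
    b = szy / szz
    return my - b * mz, b

def profile_sse(b, xs, ys):
    """For a fixed b, the best a is the mean of (y - b**x); returns (SSE, a)."""
    d = [y - b ** x for x, y in zip(xs, ys)]
    a = sum(d) / len(d)
    return sum((v - a) ** 2 for v in d), a

def golden_min(f, lo, hi, tol=1e-8):
    """Golden-section search for the minimum of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d   # the minimum lies in [lo, d]
        else:
            lo = c   # the minimum lies in [c, hi]
    return (lo + hi) / 2

xs = list(range(11))

# Case 1: y = a + b*sqrt(x) is linear in a and b, so transform x and use OLS.
ys1 = [2.0 + 0.5 * math.sqrt(x) for x in xs]
a1, b1 = fit_line([math.sqrt(x) for x in xs], ys1)

# Case 2: y = a + b**x is not linear in b, so b has to be found by searching.
ys2 = [1.5 + 1.3 ** x for x in xs]
b2 = golden_min(lambda b: profile_sse(b, xs, ys2)[0], 1.01, 2.0)
a2 = profile_sse(b2, xs, ys2)[1]

print("linear in the parameters: a =", round(a1, 4), " b =", round(b1, 4))
print("inherently non-linear:    a =", round(a2, 4), " b =", round(b2, 4))
```

In the first case a single closed-form calculation gives the answer. In the second, even this two-parameter toy needs a starting bracket and an iterative search, which hints at why real non-linear fits are harder to set up and to trust.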
|
In reply to this post by Kooij, A.J. van der
Besides the good reference given by Anita van der Kooij, may I mention that non-linear is one thing (linear regression is indeed about linear relationships) but non-normal is quite another (variables do not need to be normally distributed to be tractable by linear regression)? What linear regression requires is that the distribution of the residuals, i.e. the distribution of the actual points about the regression line, is normal; the distribution of the variables themselves may perfectly well be non-normal.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Kooij, A.J. van der
Sent: 03 March 2008 18:06
To: [hidden email]
Subject: Re: Non-normal/linear data & regression
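To illustrate the distinction drawn in this reply between the distribution of the variables and the distribution of the residuals, here is a small simulated sketch in Python (standard library only). All of it is assumed for the example: the predictor is drawn from an exponential distribution, the errors are normal, the coefficients 1.0 and 2.0 and the seed are arbitrary, and the helper names fit_line and skewness are hypothetical. Skewness is used only as a quick numerical summary; a histogram or normal probability plot of the residuals would be the more informative check.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def skewness(values):
    """Moment-based sample skewness (close to 0 for a symmetric distribution)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated data (assumption): strongly skewed predictor, normal errors.
xs = [rng.expovariate(1.0) for _ in range(2000)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

a, b = fit_line(xs, ys)
res = [y - (a + b * x) for x, y in zip(xs, ys)]

print("skewness of x:        ", round(skewness(xs), 2))
print("skewness of y:        ", round(skewness(ys), 2))
print("skewness of residuals:", round(skewness(res), 2))
```

Both x and y come out clearly skewed, yet the residuals are close to symmetric, so in this simulated case the non-normality of the variables is no obstacle to the regression.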
