Non-normal/linear data & regression

A Seifert-2
Hi all,

Could anyone provide me with a "best practices" approach to analyzing non-normal and/or non-linear data with regression?  Specifically, I know that some use data transformations (log, square root, etc.), but before I take this, or any other approach, I'd like to make sure that I've got a good grasp on the pros/cons of each approach.  Any information or good outside sources would be greatly appreciated.

Thanks much!
April




=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Non-normal/linear data & regression

Kooij, A.J. van der
See chapter 1 at
https://openaccess.leidenuniv.nl/handle/1887/12096
 
Regards,
Anita van der Kooij
Data Theory Group
Leiden University

________________________________

From: SPSSX(r) Discussion on behalf of A Seifert
Sent: Mon 03/03/2008 20:40
To: [hidden email]
Subject: Non-normal/linear data & regression



Re: Non-normal/linear data & regression

David Hitchin
In reply to this post by A Seifert-2
Quoting A Seifert <[hidden email]>:
> Could anyone provide me with a "best practices" approach to analyzing
> non-normal and/or non-linear data with regression?  Specifically, I
> know that some use data transformations (log, square root, etc.), but
> before I take this, or any other approach, I'd like to make sure that
> I've got a good grasp on the pros/cons of each approach.  Any
> information or good outside sources would be greatly appreciated.
>
First of all, NONE of the variables, dependent or independent, needs to be
normally distributed. What IS assumed, at least for the usual significance
tests and confidence intervals, is that once you have fitted a regression,
the errors of estimation, i.e. the "residuals", are approximately normally
distributed.

The first thing that you have to sort out is the LINEARITY of the
regression. Plotting the original variables against each other is
sometimes enough, but if you are doing multiple regression, i.e. with
more than one independent variable, it is difficult to produce plots
which show how the combinations of variables interact. In this case you
plot the residuals against the fitted values, and against each X
variable, and look for curvature or any other systematic pattern. If you
find one, the plots may give you some idea of what to try, and you
transform some or all of the X variables (that is, the independent ones).
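
To illustrate the residual check, here is a minimal Python sketch (not SPSS syntax, and not part of the original advice). The data, the coefficients 2 and 3, and the helper names fit_line, residuals and sign_pattern are all made up for illustration. It prints the sign of each residual in order of x, as a crude stand-in for a residual plot: a long run of one sign followed by a long run of the other suggests curvature, while signs with no such runs suggest the straight line is adequate.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def residuals(x, y):
    """Observed y minus the fitted straight line, in the order of x."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def sign_pattern(res):
    """One character per residual: '+' if positive, '-' otherwise."""
    return "".join("+" if r > 0 else "-" for r in res)

# Hypothetical data: y = 2 + 3*sqrt(x) plus a small alternating disturbance.
x = [float(i) for i in range(1, 26)]
y = [2.0 + 3.0 * math.sqrt(xi) + (0.05 if i % 2 else -0.05)
     for i, xi in enumerate(x)]

raw = residuals(x, y)                                    # y regressed on x
transformed = residuals([math.sqrt(xi) for xi in x], y)  # y regressed on sqrt(x)

print("y on x:      ", sign_pattern(raw))
print("y on sqrt(x):", sign_pattern(transformed))
```

With real data you would draw a proper scatterplot of residuals against fitted values rather than print signs, but the thing to look for is the same.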

Transformations such as square roots reduce the smaller values (those
above 1) to some extent, but reduce the larger values much more. Other
transformations, such as reciprocals, have the biggest effect on the
smaller values.
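
A small numerical sketch of that point in Python. The values are arbitrary and chosen only to span small to large; nothing here comes from real data.

```python
import math

# Arbitrary positive values spanning small to large (illustration only).
values = [0.25, 1.0, 4.0, 25.0, 100.0, 400.0]

# Each row: original value, square root, natural log, reciprocal.
rows = [(v, math.sqrt(v), math.log(v), 1.0 / v) for v in values]

print(f"{'x':>8} {'sqrt(x)':>8} {'ln(x)':>8} {'1/x':>8}")
for v, root, ln, recip in rows:
    print(f"{v:8.2f} {root:8.3f} {ln:8.3f} {recip:8.4f}")
```

The gap between 100 and 400 shrinks from 300 to 10 under the square root, while the gap between 0.25 and 1 only shrinks from 0.75 to 0.5. Under the reciprocal the same two gaps become 0.0075 and 3, so the small values are now the ones spread apart (and the order is reversed).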

The other thing that you want to achieve is to get residuals about the
same size all the way along the regression line. If, for example, the
largest residuals occur against the largest Y values, you will need to
transform the Y variable.

You may need to tinker with transformations both on the X variables and
the Y variables before you get both linearity and constant expected
sizes for the residuals.
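
As a rough sketch of how unequal residual sizes show up, and how a Y transformation can even them out, here is a Python example on made-up data. It assumes the error is multiplicative (each y is 30% above or below an exponential trend), which is the textbook case where taking log(Y) helps; your data may call for something else. The helper spread_ratio is hypothetical: it compares the average absolute residual in the upper half of the x range with that in the lower half, so a value near 1 means roughly constant spread.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def spread_ratio(x, y):
    """Mean |residual| for the upper half of x divided by that for the
    lower half. Assumes x is already sorted in ascending order."""
    a, b = fit_line(x, y)
    res = [abs(yi - (a + b * xi)) for xi, yi in zip(x, y)]
    half = len(res) // 2
    lower, upper = res[:half], res[half:]
    return (sum(upper) / len(upper)) / (sum(lower) / len(lower))

# Hypothetical data with multiplicative error: y = exp(0.05*x) * (1 +/- 0.3).
x = [float(i) for i in range(1, 41)]
y = [math.exp(0.05 * xi) * (1.3 if i % 2 else 0.7) for i, xi in enumerate(x)]

raw_ratio = spread_ratio(x, y)
log_ratio = spread_ratio(x, [math.log(yi) for yi in y])

print(f"spread ratio, Y untransformed: {raw_ratio:.2f}")
print(f"spread ratio, log(Y):          {log_ratio:.2f}")
```

A single number like this is no substitute for looking at the plot; it is only meant to make the idea of "about the same size all the way along" concrete.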

Again, KEEP PLOTTING residuals and LOOKING at the plots. Beginners often
spend too much time worrying about the numerical results, which at best
only tell you that something is wrong, while the plots both show what is
wrong and give you ideas about what to do to put things right.

Now, just to tidy things up, we need to distinguish between regressions
which are "linear in the parameters" and those which are inherently
non-linear. Some theoretical models, e.g. y = a + b * sqrt(x), can be
linearised by transforming variables (here, by regressing y on sqrt(x));
the same goes for y = c + d * x*x.
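
A minimal Python sketch of what "linearised by transforming variables" means in practice. The data are generated exactly from the two models with made-up coefficients (2 and 3, then 1 and 0.5), so ordinary least squares on the transformed X should simply hand those coefficients back. The helper fit_line is illustrative, not a library function.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Model 1 (made-up coefficients): y = 2 + 3*sqrt(x). Regress y on z = sqrt(x).
y1 = [2.0 + 3.0 * math.sqrt(xi) for xi in x]
a, b = fit_line([math.sqrt(xi) for xi in x], y1)

# Model 2 (made-up coefficients): y = 1 + 0.5*x*x. Regress y on z = x*x.
y2 = [1.0 + 0.5 * xi * xi for xi in x]
c, d = fit_line([xi * xi for xi in x], y2)

print(f"y = a + b*sqrt(x): a = {a:.3f}, b = {b:.3f}")
print(f"y = c + d*x*x:     c = {c:.3f}, d = {d:.3f}")
```

In both cases the model is a straight line in the new variable z, which is all that "linear in the parameters" requires.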

Others, such as y = a + b^x (a plus b to the power of x), cannot be
handled in this way, and need non-linear regression methods. These are
not easy for the beginner, and often not easy for anyone.
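
To give a feel for what a non-linear fit involves, here is one possible Python sketch for y = a + b^x on made-up data (true a = 1.5, b = 1.3, small alternating disturbance). It is not how SPSS or any particular package does it. It relies on one convenient fact about this specific model: for any fixed b, the best a is just the mean of y - b^x, so only b has to be searched for. The search used is a golden-section search, and it assumes the sum of squared errors has a single minimum for b between 1.01 and 3, a bracket picked by hand for this example.

```python
import math

def profile_sse(b, x, y):
    """For a fixed b, the least-squares a is the mean of y - b**x.
    Returns (sum of squared errors at that a, a)."""
    shifted = [yi - b ** xi for xi, yi in zip(x, y)]
    a = sum(shifted) / len(shifted)
    return sum((s - a) ** 2 for s in shifted), a

def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimise a one-variable function on [lo, hi], assuming one minimum."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d   # minimum lies in [lo, d]
        else:
            lo = c   # minimum lies in [c, hi]
    return (lo + hi) / 2.0

# Hypothetical data: y = 1.5 + 1.3**x plus a small alternating disturbance.
x = [float(i) for i in range(11)]
y = [1.5 + 1.3 ** xi + (0.02 if i % 2 else -0.02) for i, xi in enumerate(x)]

b_hat = golden_section_min(lambda b: profile_sse(b, x, y)[0], 1.01, 3.0)
a_hat = profile_sse(b_hat, x, y)[1]

print(f"estimated a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Even in this tidy case you have to supply a search range and trust that there is only one minimum inside it. With more parameters, poor starting values or noisier data, that is exactly where non-linear regression gets difficult.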

David Hitchin

Re: Non-normal/linear data & regression

Hector Maletta
In reply to this post by Kooij, A.J. van der
Besides the good reference given by Anita van der Kooij, may I mention that
non-linear is one thing (regression is indeed about linear relationships)
but non-normal is quite another (variables do not need to be normally
distributed to be tractable by linear regression)?
What is required by linear regression is that the distribution of residuals,
i.e. the distribution of actual points about the regression line, is normal,
but the distribution of the variables themselves may perfectly well be
non-normal.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Kooij, A.J. van der
Sent: 03 March 2008 18:06
To: [hidden email]
Subject: Re: Non-normal/linear data & regression
