Interpretation from transformed variable in ANOVA


Jatender Mohal
Hi list,

While working with GLM Univariate, I transformed the skewed dependent
variable so that it follows a normal distribution.

When interpreting the descriptive statistics estimated from the model, do I
need to consider the values after transformation? If yes, how might that
affect the interpretation?

Thanks in advance



Mohal

Re: Interpretation from transformed variable in ANOVA

Hector Maletta
You shouldn't seek an easy applause
By cheating on assumptions.
You shouldn't try to force a Gauss
With such daring presumption.

However, if you do, the interpretation is on the transformed data. For
instance, if you worked on the logarithm of the original variable, any
conclusion from ANOVA refers to the logarithm, not to the original variable.
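
As a minimal sketch of what "the conclusion refers to the logarithm" means in
practice (Python, standard library only; the two groups and their values are
made up for illustration and do not come from this thread): a mean computed on
the log scale back-transforms to a geometric mean, and a difference of log
means back-transforms to a ratio.

```python
import math
import statistics

# Hypothetical right-skewed measurements for two groups (made-up numbers).
group_a = [1.0, 2.0, 4.0, 8.0, 16.0]
group_b = [2.0, 4.0, 8.0, 16.0, 32.0]


def mean_of_logs(values):
    """Mean on the natural-log scale, i.e. the scale the ANOVA was run on."""
    return statistics.fmean(math.log(v) for v in values)


def back_transformed_mean(values):
    """exp(mean of logs): the geometric mean, not the arithmetic mean."""
    return math.exp(mean_of_logs(values))


# A difference between group means on the log scale...
log_diff = mean_of_logs(group_b) - mean_of_logs(group_a)
# ...corresponds to a ratio of geometric means on the original scale.
ratio = math.exp(log_diff)

print("arithmetic mean of A:            ", statistics.fmean(group_a))
print("back-transformed (geometric) A:  ", back_transformed_mean(group_a))
print("difference of log means (B - A): ", log_diff)
print("ratio of geometric means (B / A):", ratio)
```

With these numbers the arithmetic mean of group A is 6.2, while the
back-transformed log-scale mean is about 4. Descriptives reported from a model
fitted to the logarithm therefore describe geometric means and ratios, not
arithmetic means and differences.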

Hector


(Nor should you criticize this poem for failing to have a proper rhyme).

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Jatender Mohal
Sent: Friday, August 25, 2006 9:53 AM
To: [hidden email]
Subject: Interpretation from transformed variable in ANOVA


Re: Interpretation from transformed variable in ANOVA

Jatender Mohal
Hello Hector,

Thanks for the signal!

Still, a few things come to mind...
Screening a continuous variable for the normality assumption (univariate or
multivariate) is an important early step in inferential statistics. If the
data are not normal, the possible remedies are:
1. Nonparametric tests
2. A suitable transformation of the non-normal data
Nonparametric tests are fine, but why is the solution considered degraded
when the data are forced toward normality by a suitable transformation in
order to satisfy the model's assumptions?
Or is this a rather subjective issue, given that most inferential statistics
are robust to departures from their assumptions?

Kind regards

Mohal


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Hector Maletta
Sent: 25 August 2006 15:04
To: [hidden email]
Subject: Re: Interpretation from transformed variable in ANOVA


Re: Interpretation from transformed variable in ANOVA

Hector Maletta
The matter has been discussed many times on this list. A normal (Gaussian)
frequency distribution of the dependent variable is NOT a requirement of the
General Linear Model in its various incarnations such as ANOVA or linear
regression. Normality enters the scene in two places:
1. Normal sampling distribution: The sample is a random sample, and
therefore differences between all possible samples follow a normal
distribution, whose mean tends to coincide with the population mean as
sample size increases. As a result, normal significance tests apply.
2. Normal distribution of residuals: In a regression equation like y=a+bx+e,
errors or residuals "e" for each value of X are normally distributed about
the Y value predicted by the regression equation, with zero mean. Therefore,
the least squares algorithm applies.
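
The second point can be illustrated with a small simulation. This is only a
sketch under stated assumptions (Python, standard library only; the
coefficients, the exponential predictor and the sample size are invented for
the illustration): the dependent variable y comes out clearly skewed because
the predictor is skewed, yet the residuals around the fitted line are roughly
symmetric, and it is the residuals that the assumption concerns.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Made-up data: a strongly right-skewed predictor x, normal errors e,
# and y = a + b*x + e. None of these values come from the thread.
a_true, b_true = 2.0, 3.0
x = [random.expovariate(1.0) for _ in range(500)]
y = [a_true + b_true * xi + random.gauss(0.0, 1.0) for xi in x]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def skewness(values):
    """Sample skewness: mean of the cubed standardized deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


a_hat, b_hat = fit_line(x, y)
residuals = [yi - (a_hat + b_hat * xi) for xi, yi in zip(x, y)]

print("estimated a, b:       ", a_hat, b_hat)
print("skewness of y:        ", skewness(y))
print("skewness of residuals:", skewness(residuals))
print("mean of residuals:    ", statistics.fmean(residuals))
```

Screening the raw dependent variable for normality would flag y here, even
though the model's assumption about the errors holds by construction.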
Now, although it is not absolutely forbidden, a variable whose distribution
is extremely skewed may have a very high variance, and the sample size
required to reach a given standard error or a given level of significance
will be correspondingly larger. Also, if the variable's distribution is
extremely skewed, some extreme values may have a disproportionate influence
on the results, and the situation is also likely to be accompanied by
heteroscedasticity (i.e. the variance of the residuals may differ across
different parts of the variable's range). Notice, however, that
non-normality of the variable is neither a necessary nor a sufficient cause
of heteroscedasticity. The latter can be present in the tail of a normal
curve, or absent in the tail of a skewed curve. Also, notice that the
literature tends to suggest that moderate heteroscedasticity is tolerable,
in the sense of not causing immoderate damage to the quality of results
obtained by regression or ANOVA.
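
A rough way to look for heteroscedasticity is to compare the spread of the
residuals in different parts of the predictor's range. The sketch below is a
crude split-sample comparison, not a formal test, written in Python with the
standard library only; the helper name residual_variance_ratio and all
simulated values are assumptions made for the illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def residual_variance_ratio(x, y):
    """Residual variance in the upper half of x divided by the lower half."""
    a, b = fit_line(x, y)
    # Pair each x with its residual and order the pairs by x.
    pairs = sorted((xi, yi - (a + b * xi)) for xi, yi in zip(x, y))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.pvariance(high) / statistics.pvariance(low)


# Made-up data: same line, two different error structures.
x = [random.uniform(1.0, 10.0) for _ in range(400)]
y_equal = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]
y_fanning = [1.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

ratio_equal = residual_variance_ratio(x, y_equal)      # constant spread
ratio_fanning = residual_variance_ratio(x, y_fanning)  # spread grows with x

print("variance ratio, constant spread:", ratio_equal)
print("variance ratio, fanning spread: ", ratio_fanning)
```

A ratio near 1 is what constant residual variance would produce; a ratio well
above 1 indicates that the residuals fan out as x grows. How large a ratio is
still "moderate" is a judgment call that this sketch does not settle.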
Hector

-----Original Message-----
From: Jatender Mohal [mailto:[hidden email]]
Sent: Friday, August 25, 2006 11:44 AM
To: 'Hector Maletta'
CC: [hidden email]
Subject: RE: Interpretation from transformed variable in ANOVA


Re: Interpretation from transformed variable in ANOVA

Hector Maletta
In reply to this post by Jatender Mohal
You are welcome. Here is a small correction to the first point of my
message, which may seem obvious to many but is worth clarifying anyway.
Added words are capitalized:
1. Normal sampling distribution: The sample is a random sample, and
therefore differences between THE MEANS OF all possible samples OF THE SAME
POPULATION follow a normal distribution whose mean tends to coincide with
the population mean as sample size increases. As a result, normal
significance tests apply.
Hector

-----Original Message-----
From: Jatender Mohal [mailto:[hidden email]]
Sent: Friday, August 25, 2006 1:58 PM
To: 'Hector Maletta'
Subject: RE: Interpretation from transformed variable in ANOVA

Hector,

Thanks for your suggestions

Kind regards
Mohal


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Hector Maletta
Sent: 25 August 2006 16:25
To: [hidden email]
Subject: Re: Interpretation from transformed variable in ANOVA


Re: Interpretation from transformed variable in ANOVA

Hector Maletta
In reply to this post by Jatender Mohal
Or more exactly:
1. Normal sampling distribution: The sample is a random sample, and
therefore THE MEANS OF all possible samples OF THE SAME POPULATION follow a
normal distribution whose mean tends to coincide with the population mean as
sample size increases. As a result, normal significance tests apply.
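
The corrected statement can be checked with a small simulation. This is a
sketch only (Python, standard library only; the exponential population, the
sample sizes and the number of repetitions are assumptions chosen for the
illustration): means of repeated random samples from a skewed population are
themselves much less skewed once the samples are reasonably large, and they
centre on the population mean.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible


def skewness(values):
    """Sample skewness: mean of the cubed standardized deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def sample_means(sample_size, n_samples=2000):
    """Means of repeated random samples from an exponential population.

    The population is strongly right-skewed and has mean 1.0.
    """
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


small_means = sample_means(2)    # tiny samples: means are still skewed
large_means = sample_means(100)  # larger samples: means are close to normal

for label, means in (("n = 2", small_means), ("n = 100", large_means)):
    print(label,
          "| mean of sample means:", statistics.fmean(means),
          "| skewness of sample means:", skewness(means))
```

The population itself stays skewed throughout; only the distribution of the
sample means approaches the normal shape as the sample size grows.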
Hector

-----Original Message-----
From: Hector Maletta [mailto:[hidden email]]
Sent: Friday, August 25, 2006 2:04 PM
To: 'Jatender Mohal'
CC: '[hidden email]'
Subject: RE: Interpretation from transformed variable in ANOVA
