Hi list,
While working with GLM Univariate, I transformed a skewed dependent variable so that it follows a normal distribution. When interpreting the descriptive statistics estimated from the model, do I need to consider the values after transformation? If yes, how does that affect the interpretation?
Thanks in advance,
Mohal
You shouldn't seek an easy applause
By cheating on assumptions.
You shouldn't try to force a Gauss
With such daring presumption.

However, if you do, the interpretation is on the transformed data. For instance, if you worked on the logarithm of the original variable, any conclusion from ANOVA refers to the logarithm, not to the original variable.

Hector
(Nor should you criticize this poem for failing to have a proper rhyme.)
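To make the point about the logarithm concrete, here is a minimal Python sketch (an editorial illustration, not part of the original reply). The values are made up and only the standard library is used. It shows that back-transforming a mean computed on the log scale gives the geometric mean, not the arithmetic mean of the original variable.

```python
import math
import statistics

# Hypothetical right-skewed measurements (made up for illustration).
values = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]

# Analysis on the transformed scale: the mean of the logarithms.
log_values = [math.log(v) for v in values]
mean_of_logs = statistics.mean(log_values)

# Back-transforming the mean of the logs gives the geometric mean,
# which differs from the arithmetic mean of the raw data.
back_transformed = math.exp(mean_of_logs)
arithmetic_mean = statistics.mean(values)

print(f"mean of log(values):   {mean_of_logs:.4f}")
print(f"exp(mean of logs):     {back_transformed:.4f}")
print(f"arithmetic mean (raw): {arithmetic_mean:.4f}")
```

By the same logic, a difference d between two group means on the log scale corresponds to a ratio exp(d) of geometric means on the original scale, not to a difference of arithmetic means.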
Hello Hector,
Thanks for the signal! Still, some things remain on my mind. Screening a continuous variable for the normality assumption (univariate or multivariate) is an important early step in inferential statistics. If the data are not normal, the possible solutions are:
1. Nonparametric tests
2. A suitable transformation of the non-normal data
Nonparametric tests are fine, but why is the solution degraded when the data are forced to normality by a suitable transformation in order to satisfy the model's assumptions? Or is this a rather subjective issue, given that most inferential statistics are robust to departures from their assumptions?
Kind regards,
Mohal
The matter has been discussed many times on this list. A normal (Gaussian) frequency distribution of the dependent variable is NOT a requirement of the General Linear Model in its various incarnations such as ANOVA or linear regression. Where normality enters the scene is in two places:

1. Normal sampling distribution: The sample is a random sample, and therefore differences between all possible samples follow a normal distribution, whose mean tends to coincide with the population mean as sample size increases. As a result, normal significance tests apply.
2. Normal distribution of residuals: In a regression equation like y = a + bx + e, the errors or residuals "e" for each value of x are normally distributed about the y value predicted by the regression equation, with zero mean. Therefore, the least squares algorithm applies.

Now, even if not absolutely forbidden, a variable whose distribution is extremely skewed may nonetheless have a very high variance, and the sample size required to obtain a given level of standard error or a given level of significance will be correspondingly larger. Also, if the variable distribution is extremely skewed, some extreme values may have a disproportionate influence on the results; and the situation is also likely to be accompanied by heteroskedasticity (i.e. the variance of residuals may be different for different parts of the variable range). Notice, however, that non-normality of the variable is neither a necessary nor a sufficient condition for heteroskedasticity. The latter can be present in the tail of a normal curve, or absent in the tail of a skewed curve. Also notice that the literature tends to suggest that moderate heteroskedasticity is tolerable, in the sense of not causing immoderate damage to the quality of results obtained by regression or ANOVA.
Hector
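The distinction drawn above between a skewed dependent variable and normally distributed residuals can be sketched with a small simulation. This is an editorial illustration under stated assumptions, not something from the original post: the intercept, slope, seed, sample size and the exponential predictor are all hypothetical choices, and the `skewness` helper is defined here only for the example.

```python
import random
import statistics


def skewness(data):
    """Third standardized moment (population form) of a list of numbers."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((v - m) / s) ** 3 for v in data) / len(data)


random.seed(0)  # fixed seed so the run is reproducible
n = 5000
a, b = 2.0, 3.0  # hypothetical intercept and slope

# Strongly right-skewed predictor, normal errors with zero mean.
x = [random.expovariate(1.0) for _ in range(n)]
e = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [a + b * xi + ei for xi, ei in zip(x, e)]

# Ordinary least squares for a single predictor, computed by hand.
mx, my = statistics.fmean(x), statistics.fmean(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
b_hat = sxy / sxx
a_hat = my - b_hat * mx

residuals = [yi - (a_hat + b_hat * xi) for xi, yi in zip(x, y)]

# y inherits the skew of x, yet the residuals stay roughly symmetric.
print(f"estimated slope:       {b_hat:.3f}")
print(f"skewness of y:         {skewness(y):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

In this setup the dependent variable is clearly skewed while the residuals are not, which is the situation where transforming y to "look normal" would be unnecessary.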
You are welcome. A small correction to the first point of my message, which may seem obvious to many but is worth clarifying anyway. Added words are capitalized:

1. Normal sampling distribution: The sample is a random sample, and therefore differences between THE MEANS OF all possible samples OF THE SAME POPULATION follow a normal distribution whose mean tends to coincide with the population mean as sample size increases. As a result, normal significance tests apply.

Hector

-----Original Message-----
From: Jatender Mohal [mailto:[hidden email]]
Sent: Friday, August 25, 2006 1:58 PM
To: 'Hector Maletta'
Subject: RE: Interpretation from transformed variable in ANOVA

Hector,
Thanks for your suggestions.
Kind regards,
Mohal
Or more exactly:
1. Normal sampling distribution: The sample is a random sample, and therefore THE MEANS OF all possible samples OF THE SAME POPULATION follow a normal distribution whose mean tends to coincide with the population mean as sample size increases. As a result, normal significance tests apply.

Hector
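The statement about the means of repeated samples can be checked with a short simulation. This is a rough editorial sketch, not part of the thread: the exponential population, the sample size of 50, the number of samples and the seed are all assumptions chosen for illustration, and `skewness` is a helper defined only for this example.

```python
import random
import statistics


def skewness(data):
    """Third standardized moment (population form) of a list of numbers."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((v - m) / s) ** 3 for v in data) / len(data)


random.seed(1)  # fixed seed so the run is reproducible
sample_size = 50   # observations per sample (hypothetical)
n_samples = 4000   # number of repeated samples (hypothetical)

# Individual observations from a strongly skewed population (mean 1.0).
population_draws = [random.expovariate(1.0) for _ in range(20000)]

# Means of many independent random samples from the same population.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(n_samples)
]

print(f"skewness of raw observations: {skewness(population_draws):.2f}")
print(f"skewness of sample means:     {skewness(sample_means):.2f}")
print(f"mean of sample means:         {statistics.fmean(sample_means):.3f}")
```

The sample means are far more symmetric than the raw observations and centre on the population mean, even though the population itself is skewed. With smaller samples the approximation to normality is weaker.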