Half-normal distributed DV in generalized linear model?

Half-normal distributed DV in generalized linear model?

Kirill Orlov
My dependent variable is, by origin, the absolute residual* left after some
regression; it is distributed half-normally. I have never regressed a DV with
such a distribution. I am considering using a generalized linear model (GENLIN)
to regress it on some predictors (totally different from those which produced
the residuals). What distribution type should I use for the DV? For
continuous data, GENLIN offers Gamma, Inverse Gaussian and Tweedie
distributions. Which should I choose to model the half-normal? Or should I
apply special transforms first? And what link function would be most
appropriate? What can you advise on that? Thanks.

*More precisely, I'm analysing positive residuals separately and negative
residuals separately.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Half-normal distributed DV in generalized linear model?

Art Kendall
I do not have a direct answer but have a gut reaction.

You talk about analyzing subgroups separately.  Why not include a dichotomous predictor and use it and interaction terms involving it in your new analysis?
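
A minimal sketch of this suggestion in Python, on made-up data: rather than fitting positive and negative residuals separately, fit one model with a sign indicator, the diagnosis, and their product. With two binary predictors the model is saturated, so its least-squares coefficients can be read off the four cell means. The data and the names (rows, cell_mean, saturated_coefficients) are hypothetical.

```python
from statistics import mean

# Hypothetical data: (absolute residual, sign indicator, diagnosis present).
# sign = 1 for positive residuals, 0 for negative ones.
rows = [
    (0.4, 0, 0), (0.6, 0, 0), (0.5, 0, 1), (0.7, 0, 1),
    (0.5, 1, 0), (0.7, 1, 0), (1.4, 1, 1), (1.6, 1, 1),
]

def cell_mean(sign, diag):
    """Mean absolute residual in one sign-by-diagnosis cell."""
    return mean(y for y, s, d in rows if s == sign and d == diag)

def saturated_coefficients():
    """OLS coefficients of y ~ sign + diag + sign*diag for binary predictors.

    In a saturated 2x2 design the least-squares fit reproduces the cell
    means exactly, so the coefficients are differences of those means.
    """
    m00, m01 = cell_mean(0, 0), cell_mean(0, 1)
    m10, m11 = cell_mean(1, 0), cell_mean(1, 1)
    return {
        "intercept": m00,
        "sign": m10 - m00,
        # diagnosis effect among negative residuals
        "diag": m01 - m00,
        # additional diagnosis effect among positive residuals
        "sign_x_diag": (m11 - m10) - (m01 - m00),
    }

coef = saturated_coefficients()
print(coef)
```

In this sketch the diagnosis effect for positive residuals is diag + sign_x_diag, which is what a separate fit on the positive half alone would estimate; the interaction term tests whether the two halves really differ.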

Art Kendall
Social Research Consultants

On 11/13/2011 4:30 AM, KO wrote:
> [...]
> *More precisely, I'm analysing positive residuals separately and negative
> residuals separately.

Re: Half-normal distributed DV in generalized linear model?

Rich Ulrich
In reply to this post by Kirill Orlov
It seems to me that if there is going to be much to be found by
reducing large residuals, you must have some large outliers
to start with.  What else motivates this proposed analysis?

 - If there are outliers, "half-normal" will be an understatement
concerning the length of the tail.

 - I would probably start with a simple, direct examination of
outliers.  But you don't mention either the N or the R^2 achieved
by the original analysis, so it is hard to imagine what you are
working with.
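
One way to examine the tail directly, sketched in Python on simulated data (the real residuals are not available here): estimate the scale from the median of the absolute residuals, then compare the observed share of values beyond 3 sigma with the share a half-normal would predict. The function name halfnormal_tail_check, the cutoff k=3 and the 2% contamination are assumptions made for illustration.

```python
import random
from statistics import NormalDist, median

def halfnormal_tail_check(abs_resid, k=3.0):
    """Compare the observed share of |residuals| beyond k*sigma with the
    share a half-normal distribution would predict.

    sigma is estimated robustly from the median: for a half-normal,
    median = sigma * Phi^{-1}(0.75), roughly 0.6745 * sigma.
    """
    sigma = median(abs_resid) / NormalDist().inv_cdf(0.75)
    observed = sum(r > k * sigma for r in abs_resid) / len(abs_resid)
    expected = 2.0 * (1.0 - NormalDist().cdf(k))  # P(|Z| > k)
    return sigma, observed, expected

# Simulated stand-in for real residuals: mostly N(0, 1), plus about 2%
# with five times the spread to mimic outliers.
rng = random.Random(1)
resid = [rng.gauss(0, 5 if rng.random() < 0.02 else 1) for _ in range(20000)]
sigma, observed, expected = halfnormal_tail_check([abs(r) for r in resid])
print(f"sigma={sigma:.3f} observed={observed:.4f} expected={expected:.4f}")
```

An observed share well above the expected one would indicate a tail longer than half-normal.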

--
Rich Ulrich


Re: Half-normal distributed DV in generalized linear model?

Ryan
In reply to this post by Kirill Orlov
You stated:

"...My dependent variable is, by origin, absolute residual* left after some
regression..."

First, please define "some regression." Second, please let us know what your research question(s) is/are with respect to the first regression and the regression on the residuals. Third, please provide more information about your research study in general.

Ryan

Re: Half-normal distributed DV in generalized linear model?

Kirill Orlov
In reply to this post by Kirill Orlov
I will satisfy your curiosity about my study, since you ask.

Children's body weight was regressed on their height in a nonlinear regression minimizing absolute deviations (that is, approximating the conditional median). The body weight was first power-transformed to remove heteroscedasticity, so the residuals are quite homoscedastic and can be directly compared between children of different heights.

Now I want to explore how the residual (i.e. the deviation from normal = median weight) "depends" on a set of various chronic disease diagnoses, these IVs being binary (diagnosis present/absent). Of course, positive deviations and negative deviations of body weight are to be regressed separately (consider, for example, an "endocrine disease" diagnosis: it correlates positively with positive deviations, due to cases of obesity, and it hardly correlates with negative deviations).

So my task is a straightforward multiple regression task. However, the DV in my case is radically skewed, because it is one half of an almost normally distributed variable: it is half-normal, with only a right tail and a cut in place of the left tail. Nonparametric ranking techniques will not do: I don't want any uniform distribution; I cherish the original distribution. I feel like using GENLIN with a Gamma distribution for my half-normal DV, but I'm not sure.
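
A rough way to check the half-normal description against data, sketched in Python on simulated values (not the actual residuals): a half-normal variable has a fixed coefficient of variation, sqrt(pi/2 - 1), about 0.756, whatever its scale. A gamma distribution has CV = 1/sqrt(shape), so the gamma that matches the first two moments has shape near 1.75. The helper names and the simulated scale of 2 are assumptions for illustration only.

```python
import math
import random
from statistics import mean, pstdev

# Theoretical coefficient of variation (sd / mean) of a half-normal
# variable; it does not depend on the scale sigma.
HALFNORMAL_CV = math.sqrt(math.pi / 2.0 - 1.0)

def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return pstdev(values) / mean(values)

def gamma_shape_matching_cv(cv):
    """Shape k of a gamma distribution with the given CV (CV = 1/sqrt(k))."""
    return 1.0 / cv ** 2

# Simulated stand-in for the real data: residuals drawn from N(0, 2^2),
# keeping only the positive half, as described in the post.
rng = random.Random(42)
positive_half = [r for r in (rng.gauss(0, 2) for _ in range(50000)) if r > 0]

cv = coefficient_of_variation(positive_half)
shape = gamma_shape_matching_cv(cv)
print(f"sample CV={cv:.3f}, half-normal CV={HALFNORMAL_CV:.3f}, gamma shape={shape:.2f}")
```

Matching two moments does not make the shapes identical: a gamma density with shape above 1 falls to zero at the origin, whereas the half-normal density is highest there.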

Re: Half-normal distributed DV in generalized linear model?

Rich Ulrich
Thanks for the detail.  It sounds like you have taken care with your
first regression, and your residuals (your new DV) should be
well-behaved -- in the sense that those scores should have the
equal-interval property.  That is the main concern for doing OLS
regression.

There is no requirement for OLS regression that the DV should
be normal; I remind you that the only technical requirements are
on the residual.  The more-or-less ideal case, for easily meeting
requirements, is that both the DV and the IVs are normal.  It sounds
to me as if your new IVs are mainly dichotomies ("various ... diagnoses")
which may be rare, so that your prediction equation is not apt to
result in a normal-shaped set of scores, either.

The purpose of those link functions, as I see it, is to account for some
underlying, unequal intervals expected when those scores were created
by particular generating processes, thus resulting in a particular, natural
shape of a distribution...  which has unequal intervals and errors. 
You do not have that circumstance.  You have an odd-shaped DV, but
you still have equal intervals.

The requirement for proper interpretation of the resulting tests
in regression is that the new residuals should be something like normal,
and it looks to me like you are most likely to get that without any
unusual link function.

 - I consider my comments above to be a generally well-informed
opinion, but not an irrefutable or unalterable one.
 - I'm not an expert on link functions, so I would not mind hearing some
confirmation if someone thinks I'm right.
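
The general point that OLS places its requirement on the residuals rather than on the DV can be illustrated with a small simulation in Python. This is a sketch on made-up data (a rare binary predictor with normal errors), not an analysis of the study's residuals; the function names, the 10% prevalence and the effect size of 3 are assumptions.

```python
import random
from statistics import mean

def skewness(values):
    """Moment-based sample skewness (near 0 for a symmetric distribution)."""
    m = mean(values)
    m2 = mean((v - m) ** 2 for v in values)
    m3 = mean((v - m) ** 3 for v in values)
    return m3 / m2 ** 1.5

def ols_simple(x, y):
    """Intercept and slope of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

# Simulated example: a rare binary diagnosis (about 10% of cases) shifts
# the outcome by 3 units; the errors themselves are normal.
rng = random.Random(7)
x = [1 if rng.random() < 0.10 else 0 for _ in range(20000)]
y = [1.0 + 3.0 * xi + rng.gauss(0, 1) for xi in x]

a, b = ols_simple(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print(f"skewness of DV: {skewness(y):.2f}, of residuals: {skewness(residuals):.2f}")
```

Here the DV is clearly skewed while the residuals are close to symmetric. Whether the same holds for a half-normal DV has to be checked on the residuals of the actual fit.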

--
Rich Ulrich


> Date: Sun, 13 Nov 2011 22:41:24 -0800
> From: [hidden email]
> Subject: Re: Half-normal distributed DV in generalized linear model?
> To: [hidden email]
>
> [...]
>
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Half-normal-distributed-DV-in-generalized-linear-model-tp4988259p4989902.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.