Poisson Variable help

Andrew Lawrence-2
Dear SPSS-listers,

I'm afraid this is a dual question: "what to do" & "how to do it in SPSS
(if possible)". I would greatly appreciate any advice you might have on
the following problem:

I'm dealing with a variable (I shall call X) with a clear Poisson-type
distribution: X is a count of discrete incidences per individual,
observed over a sample of individuals, thus the value of X can be zero
or any positive integer. There are many zeros (~60% of observations are
zeros) and there is an extreme positive skew - the maximum observed X
count is ~150, but only 10% of individuals have X-values >10.

I've explored the idea of transforming the X-data (log, square root or
rank) and found that there is still an uncomfortable deviation from
normality in each of the cases. Throwing out the zero-data helps, but
theoretically this doesn't make a lot of sense and reduces my sample
size by 60%.

I am interested in the relationship between this X variable and a
normally distributed outcome variable (I shall call Y). Eventually I
will want to include some nuisance covariates in the model.

If the X-count variable was Gaussian I would use an OLS regression to
regress X onto Y. I could then conclude whether an increased count of X
is associated with the observed values of Y. However I'm not sure that
this is the correct approach given the non-normality of X. If my
hypothesis was the other way around (i.e. predicting X from Y) I would
run a Poisson regression to assess the relationship, but I need to
predict Y from X, and the SPSS GLM Poisson regression will not run with
the variables entered this way around. When I tried a Poisson regression
with Y as the dependent and X as the predictor I received the error
message: "There are no valid cases. Statistics cannot be computed.
Execution of this command stops."

Does anyone on the list have experience with a similar analysis? I'm a
bit confused as to the best way to approach this. Is the problem
technical, in that SPSS isn't able to run the Poisson regression the
other way round? Or can the Poisson regression be performed one way
around and its answers used to infer the inverse (i.e. knowing what the
coefficients for X on Y controlling for Z are from the coefficients of
Y on X controlling for Z)? A third alternative: is Poisson regression
completely the wrong way to take this analysis? Is there a better way?

Thanks in advance,

~Andrew

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Poisson Variable help

Ryan
Andrew,

Poisson regression, a type of generalized linear model, models
non-negative integers. Cases associated with non-integer values and/or
negative values on the dependent variable should be rejected when
trying to fit a Poisson regression model. As a result, if all of the
cases in your dataset have negative or non-integer values on the
dependent variable, I would expect you to receive the sort of error
message that you received. No restrictions are placed on the
predictors in a generalized linear model.
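
To illustrate the point about valid cases, here is a minimal Python sketch. It is a toy filter written for this example, not SPSS's actual case-screening logic, and the function name and data values are hypothetical. It keeps only values that could serve as a Poisson response (non-negative integers) and shows that a continuous outcome can leave nothing behind.

```python
def valid_poisson_cases(values):
    """Keep only values usable as a Poisson response:
    non-negative integers (0, 1, 2, ...)."""
    return [v for v in values if v >= 0 and float(v).is_integer()]

# Hypothetical data: a count variable X and a continuous outcome Y.
x_counts = [0, 0, 3, 12, 0, 150]
y_continuous = [-1.3, 0.42, 2.75, -0.08, 1.9, 0.61]

# Number of cases that survive the filter for each candidate response.
print(len(valid_poisson_cases(x_counts)))
print(len(valid_poisson_cases(y_continuous)))
```

With X as the response every case survives; with the continuous Y as the response none do, which would be consistent with a "no valid cases" message.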

Ryan

On Wed, Mar 16, 2011 at 2:30 PM, Andrew Lawrence <[hidden email]> wrote:

> --- snip ---

Re: Poisson Variable help

Garry Gelade
In reply to this post by Andrew Lawrence-2
Dear Andrew

You may not need to transform your X variable at all. The distribution of Y
is more important, and in fact the critical assumption of OLS regression is
that the residuals are iid; the distribution of X is irrelevant.

I suggest you run your OLS and check the residuals for normality and
heteroscedasticity. If the residuals look OK, you can conclude you have not
violated the regression assumptions.
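
As a rough illustration of this check, the Python sketch below simulates data and fits the line by hand. Everything in it is an assumption made for the example: the zero-inflated X, the linear model for Y with normal errors, and all parameter values are hypothetical and are not Andrew's data. It shows a heavily skewed predictor sitting alongside roughly symmetric residuals.

```python
import random
import statistics

def skewness(values):
    """Sample skewness (third standardized moment)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line for y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(0)
# Hypothetical predictor: about 60% zeros, the rest a long-tailed count.
x = [0 if rng.random() < 0.6 else 1 + int(rng.expovariate(0.1))
     for _ in range(2000)]
# Assumed model: Y is linear in X with independent normal errors.
y = [50.0 + 0.5 * xi + rng.gauss(0.0, 5.0) for xi in x]

intercept, slope = ols_fit(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"skewness of X:         {skewness(x):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
print(f"estimated slope:       {slope:.3f}")
```

In practice you would also plot the residuals against the fitted values to look for heteroscedasticity; the skewness figure is only a quick numeric summary.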

Garry Gelade
Business Analytic Ltd

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Andrew Lawrence
Sent: 16 March 2011 18:31
To: [hidden email]
Subject: Poisson Variable help

--- snip ---

Re: Poisson Variable help

Andrew Lawrence-2
Dear Garry,

Thanks for your comment, I think you're absolutely right! I've plotted
the residuals and they are well distributed, so the assumptions of
OLS regression are not violated. Now that I have gone back and read the
relevant sections of a textbook, I see that there is no assumption of
conditional normality for X.

It seems strange, though, that the appropriate method for looking at the
relationship between the two variables the opposite way around would be
Poisson regression. I suppose this must be because the questions are
different.

Thanks again,

Andrew



On 17/03/2011 13:18, Garry Gelade wrote:

> --- snip ---

Trouble With Version 19: Cut/Paste

Kreischer,Resha M
Hello,

Is anyone else experiencing issues while cutting and pasting in SPSS version 19? When I select more than a few records to cut and paste, minutes pass before the function works. If I try to cut and paste 100 or more records, everything freezes up and I end up needing to shut down my computer. Cutting and pasting was never an issue in earlier versions.

I am looking forward to hearing if anyone else has this problem or knows how to resolve it.

Thanks,

Resha


Re: Trouble With Version 19: Cut/Paste

Arthur Kramer
I have a similar problem with "search and replace".

Arthur Kramer, Ph.D.

"Believe half of what you see and none of what you hear."

N. Whitfield
B. Strong


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Kreischer,Resha M
Sent: Thursday, March 17, 2011 3:37 PM
To: [hidden email]
Subject: Trouble With Version 19: Cut/Paste

--- snip ---

Re: Poisson Variable help

Bruce Weaver
Administrator
In reply to this post by Andrew Lawrence-2
Just a note on terminology.  Regressing X onto Y means predicting X from Y.  Given that you want to predict Y from X, you are regressing Y onto X.
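
The direction matters numerically as well as verbally. The short Python sketch below uses hypothetical toy data and a hand-written helper to fit both directions by least squares. The two slopes are not reciprocals of each other; their product equals the squared Pearson correlation, so one cannot simply be inverted to get the other unless the correlation is perfect.

```python
import statistics

def slope(response, predictor):
    """Least-squares slope from regressing `response` onto `predictor`."""
    mp, mr = statistics.fmean(predictor), statistics.fmean(response)
    spp = sum((p - mp) ** 2 for p in predictor)
    spr = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    return spr / spp

# Hypothetical toy data.
x = [0, 0, 1, 2, 5, 9]
y = [3.1, 2.8, 3.9, 4.2, 6.0, 7.4]

b_y_on_x = slope(y, x)  # regressing Y onto X: predicts Y from X
b_x_on_y = slope(x, y)  # regressing X onto Y: predicts X from Y

# The product of the two slopes is the squared correlation between X and Y.
print(b_y_on_x, b_x_on_y, b_y_on_x * b_x_on_y)
```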


Andrew Lawrence-2 wrote
--- snip ---

If the X-count variable was Gaussian I would use an OLS regression to
regress X onto Y. I could then conclude whether an increased count of X
is associated with the observed values of Y. However I'm not sure that
this is the correct approach given the non-normality of X. If my
hypothesis was the other way around (i.e. predicting X from Y) I would
run a Poisson regression to assess the relationship, but I need to
predict Y from X, and the SPSS GLM Poisson regression will not run with
the variables entered this way around. When I tried a poisson regression
with Y as the dependent and X as the predictor I received the error
message: "There are no valid cases. Statistics cannot be computed.
Execution of this command stops."

--- snip ---
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).