Dear SPSS-listers,
I'm afraid this is a dual question: "what to do" & "how to do it in SPSS (if possible)". I would greatly appreciate any advice you might have on the following problem:

I'm dealing with a variable (I shall call X) with a clear Poisson-type distribution: X is a count of discrete incidences per individual, observed over a sample of individuals, thus the value of X can be zero or any positive integer. There are many zeros (~60% of observations are zeros) and there is an extreme positive skew - the maximum observed X count is ~150, but only 10% of individuals have X-values >10.

I've explored the idea of transforming the X-data (log, square root or rank) and found that there is still an uncomfortable deviation from normality in each of the cases. Throwing out the zero-data helps, but theoretically this doesn't make a lot of sense and reduces my sample size by 60%.

I am interested in the relationship between this X variable and a normally distributed outcome variable (I shall call Y). Eventually I will want to include some nuisance covariates in the model.

If the X-count variable were Gaussian I would use an OLS regression to regress X onto Y. I could then conclude whether an increased count of X is associated with the observed values of Y. However, I'm not sure that this is the correct approach given the non-normality of X. If my hypothesis were the other way around (i.e. predicting X from Y) I would run a Poisson regression to assess the relationship, but I need to predict Y from X, and the SPSS GLM Poisson regression will not run with the variables entered this way around. When I tried a Poisson regression with Y as the dependent and X as the predictor I received the error message: "There are no valid cases. Statistics cannot be computed. Execution of this command stops."

Does anyone on the list have experience with a similar analysis? I'm a bit confused as to the best way to approach this. Is the problem technical, in that SPSS isn't able to run the Poisson regression the other way round, or is it that the Poisson regression can be performed one way around and its answers inferred regarding the inverse (i.e. knowing what the coefficients for X on Y controlling for Z are from the coefficients of Y on X controlling for Z)? A third alternative: is Poisson regression completely the wrong way to take this analysis? Is there a better way?

Thanks in advance,

~Andrew

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Andrew,
Poisson regression, a type of generalized linear model, models non-negative integers. Cases associated with non-integer values and/or negative values on the dependent variable should be rejected when trying to fit a Poisson regression model. As a result, if all of the cases in your dataset have negative or non-integer values on the dependent variable, I would expect you to receive the sort of error message that you received.

No restrictions are placed on the predictors in a generalized linear model.

Ryan

On Wed, Mar 16, 2011 at 2:30 PM, Andrew Lawrence <[hidden email]> wrote:
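The validity rule described above can be illustrated with a minimal sketch. It is not SPSS's actual implementation, only an illustration under stated assumptions: the data are simulated, and the helper name `is_valid_poisson_response` is hypothetical. It counts how many cases would survive as a Poisson dependent variable when the continuous Y is used, compared with the count X.

```python
import random


def is_valid_poisson_response(value):
    """Return True if value is a non-negative integer, i.e. usable as a count."""
    return value >= 0 and float(value).is_integer()


random.seed(1)

# Hypothetical data: Y is continuous and roughly normal, X is a count
# with about 60% zeros (loosely mimicking the description in the post).
y = [random.gauss(50.0, 10.0) for _ in range(200)]
x = [0 if random.random() < 0.6 else random.randint(1, 150) for _ in range(200)]

# Cases that remain when each variable is used as the Poisson response.
valid_with_y_as_dv = [v for v in y if is_valid_poisson_response(v)]
valid_with_x_as_dv = [v for v in x if is_valid_poisson_response(v)]

print("valid cases with Y as dependent:", len(valid_with_y_as_dv))
print("valid cases with X as dependent:", len(valid_with_x_as_dv))
```

With a continuous Y, essentially every case fails the check, which is consistent with a "no valid cases" message; with the count X as the response, every case passes.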
In reply to this post by Andrew Lawrence-2
Dear Andrew
You may not need to transform your X variable at all. The distribution of Y is more important, and in fact the critical assumption of OLS regression is that the residuals are iid; the distribution of X is irrelevant.

I suggest you run your OLS and check the residuals for normality and heteroscedasticity. If the residuals look OK, you can conclude you have not violated the regression assumptions.

Garry Gelade
Business Aanalytic Ltd

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Andrew Lawrence
Sent: 16 March 2011 18:31
To: [hidden email]
Subject: Poisson Variable help
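The suggestion above (fit the OLS model, then inspect the residuals) can be sketched as follows. This is a hedged illustration on simulated data, not the poster's dataset: the zero-inflated count predictor, the true intercept and slope, and all variable names are assumptions made for the example. The residual checks are deliberately crude (skewness, and residual spread for X = 0 versus X > 0) and stand in for the plots one would normally inspect.

```python
import math
import random

random.seed(42)
n = 2000

# Hypothetical predictor: about 60% zeros, remainder right-skewed counts.
x = [0 if random.random() < 0.6 else 1 + int(random.expovariate(1 / 8.0))
     for _ in range(n)]

# Hypothetical outcome: linear in X with independent Gaussian errors.
true_intercept, true_slope = 10.0, 0.5
y = [true_intercept + true_slope * xi + random.gauss(0.0, 1.0) for xi in x]

# Simple OLS of Y on X using the closed-form estimates.
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def sd(values):
    """Population standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


# Normality check (crude): residual skewness should be near zero.
resid_mean = sum(residuals) / n
resid_sd = sd(residuals)
resid_skew = sum((r - resid_mean) ** 3 for r in residuals) / n / resid_sd ** 3

# Heteroscedasticity check (crude): compare residual spread across X groups.
sd_zero = sd([r for r, xi in zip(residuals, x) if xi == 0])
sd_pos = sd([r for r, xi in zip(residuals, x) if xi > 0])

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("residual skewness:", round(resid_skew, 3))
print("residual SD for X = 0 vs X > 0:", round(sd_zero, 3), round(sd_pos, 3))
```

In this simulation X is heavily skewed, yet the slope is recovered and the residuals behave well, because the errors, not the predictor, carry the distributional assumption. Real data may of course behave differently, which is why the residual check matters.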
Dear Garry,
Thanks for your comment, I think you're absolutely right! I've plotted the residuals and they are well distributed, so the assumptions of OLS regression are not violated. Now that I have gone back and read the relevant sections of a textbook, I see that there is no assumption of conditional normality for X.

It seems strange, though, that the appropriate method when looking at the relationship between the two variables the opposite way around would be Poisson regression. I suppose this must be because the questions are different.

Thanks again,

Andrew

On 17/03/2011 13:18, Garry Gelade wrote:
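One way to see why the two directions are different questions is to write out the two models side by side. This is a sketch under assumed notation: the symbols $\beta_0, \beta_1, \sigma^2, \gamma_0, \gamma_1, \mu_i$ do not appear in the thread, and covariates are omitted for brevity.

```latex
% OLS of Y on X: distributional assumption on Y given X only
Y_i \mid X_i \sim \mathcal{N}\!\left(\beta_0 + \beta_1 X_i,\; \sigma^2\right)

% Poisson regression of X on Y (log link): distributional assumption on X given Y only
X_i \mid Y_i \sim \mathrm{Poisson}(\mu_i), \qquad \log \mu_i = \gamma_0 + \gamma_1 Y_i
```

Each model conditions on its predictor and places a distributional assumption only on its response. The coefficient $\beta_1$ describes the change in the mean of Y per unit of X, while $\gamma_1$ describes the change in the log of the expected count per unit of Y; in general, one cannot simply be read off from the other.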
Hello,
Is anyone else experiencing issues while cutting and pasting in SPSS version 19? When I select more than a few records to cut and paste, minutes will pass by before the function works. If I try to cut and paste 100 or more records, everything freezes up and I end up needing to shut down my computer. Cutting and pasting was never an issue in earlier versions.

I am looking forward to hearing if anyone else has this problem or knows how to resolve it.

Thanks,
Resha
I have a similar problem with "search and replace"
Arthur Kramer, Ph.D.
"Believe half of what you see and none of what you hear." N. Whitfield B. Strong

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Kreischer,Resha M
Sent: Thursday, March 17, 2011 3:37 PM
To: [hidden email]
Subject: Trouble With Version 19: Cut/Paste
In reply to this post by Andrew Lawrence-2
Just a note on terminology. Regressing X onto Y means predicting X from Y. Given that you want to predict Y from X, you are regressing Y onto X.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).