Dear List,
Many thanks.
Kathryn,
Lots could be said, but before we get too far down this road, exactly what is your response variable (e.g., # of times or days purged), and is there an absolute upper limit? Also, what does the shape of the distribution look like for the non-zero values? Another question: could it be argued that everybody in your sample is "at risk" of scoring a 1 or higher?
Ryan
Kathryn,
If you take the mean of 3 questions, wouldn't it most certainly be possible to end up with non-integer values? I do not see how a count regression model, Poisson or negative binomial, would be appropriate here.
By the way, although you stated that "the distributions are be shown below," I do not see any illustration. Anyway, let's assume for the moment that you do in fact have count data for your dependent variable that could range from 0 to positive infinity.
As I'm sure you are aware, a Poisson regression assumes that the variance is equal to the expected value. Often, this assumption is not tenable; that is, the variance exceeds the expected value (overdispersion). Of course, there is the possibility of underdispersion as well. Determining the cause of overdispersion or underdispersion is not always straightforward, and there are several ways to account for it, one of which is to fit a model that relaxes the assumption that the mean equals the variance (e.g., a negative binomial model).
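As a rough first look at the mean-equals-variance assumption, one can compare the sample variance with the sample mean of the raw counts. The sketch below uses made-up counts (not data from this thread) and only the Python standard library. It ignores covariates, so it is a screening step rather than a test; in a fitted model one would look at the Pearson chi-square or deviance divided by its degrees of freedom instead.

```python
from statistics import mean, variance

# Hypothetical counts, invented for illustration (not data from this thread)
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 12]

m = mean(counts)
v = variance(counts)  # sample variance (n - 1 in the denominator)
ratio = v / m         # roughly 1 under a Poisson model with no covariates

print(f"mean = {m:.2f}, variance = {v:.2f}, variance/mean = {ratio:.2f}")
if ratio > 1:
    print("Variance exceeds the mean: possible overdispersion.")
elif ratio < 1:
    print("Variance is below the mean: possible underdispersion.")
```

A ratio well above 1 is only suggestive, since unmodelled predictors can produce the same pattern.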
One could fit a standard Poisson regression with the scale parameter fixed to 1.0 via GENLIN, and then fit a negative binomial model in which the dispersion (ancillary) parameter is freely estimated. In addition to examining the estimated dispersion parameter, since the Poisson regression is nested in the negative binomial regression, one could also construct a likelihood ratio test. There are other options in SPSS that I'll skip over for the moment (e.g., fitting an overdispersed Poisson regression model: specifying a Poisson regression in GENLIN but then allowing the scale parameter to be estimated).
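To make the likelihood ratio idea concrete, here is a minimal sketch for the intercept-only case, using made-up counts and only the Python standard library. It assumes the NB2 parameterisation (variance = mu + alpha * mu^2) and finds alpha by a crude grid search, which is my own shortcut and not what GENLIN does internally. Because alpha = 0 lies on the boundary of the parameter space, the sketch halves the chi-square(1) tail probability, which is the usual adjustment. With covariates, one would instead take the two log-likelihoods from the fitted models' output.

```python
import math

# Hypothetical counts, invented for illustration (not data from this thread)
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 12]
n = len(counts)
mu = sum(counts) / n  # MLE of the mean in both intercept-only models


def poisson_loglik(ys, mu):
    """Poisson log-likelihood at mean mu."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in ys)


def negbin_loglik(ys, mu, alpha):
    """NB2 log-likelihood: variance = mu + alpha * mu**2."""
    r = 1.0 / alpha
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu))
        for y in ys
    )


# Crude grid search for alpha on a log scale, from 1e-4 to 100
alphas = [10 ** (k / 100) for k in range(-400, 201)]
alpha_hat = max(alphas, key=lambda a: negbin_loglik(counts, mu, a))

ll_pois = poisson_loglik(counts, mu)
ll_nb = negbin_loglik(counts, mu, alpha_hat)

# Likelihood ratio statistic; the null (alpha = 0) is on the boundary,
# so the chi-square(1) tail probability is halved.
lr = max(0.0, 2.0 * (ll_nb - ll_pois))
p_value = 0.5 * math.erfc(math.sqrt(lr / 2.0))

print(f"alpha_hat = {alpha_hat:.3f}, LR = {lr:.2f}, p = {p_value:.2g}")
```

A small p-value favours the negative binomial over the Poisson; it says nothing about whether either model handles the zeros or the upper limit adequately.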
There are still many questions that remain unanswered. First and foremost, do you actually have count data? Second, what do those distributions look like? If you have a spike at zero for both, with right skew for the positive values, then you might consider a zero-inflated model. Unfortunately, as far as I'm aware, SPSS is not capable of fitting zero-inflated Poisson or negative binomial models. Third, it sounds like there is an absolute upper limit for your dependent variables. This is not consistent with the Poisson or NB distributions, which assign a non-zero probability to values greater than the actual maximum. Fourth, from your description, I wouldn't be surprised if the two dependent variables are correlated. Fitting a single model that allows for correlation between the dependent variables might be considered.
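Two of the points above can be checked informally with a few lines of code: whether there are more zeros than a Poisson model with the same mean would predict, and how much probability that model places above the maximum possible score. The sketch below uses made-up counts and a hypothetical `upper_limit`; neither comes from this thread. Note that an excess of zeros relative to the Poisson can arise from overdispersion alone, so this does not by itself establish zero-inflation.

```python
import math

# Hypothetical counts, invented for illustration (not data from this thread)
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 12]
upper_limit = 12  # hypothetical maximum possible score

n = len(counts)
mu = sum(counts) / n


def poisson_pmf(k, mu):
    """Poisson probability of observing exactly k events."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


# Share of zeros observed versus share expected under Poisson(mu)
observed_zero_share = counts.count(0) / n
expected_zero_share = poisson_pmf(0, mu)  # equals exp(-mu)

# Probability mass the Poisson places above the maximum possible score
prob_above_limit = 1.0 - sum(poisson_pmf(k, mu) for k in range(upper_limit + 1))

print(f"zeros: observed {observed_zero_share:.3f}, "
      f"Poisson-expected {expected_zero_share:.3f}")
print(f"P(Y > {upper_limit}) under Poisson = {prob_above_limit:.2e}")
```

How much the mass above the limit matters depends on how close the fitted means are to that limit; when the means are far below it, the practical effect may be small.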
Ryan

On Mon, Nov 21, 2011 at 12:33 PM, Kathryn Gardner <[hidden email]> wrote:
Sorry, I meant the sum of the items, not the mean, so both variables
include only integers. Both distributions have a spike at zero with
right skew for the positive values, which is why I thought I needed a
zero-inflated model, but you have confirmed my suspicion that this is
not available in SPSS. That said, there probably is an absolute upper
limit for the dependent variable given the time frame of the past 3
months (there is probably a maximum number of times a person could
engage in these behaviours, even at the extreme end). In that case, it
seems NBR isn't the right model. I hadn't considered this. I did want to
run OLS regression, but the variables were so skewed and overdispersed,
with lots of zeros, that I had to look into other procedures. I did try
transforming both variables, but the transformations did not correct the
problem.
Kathryn

Date: Fri, 25 Nov 2011 07:58:55 -0500
From: [hidden email]
Subject: Re: Negative Binomial Regression
To: [hidden email]