Re: st: suggested references about the variables to include in zero-inflated portion of zinb?


SR Millis-3
Hi, Tim,

Count data are data that record the number of certain events or characteristics (e.g., number of injuries, infections, or hospitalizations). Often there is an underlying time scale, so the analysis essentially becomes an analysis of rates.

I'm not familiar with your K6 scale. Perhaps it can be modeled as count data, i.e., as the number of psychological distress items. However, I wouldn't make that determination on the basis of the distribution in your particular sample, because it may simply reflect the peculiarities of your sample and not the nature of the construct that the K6 measures.

In addition, the decision to use a zero-inflated model or a negative binomial model instead of a Poisson model should be based on a finding that the Poisson assumption variance(Y)=mean(Y) has been violated, not on visual inspection of the data. Further, because the Poisson model is nested within the negative binomial model, a test that the overdispersion parameter equals zero (reported as the test of alpha=0 in the -nbreg- output) checks the Poisson hypothesis under a negative binomial fit. The Vuong test can then be used to determine whether a zero-inflated model is warranted.
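
To make the variance(Y)=mean(Y) idea concrete, here is a minimal sketch in Python (standard library only) rather than Stata. It computes the raw variance-to-mean ratio of a set of scores, which should be near 1 for Poisson-like data. The scores are invented for illustration and are not K6 data, and `dispersion_ratio` is a hypothetical helper name. This is only a crude, unconditional check: the Poisson assumption is about the variance given the covariates, so the formal test after fitting the model is still the one to report.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; roughly 1 for Poisson-like data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return variance(counts) / m


# Hypothetical scores, invented for illustration only (not the K6 data).
scores = [0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 12, 20]

print(f"mean = {mean(scores):.2f}, variance = {variance(scores):.2f}")
print(f"variance/mean = {dispersion_ratio(scores):.2f}")
```

A ratio well above 1 suggests overdispersion, but it ignores the predictors entirely, so treat it as a first look and not as a substitute for the model-based test.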

If the K6 scale can't reasonably be seen as count data, then rather than immediately transforming variables or going to some exotic model or link function, I would initially fit the simple OLS multiple regression model and examine the residuals. As you know, the assumption regarding normality refers to the residuals, NOT to the response or the predictor variables/covariates. If the residuals are reasonably well-behaved (e.g., not heteroscedastic), then the OLS model is likely appropriate.
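
The fit-then-inspect-residuals idea can be sketched in a few lines. The Python below (standard library only) is an assumption-laden illustration, not a recipe: it uses a single predictor instead of a multiple regression, invented data, and hypothetical helper names (`fit_line`, `residuals`, `spread_by_half`). It fits the line by least squares, computes the residuals, and compares their spread in the lower and upper halves of the predictor as a crude look at heteroscedasticity.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def residuals(x, y):
    """Observed y minus fitted y for the least-squares line."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def spread_by_half(x, res):
    """Residual SD in the lower and upper halves of x (crude spread comparison)."""
    ordered = [r for _, r in sorted(zip(x, res))]
    half = len(ordered) // 2
    return pstdev(ordered[:half]), pstdev(ordered[half:])


# Hypothetical predictor and distress scores, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 3, 3, 5, 4, 8, 3, 11, 6, 14]

res = residuals(x, y)
low_sd, high_sd = spread_by_half(x, res)
print(f"residual SD, lower half of x: {low_sd:.2f}")
print(f"residual SD, upper half of x: {high_sd:.2f}")
```

If the two spreads differ a lot, that is a hint of heteroscedasticity worth following up with a proper residual plot and a formal test.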

If there's a problem with the residuals, then you'll need to dig deeper. Heteroscedasticity? Perhaps Box-Cox regression is warranted. Nonlinearity? Sometimes restricted cubic splines are helpful, with the number of knots determined by the strength of the relationship between each predictor and the response variable, as measured by Spearman's rho^2. Frank Harrell (Regression Modeling Strategies) provides a nice discussion and demonstration of this method.
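
For reference, the transform underlying Box-Cox regression is y(lambda) = (y^lambda - 1)/lambda for lambda not equal to 0, and ln(y) for lambda = 0. A minimal Python sketch follows; the function name `box_cox` is hypothetical, and the code only applies the transform for a given lambda (it does not estimate lambda, which is what the regression procedure does). Note that the transform requires strictly positive values, so a score that can be 0 would need to be shifted first; the size of that shift is a judgment call not addressed here.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transform of a single strictly positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)  # limiting case as lambda approaches 0
    return (y ** lam - 1) / lam


# lambda = 1 leaves the shape unchanged (a shift by 1); lambda = 0 is the log.
for lam in (1, 0.5, 0):
    print(lam, [round(box_cox(v, lam), 3) for v in (1, 4, 9)])
```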

Hope this helps,
Scott

Scott R Millis, PhD, MEd, ABPP (CN,CL,RP), CStat
Professor & Director of Research
Dept of Physical Medicine & Rehabilitation
Wayne State University School of Medicine
261 Mack Blvd
Detroit, MI 48201
Email:  [hidden email]
Tel: 313-993-8085
Fax: 313-966-7682


--- On Sun, 10/26/08, Tim Hale <[hidden email]> wrote:

> From: Tim Hale <[hidden email]>
> Subject: Re: st: suggested references about the variables to include in zero-inflated portion of zinb?
> To: [hidden email]
> Date: Sunday, October 26, 2008, 9:28 AM
> Scott,
>
> Actually, I haven't thought of this type of measure as a count variable. Originally we used OLS, but a reviewer suggested that we should transform the psychological distress score to correct for skew and, if the skew is not corrected, try nbreg or zinb (if the Vuong test shows a zero-inflated model to fit best). I tried correcting for the skew, but sktest, swilk, and sfrancia indicate the distribution remains skewed. So, following the reviewer's suggestion, we are trying zinb.
>
> I have found a few papers where others have treated the CES-D as a count variable. We are using a very similar measure, the K6 psychological distress scale.
>
> Do you have any recommendations about how to handle the reviewer's suggestions? Perhaps it is still appropriate to use the log of psychological distress rather than treating psychological distress as a count variable? The results from OLS using the natural log of psychological distress are very similar to those from nbreg.
>
> Thank you for your help,
> Tim
>
> On Oct 25, 2008, at 1:33 AM, statalist-digest wrote:
>
> > Date: Fri, 24 Oct 2008 07:23:58 -0700 (PDT)
> > From: SR Millis <[hidden email]>
> > Subject: Re: st: suggested references about the variables to include in zero-inflated portion of zinb?
> >
> > Tim,
> >
> > What is your basis for assuming that your response variable (psychological distress) is a count variable?
> >
> >
> >
> > Scott R Millis, PhD, MEd, ABPP (CN,CL,RP), CStat
> > Professor & Director of Research
> > Dept of Physical Medicine & Rehabilitation
> > Wayne State University School of Medicine
> > 261 Mack Blvd
> > Detroit, MI 48201
> > Email:  [hidden email]
> > Tel: 313-993-8085
> > Fax: 313-966-7682
> >
> >
> > --- On Fri, 10/24/08, Tim Hale <[hidden email]> wrote:
> >
> >> From: Tim Hale <[hidden email]>
> >> Subject: st: suggested references about the variables to include in zero-inflated portion of zinb?
> >> To: [hidden email]
> >> Date: Friday, October 24, 2008, 1:18 AM
> >> I am using zinb to estimate level of psychological distress (scores range from 0-24) using various demographic variables and measures of use of the Internet. I've used -countfit- to compare various count models, and the results support zinb as the best-fitting model.
> >>
> >> I am uncertain, however, about how to justify the variables that I include in the zero-inflated part of the model. I've read journal articles that have used zinb, read the book by Freese and Long, and searched the Internet and Statalist, but I have not been able to find any detailed recommendations or procedures. Can anyone suggest any other sources (books or journals) that provide an explanation or a good example of this process?
> >>
> >> Ideally I would like to find a good source that I can cite in the paper -- but I appreciate any suggestions about this you might have.
> >>
> >> Thanks for your help,
> >> Tim
> >
