Hi, Tim,
Count data are data that measure the number of certain events or characteristics (e.g., # injuries, # infections, # hospitalizations, etc.). Often there is an underlying time scale, so the analysis essentially becomes an analysis of rates. I'm not familiar with your K6 scale; perhaps it can be modeled as count data, i.e., # psychological distress items. However, I wouldn't make that determination on the basis of the distribution in your particular sample, because that distribution may simply reflect the peculiarities of your sample and not the nature of the construct that the K6 measures.

In addition, the decision to use a zero-inflated model or a negative binomial model instead of a Poisson model should be based on the finding that the Poisson assumption of variance(Y)=mean has been violated, not on visual inspection of the data. Further, because the Poisson model is nested within the negative binomial model, a test of r=0 (r being the negative binomial dispersion parameter) would verify the Poisson hypothesis under a NB model fit. The Vuong test can be used to determine whether using a zero-inflated model is warranted.

If the K6 scale can't reasonably be seen as count data, then rather than immediately transforming variables or going to some exotic model/link function, I would initially fit the simple OLS multiple regression model and examine the residuals. As you know, the assumption regarding normality refers to the residuals, NOT the response or predictor variables/covariates. If the residuals are reasonably well-behaved (e.g., not heteroscedastic), then the OLS model is likely appropriate.

If there's a problem with the residuals, then you'll need to dig deeper. Heteroscedasticity? Perhaps Box-Cox regression is warranted. Nonlinearity? Sometimes restricted cubic splines are helpful, with the number of knots determined by the strength of the relationship between the predictors and the response variable, as gauged by Spearman r^2.
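As a rough illustration of the variance(Y)=mean assumption discussed above, here is a minimal sketch in Python (not Stata) that computes the ratio of the sample variance to the sample mean. The function name `dispersion_ratio` and the example scores are hypothetical and invented for illustration; they are not K6 data. This is only the unconditional, no-covariate version of the check; with predictors in the model, the comparison would be made on the fitted model (for example, via the test of the dispersion parameter in the NB fit), which this sketch does not attempt.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return sample variance / sample mean (the index of dispersion).

    Under a Poisson distribution with no covariates this ratio should be
    close to 1; values well above 1 suggest overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; dispersion ratio is undefined")
    return variance(counts) / m


# Hypothetical scores, invented purely for illustration.
example_scores = [0, 0, 0, 1, 1, 2, 3, 5, 8, 13]
ratio = dispersion_ratio(example_scores)
print(f"mean = {mean(example_scores):.2f}, dispersion ratio = {ratio:.2f}")
```

A ratio far from 1 in a sketch like this is only a prompt to run the formal model-based test, not a substitute for it.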
Frank Harrell (Regression Modeling Strategies) provides a nice discussion and demonstration of the restricted cubic spline method.

Hope this helps,
Scott

Scott R Millis, PhD, MEd, ABPP (CN,CL,RP), CStat
Professor & Director of Research
Dept of Physical Medicine & Rehabilitation
Wayne State University School of Medicine
261 Mack Blvd
Detroit, MI 48201
Email: [hidden email]
Tel: 313-993-8085
Fax: 313-966-7682

--- On Sun, 10/26/08, Tim Hale <[hidden email]> wrote:

> From: Tim Hale <[hidden email]>
> Subject: Re: st: suggested references about the variables to include in zero-inflated portion of zinb?
> To: [hidden email]
> Date: Sunday, October 26, 2008, 9:28 AM
>
> Scott,
>
> Actually, I haven't thought of this type of measure to be a count variable. Originally we used OLS but a reviewer suggested that we should transform the psychological distress score to correct for skew and if the skew is not corrected then we should try nbreg or zinb (if the Vuong test shows a zero-inflated model to fit best). I tried correcting for the skew but sktest, swilk, and sfrancia indicate the distribution remains skewed. So, following the reviewer's suggestion, we are trying zinb.
>
> I have found a few papers where others have treated the CES-D as a count variable. We are using a very similar measure, the K6 psychological distress scale.
>
> Do you have any recommendations about how to handle the reviewer's suggestions? Perhaps it is still appropriate to use the log-psychological distress rather than treating psychological distress as a count variable? The results between using OLS and the natural log of psychological distress are very similar to nbreg.
>
> Thank you for your help,
> Tim
>
> On Oct 25, 2008, at 1:33 AM, statalist-digest wrote:
>
> > Date: Fri, 24 Oct 2008 07:23:58 -0700 (PDT)
> > From: SR Millis <[hidden email]>
> > Subject: Re: st: suggested references about the variables to include in zero-inflated portion of zinb?
> >
> > Tim,
> >
> > What is your basis for assuming that your response variable (psychological distress) is a count variable?
> >
> > Scott R Millis, PhD, MEd, ABPP (CN,CL,RP), CStat
> > Professor & Director of Research
> > Dept of Physical Medicine & Rehabilitation
> > Wayne State University School of Medicine
> > 261 Mack Blvd
> > Detroit, MI 48201
> > Email: [hidden email]
> > Tel: 313-993-8085
> > Fax: 313-966-7682
> >
> > --- On Fri, 10/24/08, Tim Hale <[hidden email]> wrote:
> >
> >> From: Tim Hale <[hidden email]>
> >> Subject: st: suggested references about the variables to include in zero-inflated portion of zinb?
> >> To: [hidden email]
> >> Date: Friday, October 24, 2008, 1:18 AM
> >>
> >> I am using zinb to estimate level of psychological distress (scores range from 0-24) using various demographic variables and measures of use of the Internet. I've used -countfit- to compare various count models and the results support zinb as the best fitting model.
> >>
> >> I am uncertain, however, about how to justify the variables that I include in the zero-inflated part of the model. I've read journal articles that have used zinb, read the book by Freese and Long, and searched the Internet and Statalist but I have not been able to find any detailed recommendations or procedures. Can anyone suggest any other sources (books or journals) that provide an explanation or a good example of this process?
> >>
> >> Ideally I would like to find a good source that I can cite in the paper -- but I appreciate any suggestions about this you might have.
> >>
> >> Thanks for your help,
> >> Tim
