Different results in NB reg and Poisson reg

Awan
Hi

I have a question about the differences between negative binomial regression and Poisson regression. In my dataset the mean is not equal to the variance, so I decided to run an NB regression. But contrary to what I have heard about both models giving the same results, my models are completely different from each other. All of the coefficients are significant in the Poisson regression, but they are not in the negative binomial regression. I have heard that these two models differ only technically and not in their results: some values might differ, but the conclusions are the same. My models' results, however, are different. Could you please give me some advice on this issue?
Awan

Re: different result in two nb reg and pisson reg

Andy W
The coefficient estimates are not guaranteed to be exactly the same - but in practice they are typically quite close. (The mean function is the same for both equations, but the different variance functions basically weight observations differently.) I suspect the smaller the sample sizes the larger differences might be expected - especially if you have high leverage values in small samples.

The standard errors for negative binomial models will always be larger than for the equivalent Poisson model. So going from statistically significant to not statistically significant is always a possibility. If you post the model coefficients and their standard errors it would be easier to give armchair advice.
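To make the variance-function point concrete, here is a minimal sketch in Python (not SPSS syntax). It assumes the common NB2 form Var(Y) = mu + alpha*mu^2 for the negative binomial; the function names, the alpha value and the means are illustrative assumptions, not values from any posted output.

```python
import math

def poisson_variance(mu):
    # Poisson: the variance equals the mean.
    return mu

def negbin_variance(mu, alpha):
    # NB2 form (assumed here): variance grows quadratically with the mean.
    return mu + alpha * mu ** 2

def se_inflation(mu, alpha):
    # Square root of the variance ratio. In an intercept-only model with
    # alpha held fixed, this is the factor by which the standard error of
    # the log-mean grows when moving from Poisson to negative binomial.
    return math.sqrt(negbin_variance(mu, alpha) / poisson_variance(mu))

for mu in (1, 10, 100):
    print(mu, round(se_inflation(mu, alpha=1.0), 2))
```

With alpha held fixed the ratio grows with the mean, which is one reason the gap between Poisson and negative binomial standard errors can be large when the outcome is measured on a large scale.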
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: different result in two nb reg and pisson reg

Awan
Dear Andy,

Thanks for your advice. I have posted my output comparing the results of the two models in this attachment.
Given that the mean is not equal to the variance in my dataset, do you think I am obliged to use NB regression?
I should note that my sample is 450 respondents.

I really appreciate it.


comapring_pisson_and_Nb_reg.spv

Re: different result in two nb reg and pisson reg

Andy W
The increases in the standard errors are larger than I might guess, but off-hand they don't suggest anything inherently strange going on in the estimated models.

A few notes:

 - You have warnings about convergence for both models in the Model Effects table.
 - The intercept for all of the models shown is really large. The variable "time.travel" is coded in time format, so I suspect SPSS is predicting time travel in seconds. May be best to physically change these to integers representing minutes. (This might also help SPSS converge to a solution in the model effects tables)
 - In the negative binomial model you restrict the dispersion parameter to be equal to 1. In most situations I would imagine you want to estimate it from the data, e.g. "DISTRIBUTION=NEGBIN(MLE)"

More generally:

 I can't tell from this data whether you should be fitting a Poisson model at all! It is quite possible a better fitting model is to simply take the log of minutes - if time.travel needs to be transformed at all. You have a restricted range of time.travel from 5 minutes to 4.5 hours, suggesting that a Poisson or Neg Bin model may substantially deviate from the observations -- especially in the tails. [Both will assign mass to the [0 to <= 5] minute range, but with a large mean estimate the mass will be very small.]
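The bracketed remark about mass in the 0 to 5 minute range can be checked numerically. This is a rough sketch in Python, assuming a hypothetical mean of 30 minutes (the actual mean is not given in the thread); `poisson_cdf` is an illustrative helper, not an SPSS function.

```python
import math

def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); each term is computed in log space."""
    return sum(
        math.exp(-mu + i * math.log(mu) - math.lgamma(i + 1))
        for i in range(k + 1)
    )

assumed_mean_minutes = 30  # hypothetical value, for illustration only
mass_at_or_below_5 = poisson_cdf(5, assumed_mean_minutes)
print(mass_at_or_below_5)
```

The mass is positive but tiny for a mean this large, so the absence of trips under 5 minutes would not, on its own, show up as a visible misfit.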

Try looking at a histogram of travel time and see if you can draw a reasonable fit line for a Poisson or a Negative Binomial model over top of it. [Given no times are below 5 minutes I'm tempted to suggest looking at censored models, but it is speculation given the limited info. you've provided.]

The variance in the DESCRIPTIVES is somewhat misleading for the same time format reason I previously stated. You also have no point mass at 0, so there is nothing wrong off hand in taking the logs (if the log of time travel theoretically makes sense.)

More could be said about the models estimated (e.g. it appears you use SPLIT FILE, but you may consider stacking the models and estimating them + interactions all at once). [Also, the nature of the categories of time and how they relate to one another isn't clear.] But hopefully this is enough to chew on for a bit ;)
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: different result in two nb reg and pisson reg

Rich Ulrich
I have always been most satisfied with a transformation or changed
metric when it is motivated by the natural units of what is being measured.

"time.travel" is the variable for which Andy suggests taking the log.  Log is
sometimes okay for time, but for a fixed distance, the reciprocal of travel time is
"travel speed".   The reciprocal will be a bit stronger than taking the log.

If the distances vary, then maybe this is the odd instance where negative
binomial could be justified.  I don't know what fixing the dispersion to 1.0 does.

--
Rich Ulrich

> Date: Mon, 15 Sep 2014 10:36:12 -0700

> From: [hidden email]
> Subject: Re: different result in two nb reg and pisson reg
> To: [hidden email]
>
> The increases in the standard errors are larger than I might guess, but
> off-hand they don't suggest anything inherently strange going on in the
> estimated models.

Re: different result in two nb reg and pisson reg

Andy W
I'm not quite sure what "The reciprocal will be a bit stronger than taking the log" means. I suggested log because it results in a similar model to that of the Poisson regression i.e. for Poisson regression the equation is the log of the expected value of Y:

log(E[Y|X]) = B*X

Whereas for a regression with the dependent variable logged it is the expected value of the transformed variable:

E[log(Y)|X] = B*X

Unfortunately these won't result in the same estimates -- but I hope people see the resemblance!
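A small numeric check of why the two equations differ, as a sketch in Python: the log of a mean is not the mean of the logs (Jensen's inequality). The travel times below are made-up values for illustration, not data from the thread.

```python
import math
from statistics import mean

# Hypothetical travel times in minutes (illustrative only).
travel_minutes = [5, 10, 15, 20, 30, 45, 60, 90, 270]

# Poisson-style target: log of the expected value.
log_of_mean = math.log(mean(travel_minutes))

# Logged-outcome regression target: expected value of the log.
mean_of_log = mean(math.log(y) for y in travel_minutes)

print(log_of_mean, mean_of_log)
```

Because the log is concave, the mean of the logs is never larger than the log of the mean, so intercepts (and possibly slopes) from the two approaches need not agree even on the same data.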

Like I said it is a bit arm-chair to be suggesting models without more knowledge (I doubt someone is travelling the same distance for times ranging between 5 minutes and 4.5 hours though! The variables clearly suggest we are dealing with humans, not animal measurements). Typically the researcher can specify a generic theory of the form Y = f(X) - that is the outcome is some function of a set of explanatory variables. The form of the function can be guided by prior research and/or purely data driven curve fitting.

Poisson and negative binomial models are standard fare for models of counts and rates -- especially when they have low counts. These models assign positive probability to every non-negative integer, including zero, though, hence my hesitation about their appropriateness in this situation, in which there are no zero values of travel times.  If this was say a survey and it asked "For the last 10 hours you have driven, how many minutes were travelling to work?" then this is likely not appropriate for Poisson, as it has a cap at 10 hours (and I would expect many people to cap out near the 10 hour mark). If it was a survey and asked "In the past week, how many minutes have you spent travelling to work?" - this could plausibly be appropriate for a Poisson model. (The choice between Poisson and Neg. Bin. is typically data driven.)
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: different result in two nb reg and pisson reg

Awan
In reply to this post by Rich Ulrich
Dear Rich,

Thanks for your advice. I used the log for both models (Poisson and NB regression). Also, as you and Andy suggested, I transformed the travel time variable to seconds, but the results are no different from before: the two models' results are still not the same at all, and the standard errors in the negative binomial model are still much larger than in the Poisson model, as in the earlier results.
I confess I am not a professional in SPSS and am not familiar with many of the special terms you used, such as reciprocal, etc.

Furthermore, I should mention that I want to know which factors affect time spent travelling for work, leisure and shopping trips, and I do not have any data on distance. For example, women spend less time travelling to work than men, etc.
Also, I drew a histogram as Andy suggested, but I cannot distinguish which model fits better.

Thanks,

Re: different result in two nb reg and pisson reg

Awan
In reply to this post by Andy W

Thanks for your advice, Andy.
As I wrote in my previous comment, I want to know which factors affect time spent travelling for work, leisure and shopping trips, and I do not have any data on distance. For example, women spend less time travelling to work than men, etc.

450 respondents were asked how long each of their trips takes; the trips include work, leisure and shopping. Moreover, they were asked to fill in a travel diary for a day on which they travelled. Each person has at least 2 trips per day: going to a destination (e.g. work) and coming back home are two trips. Each trip may take from 1 minute to .... No one in the sample spent less than 5 minutes on a trip.
Given these details, would you please advise me which model is better?

Also, I drew a histogram as you suggested, but I cannot distinguish which model fits better.

Re: different result in two nb reg and pisson reg

Andy W
Can you post the histograms of the original travel.time variable? (Not transformed)
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: different result in two nb reg and pisson reg

Awan
comapring_pisson_and_Nb_reg.spv


Indeed, I watched a video on YouTube that shows that, after running the two models, you can use the graph to distinguish between them. I have sent it to you as well.

Thanks again,
 

Re: different result in two nb reg and pisson reg

Andy W
As an exercise in curve fitting, I don't think you are likely to get a good fit from a count model. You can interactively superimpose different distributions on your histograms in SPSS and transform the X axis (e.g. log, power to .5 which is the square root). It is not out of the realm of possibility you could get a reasonable fit from a negative binomial model, but I'm very skeptical it would be possible with a Poisson model. The square root of the travel times seems sufficient to make the histograms pretty symmetric looking (and make predictions towards negative time values very unlikely within the sample), but I imagine you could make economic arguments for other transformations.

The residual plots don't look too bad -- but they are a bit more difficult to interpret for generalized linear models. I don't know what YouTube video you refer to. A good plot for any model is to superimpose the predicted distribution with the observed data, and I have an example of that here for negative binomial models, http://andrewpwheeler.wordpress.com/2014/02/17/negative-binomial-regression-and-predicted-probabilities-in-spss/, but that will be a bit annoying to extend to thousands of integer values.

Also note I said to transform time.travel to minutes - I imagine they were treated like seconds in the model already. This shouldn't affect the coefficient estimates, but (I think) it should shift the intercept by log(1/60), about -4.1.
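The arithmetic behind the seconds-to-minutes change can be sketched in Python. With a log link, dividing the outcome by 60 adds log(1/60) to the intercept and leaves the slopes alone. The intercept value 8.0 is a made-up number for illustration, and `intercept_in_minutes` is a hypothetical helper.

```python
import math

# Additive change in the intercept when the outcome goes from seconds to minutes.
shift = math.log(1 / 60)

def intercept_in_minutes(intercept_seconds):
    # log(E[Y_seconds] / 60) = log(E[Y_seconds]) + log(1/60)
    return intercept_seconds + shift

example_intercept_seconds = 8.0  # hypothetical value, not from the posted output
print(round(shift, 3))
print(intercept_in_minutes(example_intercept_seconds))
```

Exponentiating the shifted intercept gives the baseline expected time in minutes, i.e. the baseline in seconds divided by 60.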
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: different result in two nb reg and pisson reg

Rich Ulrich
In reply to this post by Andy W
Oh - I think everyone concerned with power transformations should have a
proper orientation to the notion of "strength".  Look at the transformation of
   x' = x**k        (New) x' is equal to x raised to the k.

For k=1, this is the identity "transformation": no transformation at all.
The stronger transformation is whatever is further from 1.0.  For social
sciences, the transformations are usually less than 1. 
  k= 0.5  is taking the square root, which normalizes a Poisson.
  k= 0.0  is (asymptotically) taking the log, which normalizes log-normal.
  k= -1   is taking the reciprocal, which is equivalent to flipping a ratio a/b  to b/a;
that is the simplest justification.  It also seems to work fairly often for distances.

Thus: If taking the log isn't strong enough to bring in the stretched-out tail, you can
try the reciprocal as a stronger option.  You can also use log-log plots of two variables
in order to estimate the power needed for a linear relation between quantities, but I
think of that as more common in physics than in the social sciences.

For any power transformation, it is important that the zero is functioning as "zero",
so it is sometimes important to start by subtracting x  from the maximum value of x,
or otherwise re-center it.  If that is not problematic, it is usually pretty easy to see (from
plots) which of those three transformations gives most symmetry. 
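As a rough numeric companion to looking at plots, the sketch below (Python) computes the sample skewness of some made-up travel times under the powers discussed above. The data, the `skewness` helper and the `power_transform` helper are illustrative assumptions; k = 0 is handled as the log.

```python
import math
from statistics import mean

def skewness(xs):
    # Standardized third central moment (population form).
    m = mean(xs)
    n = len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def power_transform(x, k):
    # x**k, with k = 0 treated as the log.
    return math.log(x) if k == 0 else x ** k

# Hypothetical travel times in minutes (illustrative only).
travel_minutes = [5, 10, 15, 20, 30, 45, 60, 90, 270]

skew_by_power = {
    k: skewness([power_transform(x, k) for x in travel_minutes])
    for k in (1, 0.5, 0, -1)
}
for k, s in skew_by_power.items():
    print(k, round(s, 2))
```

A skewness near zero suggests a roughly symmetric result. Note that the reciprocal reverses the ordering of the values, so the sign of its skewness has to be read with that in mind.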

--
Rich Ulrich




> Date: Tue, 16 Sep 2014 05:26:50 -0700
> From: [hidden email]
> Subject: Re: different result in two nb reg and pisson reg
> To: [hidden email]
>
> I'm not quite sure what "The reciprocal will be a bit stronger than taking
> the log" means.
...


Re: different result in two nb reg and Poisson reg

Jon K Peck
aka Tukey Ladder of Powers for transforming a distribution to normality (not always possible).  Googling for this turns up many references.

In Statistics, the ADP procedure will calculate the optimal power for you using the Box-Cox transformation (Transform > Prepare Data for Modeling > Interactive (or Automatic), then Rescale Fields).

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621


The Box-Cox transformation exploits this to find a normality-inducing power on the ladder (not always possible).
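For readers who want to see what a Box-Cox search does, here is a minimal sketch in Python. It is a plain grid search over the standard Box-Cox profile log-likelihood and is not a description of how the ADP procedure is implemented; the data and the grid of powers are illustrative assumptions.

```python
import math
from statistics import mean

def boxcox(x, lam):
    # Box-Cox transform; lam = 0 is the log limit.
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam

def profile_loglik(xs, lam):
    # Normal profile log-likelihood for the power lam (constants dropped).
    n = len(xs)
    ys = [boxcox(x, lam) for x in xs]
    m = mean(ys)
    var = sum((y - m) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var) + (lam - 1) * sum(math.log(x) for x in xs)

# Hypothetical travel times in minutes (illustrative only).
travel_minutes = [5, 10, 15, 20, 30, 45, 60, 90, 270]

# Candidate powers from -2.0 to 2.0 in steps of 0.1.
grid = [i / 10 for i in range(-20, 21)]
best_lambda = max(grid, key=lambda lam: profile_loglik(travel_minutes, lam))
print(best_lambda)
```

The selected power is only as good as the normality assumption behind the likelihood, which ties back to the "not always possible" caveat.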
 

From:        Rich Ulrich <[hidden email]>
To:        [hidden email]
Date:        09/16/2014 10:45 AM
Subject:        Re: [SPSSX-L] different result in two nb reg and pisson reg
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Oh - I think everyone concerned with power transformations should have a
proper orientation to the notion of "strength".
...