Hello everybody,
I ran a negative binomial regression analysis using SPSS, and then did the same in STATA. I used the default options, I guess, and got different results in SPSS and in STATA. For example, some of the coefficients are significant in SPSS (p<0.05), but not in STATA. Could anyone help me? Why could this happen? I've used the same dataset in both cases... Thanks a lot!
So, assuming that you used the same data properly each time, you have demonstrated that the two programs seem to have different default options. The obvious first step is to read up on what the options are, and run ones that match. If you aren't sure what they mean, you *can* run the analysis three or four or ten different ways... and possibly learn something about the possibilities. Then, if you want to post a question, you can post the syntax and some useful part of the results.
-- Rich Ulrich
----------------------------------------
> Date: Sat, 25 Jan 2014 09:16:47 -0800
> From: [hidden email]
> Subject: Negative binomial regression analysis: Different results in SPSS and STATA?
> To: [hidden email]
In reply to this post by Student073
Dear ______, There are many possibilities, including the N.B. parameterization, the estimation method employed, the test statistics, the convergence criteria... I know virtually nothing about STATA, so I cannot comment further. Many years ago I compared SAS and SPSS NB regression analyses on simulated data and found very similar results. But I made sure the same N.B. parameterization, test statistic, etc. were being employed before comparing results.
Ryan
In reply to this post by Student073
Rich, Ryan, thank you!
These are the results I got. I tested the moderating effect of "moderator" on the relation between IV and DV. Although none of the analyses showed a significant result, I'm still alarmed the results were so different... Even in SPSS, there's a considerable difference in the significance levels if I use the "model-based estimators" compared to the "robust estimation". What should I do to get the more accurate outcome??? Thanks again!!

STATA

. nbreg DVcount Controlvar1 Controlvar2 IV Moderator IVxModer

Fitting Poisson model:

Iteration 0:   log likelihood = -55609.736
Iteration 1:   log likelihood = -45032.466  (backed up)
Iteration 2:   log likelihood =  -27555.08  (backed up)
Iteration 3:   log likelihood = -19627.822
Iteration 4:   log likelihood = -8845.6562
Iteration 5:   log likelihood = -8488.7504
Iteration 6:   log likelihood = -8388.2676
Iteration 7:   log likelihood = -8388.1451
Iteration 8:   log likelihood = -8388.1451

Fitting constant-only model:

Iteration 0:   log likelihood = -5506.8228
Iteration 1:   log likelihood = -4983.0922
Iteration 2:   log likelihood = -3158.2798
Iteration 3:   log likelihood = -3157.9536
Iteration 4:   log likelihood = -3157.9535

Fitting full model:

Iteration 0:   log likelihood = -3078.1447
Iteration 1:   log likelihood = -3027.5247
Iteration 2:   log likelihood = -3024.3007
Iteration 3:   log likelihood = -3024.2785
Iteration 4:   log likelihood = -3024.2785

Negative binomial regression                      Number of obs   =       5447
                                                  LR chi2(5)      =     267.35
Dispersion     = mean                             Prob > chi2     =     0.0000
Log likelihood = -3024.2785                       Pseudo R2       =     0.0423

------------------------------------------------------------------------------
     DVcount |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 Controlvar1 |   .4006587   .1612887     2.48   0.013     .0845387    .7167787
 Controlvar2 |   .0192326   .0034626     5.55   0.000      .012446    .0260192
          IV |   .0699464   .0091503     7.64   0.000     .0520122    .0878806
   Moderator |   .2164698   .0293511     7.38   0.000     .1589428    .2739968
    IVxModer |  -.0018834    .003332    -0.57   0.572    -.0084139    .0046471
       _cons |   -1.74222   .1218459   -14.30   0.000    -1.981034   -1.503407
-------------+----------------------------------------------------------------
    /lnalpha |   2.803212   .0562555                      2.692953    2.913471
-------------+----------------------------------------------------------------
       alpha |   16.49756   .9280785                      14.77525    18.42063
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) =  1.1e+04 Prob>=chibar2 = 0.000

SPSS

MODEL BASED ESTIMATOR

* Generalized Linear Models.
GENLIN DVcount BY Controlvar1 (ORDER=ASCENDING) WITH Controlvar2 IV Moderator IVxModerator
  /MODEL Controlvar1 Controlvar2 IV Moderator IVxModerator INTERCEPT=YES
    DISTRIBUTION=NEGBIN(1) LINK=LOG
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD
    LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.

ROBUST ESTIMATION

* Generalized Linear Models.
GENLIN DVcount BY Controlvar1 (ORDER=ASCENDING) WITH Controlvar2 IV Moderator IVxModerator
  /MODEL Controlvar1 Controlvar2 IV Moderator IVxModerator INTERCEPT=YES
    DISTRIBUTION=NEGBIN(1) LINK=LOG
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=ROBUST MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD
    LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.

NBR_SPSS.doc
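For reference, the z and P>|z| columns in the Stata table above are ordinary Wald statistics: each z is Coef. divided by Std. Err., and the p-value is the two-sided normal tail area. The sketch below is only an illustration in Python (it is not output from either package, and the function name wald_z_and_p is made up for this example). It recomputes two rows from the posted numbers, which shows that any difference in standard errors between programs or estimators changes the p-values even when the coefficients agree.

```python
import math


def wald_z_and_p(coef, std_err):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = coef / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Coef. and Std. Err. copied from the Stata table above.
rows = {
    "Controlvar1": (0.4006587, 0.1612887),
    "IVxModer": (-0.0018834, 0.003332),
}

for name, (coef, se) in rows.items():
    z, p = wald_z_and_p(coef, se)
    print(f"{name:12s} z = {z:6.2f}  p = {p:.3f}")
```

Rounded to the table's precision, these reproduce the printed z and P>|z| values for both rows. SPSS GENLIN with ANALYSISTYPE=3(WALD) reports a Wald chi-square instead, which for a single parameter is z squared and gives the same p-value.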
In reply to this post by Student073
PS: I posted the last analysis I ran, where none of the moderating effects were significant. Before that, I ran other analyses with different variables and had the same problem I have just mentioned... Using SPSS (and the model-based estimator), the interaction term was significant. However, in STATA it was not significant... I'd be very grateful if you could clarify which "default" options are the best ones to get the most accurate results.
THANKS!
In reply to this post by Student073
Dear no-name <first names are always welcome!>, I really do not have time to investigate. Moreover, I know nothing about STATA, as I mentioned before. Having said that, I did notice something in your SPSS GENLIN code with which I generally disagree. You have forced the dispersion parameter to be 1.0. Instead, allow the dispersion parameter to be estimated by changing:

  NEGBIN(1)
to
  NEGBIN(MLE)

I bet STATA estimates the dispersion parameter. Also, since your first covariate is binary (coded 0/1, I believe), change this line:

  GENLIN DVcount BY Controlvar1 (ORDER=ASCENDING) WITH Controlvar2 IV Moderator IVxModerator
to
  GENLIN DVcount WITH Controlvar1 Controlvar2 IV Moderator IVxModerator

The line above suggests to me that all of your predictors are continuous and/or dichotomous (coded 0/1). One last point: Did you create the interaction term outside of GENLIN? You can construct the interaction term within GENLIN--not that it should really make a difference.
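On the dispersion point, a rough illustration of why NEGBIN(1) versus an estimated parameter can matter so much. This is a minimal Python sketch, not output from SPSS or STATA. It assumes the usual mean-dispersion (NB2) form, Var(Y) = mu + alpha * mu^2, which is what "Dispersion = mean" in the posted STATA output refers to. The function name nb2_variance and the example means are made up for the illustration; the value 16.49756 is the alpha printed in the posted STATA table.

```python
def nb2_variance(mu, alpha):
    """Variance of a negative binomial count under the NB2 form: mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


ALPHA_FIXED = 1.0        # what DISTRIBUTION=NEGBIN(1) imposes
ALPHA_STATA = 16.49756   # alpha reported in the posted STATA output

# Illustrative means only; these are not taken from the data.
for mu in (0.5, 1.0, 2.0):
    v_fixed = nb2_variance(mu, ALPHA_FIXED)
    v_est = nb2_variance(mu, ALPHA_STATA)
    print(f"mu={mu:.1f}  var(alpha=1)={v_fixed:.2f}  "
          f"var(alpha=16.5)={v_est:.2f}  ratio={v_est / v_fixed:.1f}")
```

If the data really are as overdispersed as that alpha suggests, a model that fixes the parameter at 1 assumes far less variance than is present, so its model-based standard errors would tend to be too small. That would be consistent with terms looking significant under COVB=MODEL but not under COVB=ROBUST or in STATA, though it is only one of several possible causes.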
Even after all of these changes, your results may not be identical due to other issues I mentioned in a previous post. Still, my guess is they'll be much closer.
Ryan