SPSSX Discussion

modelling overdispersed binomial data using generalised linear models - quasibinomial model availability in SPSS?


Kavita Thomas

Oct 08, 2024; 9:16am


I've run a series of tests, each consisting of 10 questions scored correct/wrong. There are roughly 36 participants per test, divided more or less evenly among 3 treatment groups and a control group (a categorical predictor). I want to determine whether the treatment groups' scores (the number of correct answers, which can range from 0, all wrong, to 10, all correct) differ significantly from those of the no-treatment control group. There are two covariates: the participant's pretest (binomial) score and a count variable with no maximum (the incidence of treatments prior to the test).

My response variables (test scores) are, however, occasionally overdispersed (Pearson Chi-Square/df is greater than 2 but under 3), even though there isn't any design-based reason why the probability of success should differ between trials. Is this amount of overdispersion too much for an ordinary binomial model with a logit link to handle? I've read that a quasibinomial model would be better for overdispersion in these sorts of cases, but I can't find this option in SPSS. Is it possible to fit a quasibinomial model in SPSS via syntax or any other way? (I've never used R before and am afraid the learning curve would be too time-consuming.) I've also read that the quasibinomial approach is like using a scaled binomial, but since I have no idea what the quasibinomial or scaled binomial models entail, I don't really understand what this means. Is the scaled binomial approach something that can be done in SPSS? I didn't think it made sense to opt for a negative binomial model, since the data are sums of correct responses in a fixed number of trials.
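To make the Pearson Chi-Square/df figure concrete, here is a minimal Python sketch (not SPSS syntax) of how I understand that statistic to be computed, and of what "scaling" would then do to a standard error. The function names, the four scores, and the standard error of 0.20 are all made up for illustration; the fitted probability of 0.5 is simply the overall proportion correct, i.e. an intercept-only model.

```python
import math

def pearson_dispersion(successes, trials, fitted_probs, n_params):
    """Pearson chi-square divided by residual df for binomial counts."""
    chi2 = 0.0
    for y, n, p in zip(successes, trials, fitted_probs):
        expected = n * p                  # binomial mean
        variance = n * p * (1.0 - p)      # binomial variance
        chi2 += (y - expected) ** 2 / variance
    df = len(successes) - n_params        # observations minus estimated parameters
    return chi2 / df

def scaled_se(binomial_se, dispersion):
    """Inflate an ordinary binomial standard error by sqrt(dispersion)."""
    return binomial_se * math.sqrt(dispersion)

# Hypothetical data: four participants, 10 questions each.
successes = [2, 8, 5, 5]
trials = [10, 10, 10, 10]
fitted_probs = [0.5, 0.5, 0.5, 0.5]      # 20 correct out of 40 overall

dispersion = pearson_dispersion(successes, trials, fitted_probs, n_params=1)
quasi_se = scaled_se(0.20, dispersion)   # 0.20 is an invented binomial SE

print(dispersion)   # about 2.4, i.e. in the range I am seeing
print(quasi_se)     # the same SE after scaling
```

As far as I can tell, the coefficient estimates stay the same under this kind of scaling and only the standard errors (and hence the tests) change.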

I also have some questions about running these GLMs in SPSS. For parameter estimation, how does one choose sensible values for the maximum number of step-halvings and the maximum number of iterations? What kind of estimator should be used for the covariance matrix? What should the maximum number of Fisher scoring iterations be? Should the scale parameter method be a fixed value, deviance, or Pearson chi-square, and if fixed, what value should be specified? For the convergence criteria, should change in parameter estimates, change in log-likelihood, or Hessian convergence be selected, and what minimum value should be provided? I can't find a guide to selecting these settings anywhere online.

When I run these binomial GLMs, I occasionally get the following message: "The maximum number of step-halvings was reached but the log-likelihood value cannot be further improved. Output for the last iteration is displayed. The GENLIN procedure continues despite the above warning. Subsequent results shown are based on the last iteration. Validity of the model fit is uncertain". Any ideas on how to fix this? I would like to be sure the model is valid.
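For anyone unsure what the iteration settings refer to, this is a rough Python sketch of my understanding of Fisher scoring with step-halving, for an intercept-only binomial logit model. It is not what GENLIN actually runs; the function names, the limits of 100 iterations and 5 step-halvings, the tolerance, and the data are all assumptions chosen for illustration.

```python
import math

def loglik(beta, successes, trials):
    """Binomial log-likelihood (up to a constant) for a logit intercept."""
    p = 1.0 / (1.0 + math.exp(-beta))
    return sum(y * math.log(p) + (n - y) * math.log(1.0 - p)
               for y, n in zip(successes, trials))

def fit_intercept(successes, trials, max_iter=100, max_step_halving=5, tol=1e-6):
    """Return (estimate, iterations used, converged flag)."""
    beta = 0.0
    ll = loglik(beta, successes, trials)
    for iteration in range(1, max_iter + 1):
        p = 1.0 / (1.0 + math.exp(-beta))
        score = sum(y - n * p for y, n in zip(successes, trials))
        info = sum(n * p * (1.0 - p) for n in trials)   # Fisher information
        step = score / info
        new_beta = beta + step
        new_ll = loglik(new_beta, successes, trials)
        halvings = 0
        # Step-halving: shrink the step while it makes the log-likelihood worse.
        while new_ll < ll and halvings < max_step_halving:
            step /= 2.0
            halvings += 1
            new_beta = beta + step
            new_ll = loglik(new_beta, successes, trials)
        if new_ll < ll:
            # Halvings exhausted without improvement: stop and report the
            # last accepted estimate as not converged.
            return beta, iteration, False
        converged = abs(new_beta - beta) < tol   # change-in-parameter criterion
        beta, ll = new_beta, new_ll
        if converged:
            return beta, iteration, True
    return beta, max_iter, False

# Hypothetical scores out of 10 for four participants.
beta_hat, n_iter, converged = fit_intercept([7, 8, 6, 9], [10, 10, 10, 10])
print(beta_hat, converged)   # estimate is close to log(3), since 30/40 correct
```

In this sketch the "not converged" branch corresponds to the situation the warning describes: the allowed step-halvings are used up and the log-likelihood still cannot be improved.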

Any tips on any of these things would be greatly appreciated! Thanks in advance!