SPSSX Discussion

what's my error in these predicted probabilities in nomreg

Stephen Cox-4

what's my error in these predicted probabilities in nomreg

Folks - I am calculating predicted probabilities from a multinomial logistic regression in SPSS V18.

I have run a model, then saved the predicted cell probabilities. I have then run the same model in Stata (and have got the same coefficients), and compared the predicted probabilities generated from SPSS to those produced from Stata. They match, as they should.

I then calculated the predicted probabilities by hand. Using the syntax shown below in SPSS, my calculations of these predicted probabilities do not match those produced by SPSS when I ask for them to be saved, as above. When I do this in Stata, my 'hand' calculations match those of Stata, and of SPSS. So something is going wrong when I run my 'hand' calculations in SPSS.

These are my computations. They are based on an MNLR with four outcomes, using the last outcome category as the reference category. The first set of compute statements below calculates the logit for each case for each outcome. This is based on the general formula of:

Logit_cat = xB

with the predictors being called age, ed, prst, yr89, male and white.

The second set of compute statements calculates the predicted probability of each outcome level for each case. This is based on the general formula of:

Prob_cat = exp(logit_cat) / (sum of exp(logit_k) over all outcome categories k)

Can anyone see why this works in Stata, but not in SPSS?

compute lsd = 0.178 + 0.032*age - 0.144*ed - 0.004*prst + 1.160*yr89 - 1.226*male - 0.834*white.
compute ld = 1.005 + 0.029*age - 0.051*ed - 0.013*prst + 0.426*yr89 - 1.327*male - 0.413*white.
compute la = 1.498 + 0.007*age - 0.033*ed - 0.002*prst + 0.063*yr89 - 0.867*male - 0.300*white.
compute lsa = 0.
execute.

compute probsd = exp(lsd)/(exp(lsd) + exp(ld) + exp(la) + exp(lsa)).
compute probd = exp(ld)/(exp(lsd) + exp(ld) + exp(la) + exp(lsa)).
compute proba = exp(la)/(exp(lsd) + exp(ld) + exp(la) + exp(lsa)).
compute probsa = exp(lsa)/(exp(lsd) + exp(ld) + exp(la) + exp(lsa)).
execute.
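
For anyone who wants to check the arithmetic outside SPSS or Stata, here is a minimal Python sketch of the same two steps. The coefficients are copied from the compute statements above. The function names and the example case values are made up for illustration. It plugs every predictor in as a plain number, exactly as the compute statements do, so it inherits whatever coding those statements assume.

```python
import math

# Coefficients copied from the compute statements above, in the order
# (constant, age, ed, prst, yr89, male, white), one row per non-reference outcome.
COEFS = {
    "sd": (0.178, 0.032, -0.144, -0.004, 1.160, -1.226, -0.834),
    "d":  (1.005, 0.029, -0.051, -0.013, 0.426, -1.327, -0.413),
    "a":  (1.498, 0.007, -0.033, -0.002, 0.063, -0.867, -0.300),
}

def logits(age, ed, prst, yr89, male, white):
    """Logit = xB for each outcome; the reference outcome 'sa' is fixed at 0."""
    x = (1.0, age, ed, prst, yr89, male, white)
    out = {cat: sum(b * v for b, v in zip(bs, x)) for cat, bs in COEFS.items()}
    out["sa"] = 0.0
    return out

def predicted_probs(lg):
    """exp(logit) for each outcome divided by the sum of exp(logit) over all outcomes."""
    denom = sum(math.exp(v) for v in lg.values())
    return {cat: math.exp(v) / denom for cat, v in lg.items()}

# Hypothetical case; these values are illustrative and not taken from the data set.
example_case = dict(age=40, ed=12, prst=40, yr89=1, male=0, white=1)
probs = predicted_probs(logits(**example_case))
```

The four values in probs sum to 1 by construction, which is a quick sanity check on any hand calculation of this kind.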

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Stephen Cox-4

Re: what's my error in these predicted probabilities in nomreg

Folks - I have done more investigation and worked out what is happening,
but not why this makes sense.

Three of the predictors are binary. If I use reverse coding for the
predictor values and use these values in the calculation of the logits I
get the correct logit for each case.

But I do not understand why I should have to do this. It means that to
correctly solve the logit for a case, I have to use values for categorical
predictors that are not those of the case, but the opposite. It appears
that this is what SPSS does when it calculates predicted probabilities.

My assumption is that this is because SPSS uses the highest value of a
categorical predictor as the reference category. In Stata, the lowest
value is used as the reference category for predictors. So in Stata, I can
just calculate the logits by plugging in predictor values as they are. In
SPSS, I have to 'reverse code' all categorical predictors to get the correct
logits in their calculation.

But does this not mean that the use of the higher category as the
reference category is not actually the most 'elegant' approach?

Is there something I am missing?

Sorry for the long rambling emails...

Alex Reutter

Re: what's my error in these predicted probabilities in nomreg

In the equation below,

compute lsd = 0.178 + 0.032*age - 0.144*ed - 0.004*prst + 1.160*yr89 - 1.226*male - 0.834*white.

I think you are really treating all your predictors as covariates instead of factors; if they are binary coded 0-1 and you want to write the equation this way, you can simply include them as covariates rather than as factors in the model. If you want to treat "male" as a factor, then (assuming that "male" is one of the binary predictors and it is coded 0-1) I would write:

compute lsd = 0.178 + 0.032*age - 0.144*ed - 0.004*prst + 1.160*yr89 - 1.226*(male=0) - 0.834*white.

and treat the other two categorical predictors similarly.

Alex Reutter
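
To make the suggestion above concrete, here is a small Python sketch for the first logit only. It assumes the three binary predictors are yr89, male and white, that each is coded 0-1, and that each was entered as a factor, so each coefficient multiplies an indicator for the value 0. Those are assumptions. The thread only confirms that three predictors are binary. The function names and case values are hypothetical.

```python
def indicator(value, level):
    """Return 1.0 when value equals level, else 0.0 (the role of (male=0) in SPSS syntax)."""
    return 1.0 if value == level else 0.0

def logit_sd_factor(age, ed, prst, yr89, male, white):
    # Factor-style: each binary coefficient applies to the non-reference level 0.
    return (0.178 + 0.032 * age - 0.144 * ed - 0.004 * prst
            + 1.160 * indicator(yr89, 0)
            - 1.226 * indicator(male, 0)
            - 0.834 * indicator(white, 0))

def logit_sd_plugin(age, ed, prst, yr89, male, white):
    # Covariate-style: raw values plugged in, as in the original compute statement.
    return (0.178 + 0.032 * age - 0.144 * ed - 0.004 * prst
            + 1.160 * yr89 - 1.226 * male - 0.834 * white)

# Hypothetical case, for illustration only.
case = dict(age=40, ed=12, prst=40, yr89=1, male=0, white=1)

# Reverse coding the 0-1 predictors by hand ...
reversed_case = dict(case, yr89=1 - case["yr89"], male=1 - case["male"],
                     white=1 - case["white"])

# ... gives the same logit as the indicator form applied to the original values.
factor_value = logit_sd_factor(**case)
reversed_value = logit_sd_plugin(**reversed_case)
```

For a 0-1 variable, (x=0) is just 1 - x, so the indicator form and manual reverse coding are the same calculation. The case's actual values are still the ones being used. Only the level that the coefficient is attached to differs.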