Logistic regression issue I thought I understood

Logistic regression issue I thought I understood

Maguin, Eugene
All,

In the course of doing a logistic regression, I ran across something
unexpected in an area I thought I understood.

My command syntax is

LOGISTIC REGRESSION complete WITH gpra asi gpra BY asi
   /CATEGORICAL gpra asi
   /CONTRAST(gpra)=INDICATOR(1)
   /CONTRAST(asi)=INDICATOR(1)
   /ENTER gpra asi
   /ENTER gpra BY asi.

Predictors are dichotomous.

The block 1 output is
                    Variables in the Equation
               B       S.E.     Wald    df    Sig.    Exp(B)
GPRA(1)      1.161     .495     5.510    1    .019     3.193
ASI(1)        .646     .474     1.857    1    .173     1.907
Constant    -2.029     .496    16.71     1    .000      .132

The block 2 output is
                           Variables in the Equation
                        B        S.E.      Wald    df    Sig.    Exp(B)
GPRA(1)               20.915   8204.359    .000     1    .998    1.212E9
ASI(1)                20.510   8204.359    .000     1    .998    8.077E8
GPRA(1) by ASI(1)    -20.733   8204.359    .000     1    .998     .000
Constant             -21.203   8204.359    .000     1    .998     .000

Ok, nothing special in block 1. In block 2, the SEs explode. Syntax error?
No, it seems right. Possibly collinearity? I think not: the crosstab of GPRA
by ASI has nonzero counts in all cells and the chi-square is not
significant. The one odd thing is that in the three-way crosstab of GPRA,
ASI, and the DV, complete, one of the cells has a zero count. I had thought
that in this sort of model with an interaction, logistic regression would
still work with zero cases in that cell. Bad understanding on my part?
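
One way to see what the block 2 estimates are saying is to add the
coefficients up cell by cell. The sketch below is only an illustration: it
uses the four block 2 B values exactly as reported above and assumes the
usual indicator coding (0 = reference category, 1 = the other category).
The names cell_logit and cell_odds are made up for this example.

```python
import math

# Block 2 coefficients, copied from the output above.
b_const = -21.203
b_gpra = 20.915
b_asi = 20.510
b_inter = -20.733


def cell_logit(gpra, asi):
    """Fitted log-odds for one GPRA x ASI cell (0/1 indicator codes)."""
    return b_const + b_gpra * gpra + b_asi * asi + b_inter * gpra * asi


# Fitted odds of the modeled outcome in each of the four cells.
cell_odds = {}
for gpra in (0, 1):
    for asi in (0, 1):
        logit = cell_logit(gpra, asi)
        cell_odds[(gpra, asi)] = math.exp(logit)
        print(f"GPRA(1)={gpra} ASI(1)={asi}  "
              f"logit={logit:8.3f}  odds={cell_odds[(gpra, asi)]:.3g}")
```

The huge coefficients largely cancel: three of the cells come out with
ordinary fitted odds of roughly 0.75, 0.50 and 0.60, while the cell where
both predictors are at their reference category has fitted odds that are
essentially zero. That pattern is what one would expect if the empty cell in
the three-way crosstab is the one where the modeled outcome never occurs in
the reference category of both predictors.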

If so, is there something that can be done to work around the data and get
an estimate? If the zero cell is the problem, the only fix would seem to be
to change data values to get some minimal number of cases in the now-zero
cell.


Thanks, Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Logistic regression issue I thought I understood

Hector Maletta
Gene,
First let's think about what an interaction means when all the variables
are dichotomous. In your example, the effect of GPRA and ASI on COMPLETE in
the first model is additive on the log-odds scale: each variable adds its
own contribution to the log-odds of COMPLETE. In the second model there is
an additional twist: an extra effect of GPRA that depends on whether ASI is
0 or 1. In other words, ASI has a direct effect (added to the effect of
GPRA) and an additional effect reinforcing or attenuating the effect of GPRA.
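
As a rough illustration of what "additive" means here, the sketch below
plugs the block 1 coefficients from the original post into the main-effects
model. It assumes 0/1 indicator coding with 0 as the reference category; the
function and variable names are invented for the example.

```python
import math

# Block 1 (main-effects) coefficients, copied from the original post.
b_const, b_gpra, b_asi = -2.029, 1.161, 0.646


def additive_logit(gpra, asi):
    """Log-odds of COMPLETE under the main-effects model (0/1 codes)."""
    return b_const + b_gpra * gpra + b_asi * asi


def additive_prob(gpra, asi):
    """Predicted probability of COMPLETE for one GPRA x ASI cell."""
    return 1.0 / (1.0 + math.exp(-additive_logit(gpra, asi)))


probs = {(g, a): additive_prob(g, a) for g in (0, 1) for a in (0, 1)}

# Odds ratio for GPRA, computed separately at each level of ASI.
or_gpra_at_asi0 = math.exp(additive_logit(1, 0) - additive_logit(0, 0))
or_gpra_at_asi1 = math.exp(additive_logit(1, 1) - additive_logit(0, 1))

for cell, p in probs.items():
    print(cell, round(p, 3))
print(round(or_gpra_at_asi0, 3), round(or_gpra_at_asi1, 3))
```

Without an interaction term the odds ratio for GPRA is forced to be the same
at both levels of ASI (it is just Exp(B) for GPRA). The interaction model
drops that constraint, which is why it needs information from every cell.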
Now, if one of the cells of the 3-way table is zero, there is no way of
knowing how the effect of GPRA differs in the presence versus the absence of
ASI. One essential piece of information is missing. I have not worked out
the maths of the algorithm, but these results suggest the kind of problems
that arise with near-singular matrices. Possibly the addition of the
interaction term, given the zero cell, makes the matrix near singular,
yielding very unstable results.
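
A small sketch of why a zero cell can do this. With two dichotomous
predictors plus their interaction, each of the four cells effectively gets
its own log-odds. For a cell with no events, the likelihood keeps improving
as that log-odds goes toward minus infinity, so there is no finite estimate
for the iterations to settle on. The cell size below is hypothetical (the
post does not report the counts), and the names are invented for the
example.

```python
import math

n_nonevents = 20  # HYPOTHETICAL count; the post does not give cell sizes
n_events = 0      # the empty cell described in the post


def cell_loglik(logit):
    """Binomial log-likelihood contribution of one cell at a given logit."""
    log_p = -math.log1p(math.exp(-logit))  # log P(event)
    log_q = -math.log1p(math.exp(logit))   # log P(no event)
    return n_events * log_p + n_nonevents * log_q


# Push the cell's log-odds further and further toward minus infinity.
logits = [-2, -5, -10, -20]
logliks = [cell_loglik(x) for x in logits]

for x, ll in zip(logits, logliks):
    print(f"logit={x:4d}  log-likelihood={ll:.3e}")
```

The log-likelihood rises toward zero at every step and never turns back
down. Software stops iterating somewhere on that plateau (around -21 in the
output quoted in this thread), where the likelihood is nearly flat, and a
flat likelihood is what produces standard errors in the thousands. This
situation is usually described as quasi-complete separation.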
My first impression is that the interaction model is not working, probably
because of the zero cell. Unless you have powerful theoretical reasons to
suspect an interaction is present, I would advise abandoning the interaction
model. If you do have such reasons, you may draw a larger random sample to
get sufficient cases in all cells, but even then the stubborn data may
insist that the interaction model is wrong.
Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Gene Maguin
Sent: 26 April 2009 13:22
To: [hidden email]
Subject: Logistic regression issue I thought I understood
