SPSSX Discussion

Fw: Logistic Regression fails with empty cell

Martin Holt

Sorry, just a quick PS that I should have included just now.

The rule is 10 cases per factor, counted in whichever outcome category is less frequent.

bw,

Martin Holt

----- Forwarded Message ----
From: M HOLT <[hidden email]>
To: "Allan Lundy, PhD" <[hidden email]>; [hidden email]
Sent: Sunday, 13 June, 2010 11:29:05
Subject: Re: Logistic Regression fails with empty cell

Hi Allan,

Remember that it's the **expected** counts that matter, rather than the actual counts.

The following link is excellent, taking you into and through and out the other side on 2x2 tables. It expands on the methods section published in the paper: Campbell Ian, 2007, Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations, Statistics in Medicine, 26, 3661 - 3675.

http://www.iancampbell.co.uk/twobytwo/methods.htm

In a logistic regression it is common to accept "more than 10" cases per factor in the analysis, though some, including me, prefer "more than 15". Peduzzi et al. ran simulation studies and settled on 10. See:

Michael A. Babyak. What You See May Not Be What You Get: A Brief,
Nontechnical Introduction to Overfitting in Regression-Type Models.
Psychosom Med 2004 66: 411-421.
and
Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A
simulation study of the number of events per variable in logistic
regression analysis. J Clin Epidemiol. 1996 Dec;49(12):1373-9.
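
As a rough illustration of the per-factor rule of thumb, the sketch below counts how many predictors a sample could support. The function name `max_predictors` and its parameters are made up for this note, and reading the rule as "at least N" rather than "more than N" is an assumption. The counts (22 Yes, 10 No) are taken from the table in the original question quoted further down.

```python
def max_predictors(n_yes, n_no, events_per_variable=10):
    """Number of predictors supportable under an events-per-variable rule.

    The limiting count is the LESS frequent outcome, not the total N.
    """
    limiting_events = min(n_yes, n_no)
    return limiting_events // events_per_variable


# Counts from the original question: 22 Yes and 10 No among 32 cases.
print(max_predictors(22, 10))                          # rule of 10
print(max_predictors(22, 10, events_per_variable=15))  # stricter rule of 15
```

With only 10 cases in the rarer outcome, the rule of 10 leaves room for a single predictor, and the rule of 15 leaves room for none.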

I'd concentrate on Ian Campbell's papers and you'll find an answer....but you might not like it :(

Best Wishes,

Martin Holt

From: "Allan Lundy, PhD" <[hidden email]>
To: [hidden email]
Sent: Saturday, 12 June, 2010 22:40:38
Subject: Logistic Regression fails with empty cell

Dear Listers,
First, thanks to Martin Holt, Ryan Black, and Bruce Weaver for helpful comments on another recent logistic regression question.

This one is much more basic, but very surprising (to me, anyway). I have 32 cases, divided into 16 and 16, with a dichotomous outcome. The data look like this:
(Group is A or B; outcome is Yes or No)

       Yes    No
A       16     0
B        6    10

As you might expect, chi-square is highly significant: 14.5, p< .001.
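
For anyone wanting to check that figure by hand, here is a minimal sketch in plain Python (no statistics package assumed). It computes the uncorrected Pearson chi-square and the expected counts for a 2x2 table. The function name `chi_square_2x2` is illustrative only, and the p value uses the identity for 1 degree of freedom, p = erfc(sqrt(chi2 / 2)).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Returns (chi2, p_value, expected_counts).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count for each cell = row total * column total / N
    expected = [[r * k / n for k in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Upper-tail probability of a chi-square with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, expected


chi2, p_value, expected = chi_square_2x2([[16, 0], [6, 10]])
print(round(chi2, 1), p_value < 0.001)
print(expected)
```

The expected counts come out as 11 and 5 in each row, so the smallest expected count is 5 even though one observed cell is empty.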

However, using these data in a binomial logistic regression with additional continuous predictor variables yielded weirdly high p values for Group, such as p = .996.

I eliminated the continuous predictors, so there was just the dichotomous predictor and dichotomous outcome. Results:
The classification table showed overall correct classification as 81.3%.

But Variables in the Equation (Step 1) showed, for Group:
B = 21.7, S.E. = 10048.2, Sig. = .998.
Obviously the huge SE was what was making it non-significant.
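
One way to see where a B and S.E. of that size can come from is sketched below. This is an illustration in plain Python, not a description of what SPSS does internally, and the helper names `log_lik` and `se_of_logit` are invented for it. With 16 Yes and 0 No in group A, the log-likelihood keeps rising as the log-odds of Yes increases, so there is no finite maximum, while the asymptotic standard error 1 / sqrt(n p (1 - p)) grows without bound.

```python
import math


def log_lik(logit, n_yes, n_no):
    """Binomial log-likelihood of one group at a given log-odds of Yes."""
    log_p = -math.log1p(math.exp(-logit))  # log P(Yes)
    log_q = -logit + log_p                 # log P(No)
    return n_yes * log_p + n_no * log_q


def se_of_logit(logit, n):
    """Asymptotic S.E. of a group's log-odds: 1 / sqrt(n * p * (1 - p))."""
    # P(No) is computed directly to avoid cancellation in 1 - p
    q = math.exp(-logit) / (1.0 + math.exp(-logit))
    p = 1.0 - q
    return 1.0 / math.sqrt(n * p * q)


# Group A: 16 Yes, 0 No. The log-likelihood never stops improving.
for logit in (2, 5, 10, 20):
    print(logit, log_lik(logit, 16, 0), se_of_logit(logit, 16))
```

An iterative fit has to stop somewhere along that slope, which would explain a coefficient in the 20s paired with a standard error in the thousands. With 15 Yes and 1 No, the same log-likelihood peaks at a finite value, log(15).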

I finally decided the problem had to be the empty cell, so I switched one of the outcome values and re-ran:

       Yes    No
A       15     1
B        6    10

The result was, of course, a less significant chi-square, but the following for Group:
B = 3.2, S.E. = 1.2, Sig. = .005.
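
These numbers can be reproduced by hand, because with a single dichotomous predictor the fitted coefficient equals the log odds ratio of the 2x2 table, and its Wald standard error is sqrt(1/a + 1/b + 1/c + 1/d). The sketch below is plain Python with an invented function name, `log_odds_ratio`. It refuses a table with an empty cell, since the formula then has a zero in a denominator.

```python
import math


def log_odds_ratio(table):
    """Log odds ratio, Wald S.E. and two-sided p value for a 2x2 table."""
    (a, b), (c, d) = table
    if 0 in (a, b, c, d):
        raise ValueError("empty cell: the log odds ratio is not finite")
    b_hat = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = b_hat / se
    # Two-sided normal p value, equivalent to a Wald chi-square with 1 df
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return b_hat, se, p_value


b_hat, se, p_value = log_odds_ratio([[15, 1], [6, 10]])
print(round(b_hat, 1), round(se, 1), round(p_value, 3))
```

The single case moved into the empty cell contributes a 1/1 term to the variance, which is why the standard error is now close to 1 rather than in the thousands.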

SPSS Help says, under Data Considerations:

However, your solution may be more stable if your predictors have a multivariate normal distribution. Additionally, as with other forms of regression, multicollinearity among the predictors can lead to biased estimates and inflated standard errors.

Inflated? I guess so, by about 10,000 times! It would have been nice if this section simply said, "Does not work with an empty cell."

Anybody know a way around this problem that won't lose power? Remember, I want to include continuous predictors also. I have not tried it with plain MR, but I don't see why that would be different.

Thanks!

Allan Lundy, PhD
Research Consulting
[hidden email]

Business & Cell (any time): 215-820-8100
Home (8am-10pm, 7 days/week): 215-885-5313
Address: 108 Cliff Terrace, Wyncote, PA 19095
Visit my Web site at www.dissertationconsulting.net