Posted by
Hector Maletta on
Jan 23, 2007; 6:35pm
URL: http://spssx-discussion.165.s1.nabble.com/Help-with-Binary-Logistic-Regression-tp1073387p1073393.html
Logistic Regression is indeed usable as a (probabilistic) predictor for
individual cases, but "probabilistic" is the crucial word. Probability is
governed by the Law of Large Numbers, and anything it says about groups is
subject to large margins of error when applied to specific individual cases.
Thus Winston Churchill, who smoked heavily, led a sedentary life in his
mature years, was seriously overweight, endured years of constant
occupational stress and short sleep, and drank half a bottle of brandy
every day plus generous doses of other liquors, should have died before
reaching 60 but managed to survive to almost 90 against all odds. Meanwhile
the inventor of aerobic exercise (I don't even remember his name now) died
of a heart attack while exercising in his fifties. These are two fine
examples of probability going awry when applied to individual cases. Take,
however, 1000 people like Churchill or 1000 like the aerobics guy, and the
odds will not fail. Another example is Albert Einstein: he barely passed
high school, was judged not to be university material, and only made it to
a vocational polytechnic school, ending up as a clerk in a patent office;
in a meritocratic system based on SAT and A-level exams he would have been
judged (as he was, on a less scientific basis) unfit for any kind of
academic career. A predictor equation for adult academic success based on
early scholastic achievements would have predicted a round zero for poor
Albert, again showing the perils of applying probabilistic predictions to
individuals.
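The group-versus-individual point can be illustrated with a small simulation. This is only a sketch under invented assumptions: the 0.9 event probability, the group size of 1000, and the function name `simulate_group` are all hypothetical and do not come from any real data.

```python
import random


def simulate_group(p_event, n_people, seed=0):
    """Draw one yes/no outcome per person, each with the same event probability."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.random() < p_event for _ in range(n_people)]


# Hypothetical high-risk group: everyone has a 0.9 probability of the event.
outcomes = simulate_group(0.9, 1000)
observed_share = sum(outcomes) / len(outcomes)

# The share for the whole group should land close to 0.9 ...
print(f"share of the group with the event: {observed_share:.3f}")
# ... while individual outcomes are simply True or False, and some are False.
print("first ten individuals:", outcomes[:10])
```

The group share stays near the stated probability, yet the model cannot say which particular individuals will be the exceptions.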
Hector
_____
From: Cardiff Tyke [mailto:
[hidden email]]
Sent: 23 January 2007 11:29
To: Hector Maletta;
[hidden email]
Subject: Re: Help with Binary Logistic Regression
Thanks for your help. Out of interest, what would be the best statistical
procedure to yield a predictor for individual cases? I thought BLR would be
the best for this sort of exercise.
----- Original Message ----
From: Hector Maletta <
[hidden email]>
To:
[hidden email]
Sent: Tuesday, 23 January, 2007 1:12:09 PM
Subject: Re: Help with Binary Logistic Regression
The prediction is made by SPSS using a cutoff point for the
predicted probability, which by default is 0.50. In other words, cases with
p>0.5 are predicted as Yes, and those with p up to 0.5 as No. Since your
overall proportion of Yes is 0.65, I guess your predictors do not move the
predicted probabilities far from that average, so most cases end up with
probabilities above 0.5 and are therefore predicted to suffer the event.
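A minimal sketch of that cutoff rule, written in Python rather than SPSS syntax. The probabilities below are made up to mimic a weak model whose predictions cluster around the 0.65 base rate, and the helper name `classify` is hypothetical.

```python
def classify(probabilities, cutoff=0.5):
    """Label a case "Yes" when its predicted probability exceeds the cutoff."""
    return ["Yes" if p > cutoff else "No" for p in probabilities]


# Invented predicted probabilities, all close to the 0.65 base rate.
predicted = [0.58, 0.61, 0.64, 0.66, 0.69, 0.72]

default_labels = classify(predicted)        # default cutoff of 0.5
shifted_labels = classify(predicted, 0.65)  # cutoff moved to the base rate

print(default_labels.count("Yes"), "of", len(predicted), "predicted Yes at cutoff 0.50")
print(shifted_labels.count("Yes"), "of", len(predicted), "predicted Yes at cutoff 0.65")
```

With the default cutoff every one of these cases is labelled Yes, even though the probabilities differ; moving the cutoff to 0.65 splits them.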
You may use syntax to change the cutoff point, if desired (putting
it at 0.65, for instance, or using ROC curves first to find the most
suitable cutoff point). However, Logistic Regression is better used as an
analytical tool and as a predictor of sample or population proportions than
as a predictor for individual cases, since it is based on probability
distributions which leave ample room for random effects. A particular case,
even with a high predicted probability, may well avoid suffering the event
(think of all those heavy smokers who live to 90), while others with a low
probability suffer it anyway (lean non-smokers who exercise and still have
a heart attack at 50).
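One way to look for a "most suitable" cutoff from a ROC curve is sketched below in Python. It rests on assumptions that are not in the message above: the criterion used is Youden's J (true positive rate minus false positive rate), which is only one of several common choices, and the outcomes, probabilities, and function names are invented for illustration.

```python
def roc_points(y_true, probabilities):
    """Return (cutoff, true_positive_rate, false_positive_rate) per candidate cutoff.

    Assumes y_true contains at least one 1 and at least one 0.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    # Candidate cutoffs: 0.0 (everyone Yes) plus every observed probability.
    for cutoff in [0.0] + sorted(set(probabilities)):
        # Same rule as the SPSS cutoff: predict Yes when p > cutoff.
        tp = sum(1 for y, p in zip(y_true, probabilities) if y == 1 and p > cutoff)
        fp = sum(1 for y, p in zip(y_true, probabilities) if y == 0 and p > cutoff)
        points.append((cutoff, tp / positives, fp / negatives))
    return points


def best_cutoff(y_true, probabilities):
    """Pick the cutoff that maximises Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(y_true, probabilities), key=lambda pt: pt[1] - pt[2])[0]


# Invented data: 1 = Yes, 0 = No, with hypothetical predicted probabilities.
observed = [0, 0, 1, 0, 1, 1, 1, 1]
predicted = [0.55, 0.60, 0.62, 0.64, 0.66, 0.70, 0.72, 0.75]

chosen = best_cutoff(observed, predicted)
print("cutoff chosen by Youden's J:", chosen)
```

A cutoff chosen this way balances the two kinds of misclassification for the sample as a whole; it does not remove the uncertainty about any single case.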
Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:
[hidden email]] On behalf of
Cardiff Tyke
Sent: 23 January 2007 08:32
To:
[hidden email]
Subject: Help with Binary Logistic Regression
All,
I'm a newcomer to SPSS (hence, I don't understand most of the
questions posted to this mailing list!) and I would like to ask members for
help with a simple problem.
I'm currently running a binary logistic regression procedure to try
and predict a simple yes or no response from subjects. I have roughly 20000
records which each have around 30 variables. The yes:no split of existing
records is currently around 65%:35%.
However, when I run the SPSS regression procedure, the model
predicts that all respondents will return a "yes" answer. I realise that
this could be purely down to having a set of unpredictive variables
(although I hope not), but I'd like to rule out all potential human (i.e.
my own) error before I jump to any conclusions.
Can anyone offer some simple assistance to help me to get to the
bottom of this problem?
Thanks,
JC