Problem with Binary Logistic regression.

jaern
Hello, this is a newbie question, so please bear with me.

I have a dataset and I need to perform a binary logistic regression.
The dependent variable is nominal (1, 2) but also has an unknown value (0). In the define data properties window, the unknown value is defined as missing.
I need to perform a logistic regression with 2 scale variables and 1 nominal variable as covariates. I defined the nominal one as categorical in the logistic regression dialog box.

Now, the problem is that whatever I do (change the method, enter only one covariate, etc.), all the predicted values turn out as 1s, i.e. the model does not seem to compute properly.

I tried to delete the missing value (0) in case that was the problem, but I get the same thing. Only if I perform a multinomial regression do the results end up more 'normal', let's say. I also tried to recode the dependent variable into one that has (0,1), but the same thing still happens.

Now, is there something I can do, like a test or anything? Something to check the integrity of my dataset? Or is it a hidden choice in some sub-menu which eludes me?

Thanks in advance!
Re: Problem with Binary Logistic regression.

Hector Maletta

Logistic regression does not predict VALUES of the dependent variable in individual cases. It just estimates the probability of a value (or the odds of a value against another). You are confused by one by-product of the SPSS logistic regression procedure, the so-called “classification table”. This table results from betting that all individuals with a probability over 50% will have the event, while all individuals with a probability under 50% will not. In fact, even in a collection of individuals with probabilities well above 50%, say about 90%, a certain (complementary) percentage of cases, here about 10%, will NOT suffer the event in question. Conversely, even in a collection of individuals with a very low probability, say 10%, some will nonetheless have the outcome in question.
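A minimal Python sketch of this point (not SPSS output; the probability of 0.9, the group size and the random seed are arbitrary assumptions). It simulates a group whose members all share the same event probability and counts how many actually get the event:

```python
import random

def simulate_group(p, n, seed=0):
    """Draw n independent 0/1 outcomes, each with event probability p."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

# Hypothetical group: everyone has an estimated probability of 0.9.
outcomes = simulate_group(p=0.9, n=100_000)
share_with_event = sum(outcomes) / len(outcomes)
share_without_event = 1 - share_with_event

print(f"with event:    {share_with_event:.3f}")
print(f"without event: {share_without_event:.3f}")
```

A classification table with a 0.5 cutoff would label every member of this group as "event", yet roughly one in ten does not have it. The probability describes the group, not any single case.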

 

Probability can be interpreted as (a) the expected proportion of cases having the event in a large collection of cases; or (b) a degree of belief that a certain case will turn out to have the event. In the former definition, probability cannot be predicated of individual cases. In the latter, the probability is NOT about the (future) state of the world but about your BELIEF in a certain future state of the world. At any rate, whatever outcome is achieved for a given individual, this individual observation cannot disprove the prediction. You may be an obese sedentary chain smoker with high blood pressure, and still live to be 90, while your lean energetic non-smoking neighbor with pleasantly low blood pressure suddenly dies from a heart attack or a stroke at the tender age of 35. Neither individual outcome disproves the high risk of strokes or heart attacks for obese sedentary smokers with high blood pressure, nor the correspondingly low risk for the opposite kind of fellow. But by betting on the earlier death of one relative to the other you will probably win IN THE LONG RUN (i.e. over a large number of individuals of both groups).

 

The most important result to look up in your logistic regression output, IMHO, is the Hosmer-Lemeshow test, which compares the observed proportion of events with the expected proportion in successive deciles of increasing risk. Normally, the two (expected and observed proportions) should not differ by much. The Hosmer-Lemeshow test can be read in two versions: the summary chi-square (the lower the better) or the observed vs expected proportions in the ten risk deciles. Both are produced by SPSS.
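For readers who want to see the arithmetic behind that comparison, here is a rough Python sketch of the Hosmer-Lemeshow chi-square. It is a simplified illustration, not SPSS's implementation: the function name, the equal-sized grouping and the toy data are assumptions, and the toy example uses 2 groups rather than 10 only to keep the numbers easy to check by hand.

```python
def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return (chi_square, table) for the Hosmer-Lemeshow comparison.

    table holds one (n, observed_events, expected_events) row per risk group.
    """
    pairs = sorted(zip(probs, outcomes))   # order cases by predicted risk
    n_total = len(pairs)
    chi_square = 0.0
    table = []
    for g in range(groups):
        # Slice the sorted cases into (nearly) equal-sized groups.
        chunk = pairs[g * n_total // groups:(g + 1) * n_total // groups]
        if not chunk:
            continue
        n = len(chunk)
        observed = sum(y for _, y in chunk)   # events that actually happened
        expected = sum(p for p, _ in chunk)   # events the model expects
        mean_p = expected / n
        # Events and non-events combined: (O - E)^2 / (n * p * (1 - p)).
        denom = n * mean_p * (1 - mean_p)
        if denom > 0:
            chi_square += (observed - expected) ** 2 / denom
        table.append((n, observed, expected))
    return chi_square, table

# Toy data: 10 low-risk cases with 1 event, 10 high-risk cases with 9 events.
outcomes = [0] * 9 + [1] + [1] * 9 + [0]
good_probs = [0.1] * 10 + [0.9] * 10   # expected events match observed events
flat_probs = [0.5] * 20                # ignores the difference in risk

chi_good, _ = hosmer_lemeshow(good_probs, outcomes, groups=2)
chi_flat, _ = hosmer_lemeshow(flat_probs, outcomes, groups=2)
print(round(chi_good, 6), round(chi_flat, 6))
```

The well-calibrated probabilities give a chi-square of essentially zero, while the flat 0.5 probabilities give a large value, which is the sense in which lower is better.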

 

Hector

 

Re: Problem with Binary Logistic regression.

Bruce Weaver
Administrator
In reply to this post by jaern
For a binary outcome variable, the LOGISTIC REGRESSION and NOMREG commands produce the same model (although you may have to change the default setting for the reference category on one of them). So, I think you need to post your syntax for both models. If you don't know how to paste syntax, see the tutorial here:
 
   http://www.cst.cmich.edu/users/lee1c/spss/spss_syntax_editor.htm
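To see why the reference category matters when comparing the two outputs, here is a small Python sketch with hypothetical counts (it does not reproduce either SPSS procedure). For a single binary predictor, the logistic regression coefficient equals the log odds ratio from the 2x2 table, so it can be computed by hand for each choice of reference outcome:

```python
import math

def log_odds_ratio(table, event):
    """Log odds ratio of `event` for cases with x = 1 versus x = 0.

    table[x][outcome] holds counts; outcomes are coded 1 and 2.
    """
    other = 2 if event == 1 else 1
    odds_x1 = table[1][event] / table[1][other]
    odds_x0 = table[0][event] / table[0][other]
    return math.log(odds_x1 / odds_x0)

# Hypothetical counts: outer key is predictor x (0 or 1), inner key is outcome (1 or 2).
counts = {0: {1: 40, 2: 10}, 1: {1: 20, 2: 30}}

b_modelling_2 = log_odds_ratio(counts, event=2)  # outcome 1 is the reference
b_modelling_1 = log_odds_ratio(counts, event=1)  # outcome 2 is the reference
print(round(b_modelling_2, 4), round(b_modelling_1, 4))
```

The two coefficients have the same size and opposite signs. It is the same model read from two directions, so it is worth checking in each output which outcome category is being treated as the reference.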

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: Problem with Binary Logistic regression.

Hector Maletta
In reply to this post by jaern

In addition to my previous comment:

 

If your dependent variable is binary, it does not matter whether it is coded (1,2) or (0,1) or whatever, since the two values are treated as two separate outcomes. The missing value is probably best left alone: it means no information is available for a particular case; in most situations such cases should be excluded from the analysis. If, on the contrary, the 0 value is to be considered as one of the possible outcomes, then your option is to use multinomial logistic regression, which will give you the probability that each individual falls within each of the categories (say 0, 1 or 2).
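As a rough illustration of the first option (a Python sketch with made-up data, not what SPSS does internally; the variable names `y` and `x` and the missing code 0 are assumptions taken from the question), cases with a missing outcome are dropped and the two remaining codes are mapped to 0/1:

```python
MISSING_CODE = 0  # assumption: 0 is the user-missing code for the outcome

def prepare_outcome(rows, outcome_key="y"):
    """Drop cases with a missing outcome and recode the two remaining values to 0/1."""
    complete = [row for row in rows if row[outcome_key] != MISSING_CODE]
    # Unpacking fails loudly unless exactly two outcome values remain.
    low, high = sorted({row[outcome_key] for row in complete})
    return [dict(row, **{outcome_key: 0 if row[outcome_key] == low else 1})
            for row in complete]

# Hypothetical cases: the second one has an unknown outcome.
cases = [{"y": 1, "x": 3.2}, {"y": 0, "x": 1.1}, {"y": 2, "x": 4.8}, {"y": 2, "x": 2.0}]
prepared = prepare_outcome(cases)
print(prepared)
```

Whether the two outcomes are labelled (1,2) or (0,1) only changes the labels; the same three complete cases enter the analysis either way.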

The categorical variables among your predictors may have several categories each, and in that case they are routinely converted into a series of dummies during the procedure. One of the categories of each variable is taken as the reference category (you may choose which) and the effect of the other categories is measured against the effect of the reference category.
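The dummy conversion described above can be sketched in Python as follows. This is a hand-rolled illustration of indicator coding with a hypothetical `region` variable, not the SPSS procedure itself:

```python
def dummy_code(values, reference):
    """Turn a categorical variable into 0/1 indicator columns.

    The reference category gets no column of its own; a case in the
    reference category has 0 in every indicator.
    """
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Hypothetical nominal predictor with three categories.
region = ["north", "south", "east", "south", "north"]
dummies = dummy_code(region, reference="north")
for level, column in dummies.items():
    print(level, column)
```

With "north" as the reference, the coefficient for each remaining dummy is read as the effect of that category compared with "north".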

 

Hector

 


Re: Problem with Binary Logistic regression.

jaern
Thank you for your quick and informative reply.
I think that I will be able to continue without major problems.
The only thing is that when I performed a multinomial logistic regression, for some reason the 'predicted values' for the group membership ended up being almost identical to the values of the variable for each case (perhaps 90% identical), while in binary it's all 0s. That led me to suspect that there was something wrong with the regression procedure.

Vagelis S.



Re: Problem with Binary Logistic regression.

Hector Maletta

If the dependent and independent variables were the same in both cases (and the zero was missing in the dependent variable in both cases), the two procedures are equivalent and their results should not differ. However, for binary dependents the appropriate tool is (binary) logistic regression.

 


Re: Problem with Binary Logistic regression.

jaern
That makes sense. The two procedures are identical as far as the variables go. Actually, I tried virtually every option available in the binary logistic regression dialog box, but couldn't find a way to rectify this. Is there some way to find out what goes wrong in the binary logistic regression?



Re: Problem with Binary Logistic regression.

Hector Maletta

The classification table predicts a value of 1 when the individual estimated probability is above a critical level. The default critical level is 0.5, although users may choose some other critical level (something not advisable: see below). Perhaps, inadvertently, the two procedures used different critical levels to “predict” the outcome in individual cases.

 

Now, why might using other critical values be misleading?

 

Predicting “1” when the probability is over 0.5, and “0” when it is below, seems a reasonable choice when the average probability for a randomly chosen individual is around 0.5 (say, within 0.4-0.6), but not when the average probability is more extreme (close to zero or close to one). If the overall probability is, say, 0.8, individual probabilities will probably be close to that average, and perhaps most individuals would be above 0.5 as well; in such a situation, SPSS would (wrongly) predict the event (p>0.5) for nearly everyone, when in fact the event did not occur in many cases. The opposite would happen with a low p, say p=0.10: individual probabilities would be clustered around 0.10 and few (if any) would be above 0.5, and therefore SPSS would (wrongly) “predict” that nobody gets the event (zero for nearly everyone).
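A short Python sketch of the low-probability case (the fitted probabilities and outcomes below are invented for illustration, and the helper names are hypothetical). It applies the 0.5 cutoff the way a classification table does:

```python
def classify(probs, cutoff=0.5):
    """Predict 1 when the estimated probability exceeds the cutoff, else 0."""
    return [1 if p > cutoff else 0 for p in probs]

def percent_correct(predicted, actual):
    """Share of cases whose predicted category matches the observed one."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

# Hypothetical fitted probabilities clustered near a low base rate of about 0.1.
probs = [0.04, 0.06, 0.08, 0.09, 0.10, 0.11, 0.12, 0.14, 0.18, 0.30]
actual = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

predicted = classify(probs)                  # default cutoff of 0.5
print(predicted)                             # every case is classified as 0
print(percent_correct(predicted, actual))    # 0.9
```

The model here ranks the one actual event highest (0.30), so it is not useless, yet every predicted category is 0 because no probability reaches 0.5. A column of identical predicted categories is therefore not, by itself, evidence that the procedure failed.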

 

Instead of using a default critical level of 0.5, one may be tempted to use some “more realistic” critical value. The obvious choice would be using the average probability (say 0.8 or 0.10) as the critical value, but this also leads to error or to paradoxical results. Suppose the average probability is 0.1 and this value is used as the threshold for predicting the event; everyone with p>0.10 would be predicted to get the outcome. Then individuals with probabilities as low as 0.11 would get a “predicted” value of 1 (because they are above the critical value 0.1) in spite of overwhelming odds (89 to 11) that the event would not happen to them. In the opposite case, with p=0.8, someone with p=0.79 would be predicted not to undergo the event, although in fact the occurrence of the event is the most likely result (odds are 79:21). Now, come to think of it, a threshold of 0.5 is ALSO potentially misleading, although more neutrally positioned, when you are trying to predict individual outcomes.
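The two paradoxical cases can be checked with a few lines of Python (a sketch using the numbers from the paragraph above; the function names are hypothetical):

```python
def odds_against(p):
    """Odds that the event does NOT happen, given event probability p."""
    return (1 - p) / p

def predicted_category(p, cutoff):
    """1 if the probability exceeds the cutoff, else 0."""
    return 1 if p > cutoff else 0

# Cutoff set at a low average probability of 0.10.
low_case = predicted_category(0.11, cutoff=0.10)
# Cutoff set at a high average probability of 0.80.
high_case = predicted_category(0.79, cutoff=0.80)

print(low_case, round(odds_against(0.11), 2))       # predicted 1, odds about 8 to 1 against
print(high_case, round(1 / odds_against(0.79), 2))  # predicted 0, odds about 3.8 to 1 in favor
```

In both cases the predicted category contradicts the more likely outcome, which is why moving the cutoff to the average probability does not rescue individual-level prediction.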

 

Hector

 


Re: Problem with Binary Logistic regression.

jaern
The classification cutoff in both cases is at 0.5. I have to say I was tempted to change this, but it seemed far too arbitrary a measure because, of course, as you described, if one puts it at, say, .10, then probabilities as low as .11 would end up with an individual predicted category of '1', which would be quite extreme by any measure.

Vagelis


From: Hector Maletta <[hidden email]>
To: 'Vagelis S.' <[hidden email]>; [hidden email]
Sent: Saturday, October 1, 2011 9:36 PM
Subject: RE: Problem with Binary Logistic regression.

The classification table predicts a value of 1 when the individual estimated probability is above a critical level. The default critical level is 0.5, although users may choose some other critical level (something not advisable: see below). Perhaps you were using different critical levels to “predict” individual outcomes. Perhaps inadvertently the two procedures used different critical values for prediction of the outcome in individual cases.
 
Now, why using other such critical values may be misleading?
 
Predicting “1” when the probability is over 0.5, and “0” when it is below, seems a reasonable choice when the average probability for a randomly chosen individual is around 0.5 (say, within 0.4-0.6), but not when the average probability is more extreme (close to zero or close to one). If the overall probability is, say, 0.8, individuals probabilities will be probably about that average, and perhaps most individuals would be above 0.5 as well; in such situation, SPSS would (wrongly) predict the event (p>0.5) for nearly everyone, when in fact the event did not occur in many cases. The opposite would happen with a low p, say p=0.10: individual probabilities would be clustered around 0.10 and few (if any) would be above 0.5, and therefore SPSS would (wrongly) “predict” that nobody gets the event (zero for nearly everyone).
 
Instead of using a default critical level of 0.5, one may be tempted some “more realistic” critical value. The obvious choice would be using the average probability (say 0.8 or 0.10) as the critical value, but this also leads to error or to paradoxical results. Suppose the average probability is 0.1 and this value is used as the threshold for predicting the event; everyone with p>0.10 would be predicted to get the outcome. Then individuals with probabilities as low as 0.11 would get a “predicted” value of 1 (because they are above the critical value 0.1) in spite of overwhelming odds (89 to 11) that the event would not happen to them. In the opposite case, with p=0.8, someone with p=0.79 would be predicted not to undergo the event, although in fact the occurrence of the event is the most likely result (odds are 79:21). Now, come to think of it, a threshold of 0.5 is ALSO potentially misleading, although more neutrally positioned, when you are trying to predict individual outcomes.
 
Hector
 
De: SPSSX(r) Discussion [mailto:[hidden email]] En nombre de Vagelis S.
Enviado el: Saturday, October 01, 2011 16:01
Para: [hidden email]
Asunto: Re: Problem with Binary Logistic regression.
 
That makes sense. The two procedures are identical as far as the variables go. Actually, I tried virtually every option available in the binary logistic regression dialog box, but couldn't find a way to rectify this. Is there some way to find out what goes wrong in the binary logistic aggression?
 

From: Hector Maletta <[hidden email]>
To: 'Vagelis S.' <[hidden email]>; [hidden email]
Sent: Saturday, October 1, 2011 8:45 PM
Subject: RE: Problem with Binary Logistic regression.
If the dependent and independent variables were the same in both cases (and the zero was missing in the dependent variable in both cases), the two procedures are the same and should not differ. However, for binary dependents the appropriate tool is (binary) logistic regression.
 
From: Vagelis S. [mailto:[hidden email]]
Sent: Saturday, October 01, 2011 15:40
To: Hector Maletta; [hidden email]
Subject: Re: Problem with Binary Logistic regression.
 
Thank you for your quick and informative reply.
I think that I will be able to continue without major problems.
The only thing is that when I performed a multinomial logistic regression, for some reason the 'predicted values' for the group membership ended up being almost identical to the values of the variable for each case (perhaps 90% identical), while in binary it's all 0s. That led me to suspect that there was something wrong with the regression procedure.
 
Vagelis S.
 

From: Hector Maletta <[hidden email]>
To: 'Vagelis S.' <[hidden email]>; [hidden email]
Sent: Saturday, October 1, 2011 7:50 PM
Subject: RE: Problem with Binary Logistic regression.
In addition to my previous comment:
 
If your dependent variable is binary, it does not matter whether it is coded (1,2) or (0,1) or whatever, since the two values are treated as two separate outcomes. The missing value is probably best to be left alone: it means no information is available for a particular case; in most situations such cases should be excluded from the analysis. If, on the contrary, the 0 value is to be considered as one of the possible outcomes, then your option is to use multinomial logistic regression, which will give you the probability that each individual falls within each of the categories (say 0, 1 or 2).
The categorical variables among your predictors may have several categories each, and in that case they are routinely converted into a series of dummies during the procedure. One of the categories of each variable is taken as the reference category (you may choose which) and the effect of the other categories is measured against the effect of the reference category.
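
In syntax, these two points might look roughly like the sketch below. The variable names (dv, x1, x2, grp) are hypothetical, and choosing the first category as the reference is just an example.

```spss
* Hypothetical names: dv is the dependent, x1 and x2 are scale, grp is nominal.
* Declare the "unknown" code 0 as user-missing so those cases are excluded.
MISSING VALUES dv (0).

* Check that only the two valid categories remain.
FREQUENCIES VARIABLES=dv.

* grp is converted into indicator (dummy) variables.
LOGISTIC REGRESSION VARIABLES dv
  /METHOD=ENTER x1 x2 grp
  /CONTRAST (grp)=INDICATOR(1).
```

INDICATOR(1) takes the first category of grp as the reference; without the number, the last category is used.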
 
Hector
 



Re: Problem with Binary Logistic regression.

Bruce Weaver
Administrator
In reply to this post by jaern
Perhaps you missed my earlier suggestion that you post your syntax for both LOGISTIC REGRESSION and NOMREG.  Having the syntax will give people a fighting chance of diagnosing the problem.



jaern wrote
That makes sense. The two procedures are identical as far as the variables go. Actually, I tried virtually every option available in the binary logistic regression dialog box, but couldn't find a way to rectify this. Is there some way to find out what goes wrong in the binary logistic regression?


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Problem with Binary Logistic regression.

Ryan
Agreed. The OP should post the syntax. With that said, BELOW is some
syntax that generates data from a logistic regression equation and
then fits the same model on the simulated data using three procedures:
LOGISTIC, NOMREG, and GENLIN. All three procedures seem to produce
identical results.

HTH,

Ryan

--

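* Generate data from a known logistic regression model.

SET SEED 98765432.
NEW FILE.

INPUT PROGRAM.

LOOP ID = 1 TO 10000.

    * Two binary factors, each equal to 1 with probability 0.5.
    COMPUTE FactorA = RV.BERNOULLI(0.5).
    COMPUTE FactorB = RV.BERNOULLI(0.5).

    * True coefficients: intercept, two main effects, interaction.
    COMPUTE b0 = -1.5.
    COMPUTE b1 = 0.9.
    COMPUTE b2 = 0.5.
    COMPUTE b3 = 1.2.

    * Linear predictor and the probability it implies (inverse logit).
    COMPUTE eta  = b0 + b1*FactorA + b2*FactorB + b3*FactorA*FactorB.
    COMPUTE prob = EXP(eta) / (1 + EXP(eta)).

    * Draw the binary outcome with that probability.
    COMPUTE y = RV.BERNOULLI(prob).

    END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

* Drop the generating quantities so only ID, the factors and y remain.
DELETE VARIABLES b0 b1 b2 b3 eta prob.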

LOGISTIC REGRESSION VARIABLES y
  /METHOD=ENTER FactorA FactorB FactorA*FactorB
  /PRINT=CI(95)
  /SAVE=PRED PGROUP.

NOMREG y (BASE=FIRST ORDER=ASCENDING) WITH FactorA FactorB
  /MODEL=FactorA FactorB FactorA*FactorB
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER SUMMARY LRT CPS STEP MFI
  /SAVE ESTPROB PREDCAT.

GENLIN y (REFERENCE=FIRST) WITH FactorA FactorB
  /MODEL FactorA FactorB FactorA*FactorB INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOGIT
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED)
  /SAVE MEANPRED PREDVAL.
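
As a rough check tied to the cutoff discussion earlier in the thread: the generating equation above implies probabilities of about 0.18 (FactorA=0, FactorB=0), 0.35 (FactorA=1, FactorB=0), 0.27 (FactorA=0, FactorB=1) and 0.75 (FactorA=1, FactorB=1). A minimal sketch, not part of the original syntax, to compare those with the observed proportions in the simulated data:

```spss
* Observed proportion of y=1 in each FactorA by FactorB cell.
* The mean of a 0/1 variable is the proportion of 1s.
MEANS TABLES=y BY FactorA BY FactorB
  /CELLS=MEAN COUNT.
```

If the cell means come out near those values, only the FactorA=1, FactorB=1 cell is above 0.5, so with the default cutoff the predicted group should be 1 for that cell alone, even though y=1 occurs in every cell.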

On Sat, Oct 1, 2011 at 6:08 PM, Bruce Weaver <[hidden email]> wrote:

> Perhaps you missed my earlier suggestion that you post your syntax for both
> LOGISTIC REGRESSION and NOMREG.  Having the syntax will give people a
> fighting chance of diagnosing the problem.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD