GENLIN – Binary logistic model

GENLIN – Binary logistic model

Firas Asad
Hello everyone,

I am trying to examine the effect of people’s annual income (income) on the model of the laptop computer (laptop) they bought within the last year.

I used the GENLIN procedure with a binary logistic model in which the response is the model of the laptop (PC vs. Mac) and the predictor is annual income (scale level). The reference category is PC. I kept the default values for all other options under the tabs of the GENLIN main window.

My sample size is 329: 182 people bought a PC and 147 bought a Mac. Annual income ranges from £5300 to £78230, with a mean of £19560.

To get an initial clue, I ran a t-test, which showed a significant difference in mean annual income between the two groups of buyers. Those who bought Macs had higher incomes than those who bought PCs.

Moving on to the next step, I ran the GENLIN procedure. The results were quite strange and confusing to me. The odds ratio (Exp(B)) associated with the income variable is 1.00, with a significant Wald chi-square (p-value = 0.00). According to my understanding, this means that income has no impact on the odds of buying a specific computer type. This contradicts the t-test results.

I tried two things to sort this out. First, I categorised the originally continuous income variable into 5 categories. Re-running GENLIN gave quite logical and often significant results: the odds of having bought a Mac increase with income. For example, the highest income category is associated with an odds ratio of about 134.

Second, I divided the original quantitative income variable by 1000 (I just thought this might help). Re-running GENLIN gave slightly better results than the first run (where the OR was 1.00). The odds ratio for the income variable is now 1.130 and significant.

Having said that, my first and main question is why transforming the income variable from continuous to multi-categorical substantially altered the results. The second, related question is why dividing income by 1000 also changed the results slightly.

I am not sure whether the issues noted above are specific to the algorithm of the GENLIN procedure in SPSS, or whether I am just missing something about the overall philosophical/mathematical methodology of Generalised Linear Models.
 
Any little help would be highly appreciated.

Regards and many thanks in advance,

Firas




Re: GENLIN – Binary logistic model

Bruce Weaver
Administrator
Hello Firas.  First, it would be helpful to see your GENLIN syntax.  Second, when you have a continuous predictor variable such as income, Exp(B) gives the odds ratio associated with a one unit increase in that variable.  In other words, you were getting the odds ratio for a one Pound increase in income.  When you divided income by 1000, you got the odds ratio for a 1000 Pound increase in income.  In short, it's up to you to decide what increment in income is sensible to look at.  One pound is clearly too low.  Given that your range of incomes spans nearly 73,000, maybe the odds ratio for a 5000 Pound increase would be sensible.  
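To make the rescaling concrete, here is a small Python sketch (not SPSS syntax). It starts from the Exp(B) of 1.130 per £1000 reported in the original post and converts it to other increments. The helper name `rescale_odds_ratio` is made up for illustration, and the sketch assumes the usual logistic model in which the log odds are linear in income.

```python
import math

def rescale_odds_ratio(odds_ratio, old_unit, new_unit):
    """Convert an odds ratio per `old_unit` of a predictor to one per `new_unit`.

    The coefficient B is the change in log odds per unit of the predictor,
    so the log odds ratio scales in proportion to the size of the increment.
    """
    b_per_old_unit = math.log(odds_ratio)
    return math.exp(b_per_old_unit * new_unit / old_unit)

# Exp(B) = 1.130 per 1000 pounds is the value reported in the thread.
or_per_1000 = 1.130

or_per_pound = rescale_odds_ratio(or_per_1000, old_unit=1000, new_unit=1)
or_per_5000 = rescale_odds_ratio(or_per_1000, old_unit=1000, new_unit=5000)

print(f"per 1 pound (6 decimals): {or_per_pound:.6f}")
print(f"per 1 pound (2 decimals): {or_per_pound:.2f}")
print(f"per 5000 pounds:          {or_per_5000:.3f}")
```

The per-pound odds ratio is only slightly above 1, so it displays as 1.00 at two decimals even though the underlying effect is the same one that gives 1.130 per £1000.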

Carving continuous variables into categories is usually a bad idea, because you lose power and eat up degrees of freedom.  I think the only reason I would carve income into categories would be to do some preliminary exploratory analyses into whether the relationship is linear in the logit.  If those analyses suggested some non-linear function, then I would include appropriate polynomial terms (when using the continuous variable once again), or maybe look at using a spline.
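One possible way to do that kind of exploratory check is sketched below in Python (not SPSS syntax). It groups cases into equal-count income bins and computes an empirical logit per bin; if the logits rise roughly in a straight line against the bin means, linearity in the logit looks plausible. The function name `empirical_logits`, the 0.5 continuity correction, and the data are all assumptions for illustration; the data are simulated, not the poster's.

```python
import math
import random

def empirical_logits(x_values, y_values, n_bins=5):
    """Return (mean of x, empirical logit of y) for equal-count bins of x."""
    pairs = sorted(zip(x_values, y_values))
    size = len(pairs) // n_bins
    rows = []
    for i in range(n_bins):
        start = i * size
        # The last bin absorbs any leftover cases.
        end = (i + 1) * size if i < n_bins - 1 else len(pairs)
        chunk = pairs[start:end]
        n = len(chunk)
        events = sum(y for _, y in chunk)
        mean_x = sum(x for x, _ in chunk) / n
        # Adding 0.5 to each cell keeps the logit finite if a bin is all 0s or all 1s.
        logit = math.log((events + 0.5) / (n - events + 0.5))
        rows.append((mean_x, logit))
    return rows

# Simulated data only: income in thousands, outcome drawn from a model
# that is linear in the logit (intercept and slope are made-up values).
random.seed(1)
income_k = [random.uniform(5.3, 78.2) for _ in range(329)]
bought_mac = [
    1 if random.random() < 1 / (1 + math.exp(-(-2.0 + 0.05 * x))) else 0
    for x in income_k
]

for mean_x, logit in empirical_logits(income_k, bought_mac):
    print(f"mean income {mean_x:6.1f}k   empirical logit {logit:6.2f}")
```

The bins are used only for looking at the shape of the relationship; the model itself would still be fitted with income as a continuous predictor.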

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).


Re: GENLIN – Binary logistic model

Firas Asad
In reply to this post by Bruce Weaver
Dear Bruce,

Thank you so much for your quick, clear and helpful reply.
It works; the model results sound logical now. I will consider your advice about recoding continuous variables into multiple categories in future analyses.

Kind regards,

Firas

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/GENLIN-Binary-logistic-model-tp5716008p5716010.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Re: GENLIN – Binary logistic model

Richard Ristow
In reply to this post by Bruce Weaver
At 05:34 PM 11/1/2012, Bruce Weaver wrote:

>The only reason I would carve income into categories would be for
>preliminary analyses into whether the relationship is linear in the
>logit. If those analyses suggested some non-linear function, then I
>would include polynomial terms [in the continuous variable], or
>maybe look at using a spline.

You might, in particular, try a log transformation of income. Log
transformations are often appropriate in two situations: first, when
the outcome is the result of human perception; and second, when the
ratio of largest to smallest observed values is large (in this case,
something over an order of magnitude).

Then, you're looking for the effect of a certain *fractional* change
in income. What scale should you use? I'd consider one where a unit
change in the scale corresponds to a ratio of 1.2, or a 20% change in
income -- which is also about the 4th root of a doubling of income
(1.2^4 is about 2.07).
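A minimal Python sketch of such a scale (not SPSS syntax) is shown below. It takes the logarithm of income to base 1.2, so that a one-unit step always means a 20% increase. The function name `log_income_scale` is hypothetical; the two incomes are the minimum and maximum quoted in the original post.

```python
import math

RATIO_PER_UNIT = 1.2  # one unit on the new scale = a 20% increase in income

def log_income_scale(income, ratio=RATIO_PER_UNIT):
    """Log-transform income so that +1 on the result means income * ratio."""
    return math.log(income) / math.log(ratio)

low = log_income_scale(5300)     # smallest income reported in the thread
high = log_income_scale(78230)   # largest income reported in the thread

# A 20% raise moves anyone up by one unit, whatever the starting income.
step = log_income_scale(5300 * 1.2) - low

# Units needed for income to double: log(2) / log(1.2).
units_to_double = math.log(2) / math.log(RATIO_PER_UNIT)

print(f"scale range: {low:.1f} to {high:.1f}")
print(f"one 20% raise = {step:.3f} units")
print(f"doubling income = {units_to_double:.2f} units")
```

With a predictor on this scale, Exp(B) would be read as the odds ratio associated with a 20% increase in income rather than with a fixed number of pounds.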



RE: GENLIN – Binary logistic model

Rich Ulrich
In reply to this post by Firas Asad
An important thing to emphasize here is that you have read the output wrong.

The "significant Wald Chi-square (p-value = 0.00)" indicates that there *is* a large
effect for the prediction.  So the results do *not* contradict the t-test results.

Why is the OR 1.00?  That was explained by another poster -- it only looks
like "1.00" because the displayed value is rounded when your unit of money is
one pound.  It will be a bigger value if you restate income in 1000s or 5000s,
as he suggested.  I don't remember how clear he was about that implication.
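The arithmetic behind this can be sketched in a few lines of Python (not SPSS syntax). Rescaling income divides both B and its standard error by the same factor, so the Wald chi-square is unchanged while Exp(B) moves towards 1. The B value comes from the Exp(B) of 1.130 per £1000 reported in the thread; the standard error was not reported, so the 0.02 used here is a made-up number purely for illustration.

```python
import math

# B per 1000 pounds, from the Exp(B) of 1.130 reported in the thread.
b_per_1000 = math.log(1.130)
# The standard error was NOT reported; 0.02 is a made-up illustrative value.
se_per_1000 = 0.02

# Expressing income in single pounds divides both B and its SE by 1000.
b_per_pound = b_per_1000 / 1000
se_per_pound = se_per_1000 / 1000

# Wald chi-square for a single coefficient: (B / SE) squared.
wald_per_1000 = (b_per_1000 / se_per_1000) ** 2
wald_per_pound = (b_per_pound / se_per_pound) ** 2

print(f"Exp(B) per pound:       {math.exp(b_per_pound):.2f}")
print(f"Exp(B) per 1000 pounds: {math.exp(b_per_1000):.2f}")
print(f"Wald per pound: {wald_per_pound:.2f}   Wald per 1000 pounds: {wald_per_1000:.2f}")
```

Whatever the true standard error is, the two Wald statistics agree, which is why a "1.00" odds ratio and a highly significant test can appear side by side.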


--
Rich Ulrich


Date: Thu, 1 Nov 2012 13:30:21 -0700
From: [hidden email]
Subject: GENLIN – Binary logistic model
To: [hidden email]

Hello everyone,
[...snip]

Moving on to the next step, I ran the GENLIN procedure. The results were quite strange and confusing to me. The odds ratio (Exp(B)) associated with the income variable is 1.00, with a significant Wald chi-square (p-value = 0.00). According to my understanding, this means that income has no impact on the odds of buying a specific computer type. This contradicts the t-test results.

[snip rest]