
Re: Multinomial Logistic Regression - Category Size

Posted by Bruce Weaver on Dec 02, 2010; 6:55pm
URL: http://spssx-discussion.165.s1.nabble.com/Multinomial-Logistic-Regression-Category-Size-tp3286013p3289841.html

Fair enough, Ryan.  I've still not had time to look at the lecture notes you linked to, but will try to get to them sometime soon.  Meanwhile, I'll keep calling it an odds ratio.  ;-)

Cheers,
Bruce


R B wrote
Bruce,

I decided to calculate what you call "First OR" using the RRR formula
I presented previously. Lo and behold, our estimates are the same.

RRR1 = ((6 / (6 + 8 + 38)) / (38 / (6 + 8 + 38))) /
       ((51 / (51 + 42 + 518)) / (518 / (51 + 42 + 518)))
     = 1.6037151703

You can see in the formula above that I am calculating two relative
risk estimates and then dividing those two estimates. I am comfortable
interpreting this estimate as a relative risk ratio because of the way
I set up the equation. Interpretation of this estimate as an RRR is
common IMO. Having said that, I have also seen places where this
estimate is interpreted as an odds ratio.
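
A minimal sketch of why the two readings give the same number, assuming only the cell counts quoted in the formula above (the variable names a, b, c, d, rrr1 and or1 are illustrative and do not come from the thread). The row totals 52 and 611 cancel, so the ratio of the two relative risks reduces to the cross-product ratio of the 2x2 subtable, which is what CROSSTABS reports as an odds ratio once the cases are restricted to Y = 1 or 3.

```spss
* Hypothetical variable names; cell counts taken from the formula above.
data list free / a b c d.
begin data
6 38 51 518
end data.
* Ratio of two relative risks, with the row totals written out.
compute rrr1 = ((a / (6 + 8 + 38)) / (b / (6 + 8 + 38)))
             / ((c / (51 + 42 + 518)) / (d / (51 + 42 + 518))).
* Cross-product (odds) ratio of the same 2x2 subtable.
compute or1 = (a * d) / (b * c).
formats rrr1 or1 (f12.10).
list.
```

Both columns should show the 1.6037151703 reported above, since (6 * 518) / (38 * 51) is the same fraction once the totals are cancelled.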

Best wishes,

Ryan

On Thu, Dec 2, 2010 at 11:37 AM, Bruce Weaver <bruce.weaver@hotmail.com> wrote:
> Here is an example demonstrating equivalence of Exp(B) from NOMREG with odds
> ratios computed via CROSSTABS.
>
> Cheers,
> Bruce
>
> * Multinomial logistic regression on a 4x3 table.
> * The data are from
> http://www.angelfire.com/wv/bwhomedir/notes/multinomial_log_reg.pdf .
>
> data list list / X Y kount (3f5.0).
> begin data
> 1 1 6
> 1 2 8
> 1 3 38
> 2 1 13
> 2 2 29
> 2 3 55
> 3 1 20
> 3 2 33
> 3 3 160
> 4 1 51
> 4 2 42
> 4 3 518
> end data.
>
> var lab
>   x 'Functional status'
>   y 'ICU Code Status'
> .
> val lab
>   x 1 'Unknown'
>     2 'Severely limited'
>     3 'Somewhat limited'
>     4 'Totally independent' /
>   y 1 'Explicit: Resuscitate'
>     2 'Explicit: DNR'
>     3 'Implicit: Resuscitate'
> .
>
> weight by kount.
> crosstabs x by y .
>
> NOMREG Y (BASE=LAST ORDER=ASCENDING) BY X
>   /MODEL = X
>   /INTERCEPT=INCLUDE
>   /PRINT=PARAMETER SUMMARY LRT STEP MFI.
>
> * Notice that the Exp(B) values from this model match exactly
> * the odds ratios computed in the document.
>
> * Now compute the same odds ratios via CROSSTABS.
>
> * First OR.
> temporary.
> select if any(X,1,4) and any(Y,1,3).
> crosstabs x by y / stat = risk.
>
> * Second OR.
> temporary.
> select if any(X,2,4) and any(Y,1,3).
> crosstabs x by y / stat = risk.
>
> * Third OR.
> temporary.
> select if any(X,3,4) and any(Y,1,3).
> crosstabs x by y / stat = risk.
>
> * Fourth OR.
> temporary.
> select if any(X,1,4) and any(Y,2,3).
> crosstabs x by y / stat = risk.
>
> * Fifth OR.
> temporary.
> select if any(X,2,4) and any(Y,2,3).
> crosstabs x by y / stat = risk.
>
> * Sixth OR.
> temporary.
> select if any(X,3,4) and any(Y,2,3).
> crosstabs x by y / stat = risk.
>
> * Notice that the odds ratios & 95% confidence intervals
> * obtained via CROSSTABS match exactly the values of
> * Exp(B) from NOMREG and their 95% confidence intervals.
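
The six odds ratios can also be reproduced by hand from the cell counts, as a rough cross-check on the CROSSTABS runs above. This is only a sketch: the wide layout and the names n1, n2, n3, or_y1 and or_y2 are assumptions made for illustration, and the reference-row counts (51, 42 and 518 for X = 4) are typed in directly rather than read from the data.

```spss
* One row per non-reference level of X; n1 to n3 are the counts for Y = 1 to 3.
data list list / x n1 n2 n3.
begin data
1 6 8 38
2 13 29 55
3 20 33 160
end data.
* Odds of Y = 1 (or Y = 2) versus Y = 3, divided by the same odds at X = 4.
compute or_y1 = (n1 / n3) / (51 / 518).
compute or_y2 = (n2 / n3) / (42 / 518).
formats or_y1 or_y2 (f8.4).
list.
```

For X = 1, or_y1 is (6 / 38) / (51 / 518), about 1.6037, which is the "First OR" above. The three rows of or_y1 correspond to the first to third ORs and the three rows of or_y2 to the fourth to sixth.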
>
>
>
>
> R B wrote:
>>
>> Bruce et al.,
>>
>> I have little doubt that the parameter estimates obtained from a
>> generalized logits multinomial regression without any predictors are
>> log(relative risks) and, when predictors are in the model,
>> log(relative risk ratios). Allow me to provide a couple of simple
>> examples (without a predictor and with a dichotomous predictor) as
>> evidence in support of what I've stated. But before I do, let's make
>> sure we all agree on some basic definitions within a logistic
>> regression framework:
>>
>> Risk_A = probability of event A
>> Risk_B = probability of event B
>> Risk_C = probability of event C
>>
>> Relative Risk_A_B = Risk_A / Risk_B
>> Relative Risk_A_C = Risk_A / Risk_C
>> Relative Risk_B_C = Risk_B / Risk_C
>>
>> Odds_A = probability of event A / probability of not event A
>> Odds_B = probability of event B / probability of not event B
>> Odds_C = probability of event C / probability of not event C
>>
>> Odds Ratio_A_B = Odds_A / Odds_B
>> Odds Ratio_A_C = Odds_A / Odds_C
>> Odds Ratio_B_C = Odds_B / Odds_C
>>
>> Now, the first example below shows that the parameter estimates
>> obtained from a generalized logits multinomial regression model with
>> no predictors are equivalent to log(Relative Risks).
>>
>> data list list / Y.
>> begin data
>> 1
>> 1
>> 1
>> 2
>> 2
>> 2
>> 2
>> 2
>> 2
>> 3
>> 3
>> 3
>> 3
>> 3
>> 3
>> 3
>> 3
>> 3
>> end data.
>>
>> FREQUENCIES VARIABLES=Y
>> /ORDER=ANALYSIS.
>>
>> COMPUTE RR_1_3_raw = 16.666666666666664 / 50.
>> COMPUTE RR_2_3_raw = 33.33333333333333 / 50.
>> EXECUTE.
>>
>> NOMREG Y (BASE = LAST ORDER=ASCENDING)
>> /MODEL
>> /INTERCEPT=INCLUDE
>> /PRINT=PARAMETER .
>>
>> COMPUTE RR_1_3 = exp(-1.0986122886681096).
>> COMPUTE RR_2_3 = exp(-0.4054651081081645).
>> EXECUTE.
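
As a quick check on the two intercepts, assuming nothing beyond the category counts in the data above (3, 6 and 9 cases for Y = 1, 2 and 3), the log of each count ratio can be computed directly. The names b_1_3 and b_2_3 are illustrative only.

```spss
* Run on the active dataset above, or on any dataset with at least one case.
compute b_1_3 = ln(3 / 9).
compute b_2_3 = ln(6 / 9).
formats b_1_3 b_2_3 (f12.8).
execute.
```

Because the common denominator of 18 cancels, the ratio of two category probabilities equals the ratio of their counts, so these logs should agree with the NOMREG intercepts of about -1.0986 and -0.4055 quoted above.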
>>
>> It should be clear from the example above that the parameter estimates
>> can certainly be interpreted as log(Relative Risks). Those calculated
>> from the FREQUENCIES percentages using the definitional formulas are
>> exactly the same as those output from NOMREG. Now, let's add a
>> dichotomous predictor to the model to see what happens. I provide
>> further comments after this code.
>>
>> data list list / Y X.
>> begin data
>> 1 1
>> 1 1
>> 1 0
>> 2 1
>> 2 1
>> 2 1
>> 2 0
>> 2 1
>> 2 1
>> 3 1
>> 3 1
>> 3 0
>> 3 1
>> 3 1
>> 3 1
>> 3 0
>> 3 1
>> 3 1
>> end data.
>>
>> CROSSTABS
>>   /TABLES=Y BY X
>>   /FORMAT=AVALUE TABLES
>>   /CELLS=COUNT ROW COLUMN
>>   /COUNT ROUND CELL.
>>
>> COMPUTE RR_1_3_X0_Raw = (25 / 50) .
>> COMPUTE RR_1_3_X1_Raw = (14.285714285714285 / 50).
>> COMPUTE RRR_1_3_Raw = RR_1_3_X1_Raw / RR_1_3_X0_Raw.
>> EXECUTE.
>>
>> COMPUTE RR_2_3_X0_Raw = (25 / 50) .
>> COMPUTE RR_2_3_X1_Raw = (35.714285714285715 / 50).
>> COMPUTE RRR_2_3_Raw = RR_2_3_X1_Raw / RR_2_3_X0_Raw.
>> EXECUTE.
>>
>> NOMREG Y (BASE = LAST ORDER=ASCENDING) WITH X
>> /MODEL X
>> /INTERCEPT=INCLUDE
>> /PRINT=PARAMETER .
>>
>> COMPUTE RRR_1_3 = exp(-0.5596157879354635).
>> COMPUTE RRR_2_3 = exp(0.35667494393875165).
>> EXECUTE.
>>
>> Again, I calculated the estimates using the probability estimates from
>> CROSSTABS. Then I compared those estimates to exponentiated estimates
>> from NOMREG. As expected, they [relative risk ratios] are identical.
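
The same arithmetic can be written from the cell counts of the Y by X table rather than from the column percentages. This is a sketch with made-up variable names; the counts (Y = 1: 1 and 2 cases, Y = 2: 1 and 5, Y = 3: 2 and 7, for X = 0 and X = 1 respectively) are tallied from the data listed above.

```spss
* Hypothetical names: yK_xJ is the count of cases with Y = K and X = J.
data list free / y1_x0 y1_x1 y2_x0 y2_x1 y3_x0 y3_x1.
begin data
1 2 1 5 2 7
end data.
* Column totals cancel, leaving cross-product ratios of 2x2 subtables.
compute rrr_1_3 = (y1_x1 / y3_x1) / (y1_x0 / y3_x0).
compute rrr_2_3 = (y2_x1 / y3_x1) / (y2_x0 / y3_x0).
compute b_1_3 = ln(rrr_1_3).
compute b_2_3 = ln(rrr_2_3).
formats rrr_1_3 rrr_2_3 b_1_3 b_2_3 (f10.6).
list.
```

The ratios work out to 4/7 and 10/7, whose logs are the -0.5596 and 0.3567 reported by NOMREG. They are also the odds ratios of the 2x2 subtables restricted to Y in (1, 3) and Y in (2, 3), which is one way to see that the "odds ratio" and "relative risk ratio" labels refer to the same number here.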
>>
>> Ryan
>>
>> On Wed, Dec 1, 2010 at 10:07 PM, Bruce Weaver <bruce.weaver@hotmail.com> wrote:
>>> I responded to Ryan off-list to ask if he meant to say that the parameter
>>> estimates are interpreted as the change in the log(odds) relative to a
>>> reference category. He responded that he did indeed mean log(risk), not
>>> log(odds); and we have been having a vigorous back and forth discussion
>>> since, exchanging examples and links. Here is one link I sent to Ryan:
>>>
>>>     http://faculty.chass.ncsu.edu/garson/PA765/logistic.htm#estimates
>>>
>>> And here's one he just sent to me (which I've not read yet--it's bed time
>>> here).
>>>
>>>     http://www.columbia.edu/~so33/SusDev/Lecture_10.pdf
>>>
>>> Just thought I'd post this, in case anyone else was interested. I may have
>>> some more to say after reading that last document.
>>>
>>> Cheers,
>>> Bruce
>>>
>>>
>>> R B wrote:
>>>>
>>>> Stefan,
>>>>
>>>> What do you mean by the following statement?: "...if the overall model
>>>> is significant, I can conclude that there is a significant
>>>> relationship between the dependent and independent variable for all
>>>> categories of the dependent variable?"
>>>>
>>>> In the typical multinomial logistic regression assuming a single
>>>> continuous predictor, X, the parameter estimates are interpreted as
>>>> the change in the log(risk relative to the reference category), given
>>>> a one-point increase in X. The parameter estimates do NOT reflect a
>>>> change in the log(risk) of observing each category, given a one-point
>>>> increase in X. Moreover, it's certainly possible to observe a
>>>> non-significant log(relative risk) in the presence of an overall
>>>> significant model effect.
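
For reference, the model being described can be written out as follows. The notation is not from the thread: J denotes the reference category, and alpha_j and beta_j are the intercept and slope for category j.

```latex
\log\frac{P(Y=j \mid X=x)}{P(Y=J \mid X=x)} = \alpha_j + \beta_j x,
\qquad j = 1,\dots,J-1,
\qquad\text{so that}\qquad
\beta_j = \log\frac{P(Y=j \mid x+1)\,/\,P(Y=J \mid x+1)}
                   {P(Y=j \mid x)\,/\,P(Y=J \mid x)} .
```

So exp(beta_j) is a ratio of two ratios of probabilities. Because P(Y=j) / P(Y=J) is also the odds of category j among cases that fall in either j or J, the same quantity can be read as an odds ratio conditional on that pair of categories.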
>>>>
>>>> Ryan
>>>>
>>>> On Tue, Nov 30, 2010 at 7:49 AM, s-volk <stefan.volk@uni-tuebingen.de> wrote:
>>>>> Dear all,
>>>>>
>>>>> I ran a multinomial logistic regression analysis with one continuous
>>>>> independent variable. I have a sample size of 68 subjects (psychological
>>>>> experiment) which end up split into 5 categories ranging in size from 5 to
>>>>> 24 (dependent variable). The MLR-model has a Nagelkerke R2 of 0.27 and
>>>>> Model Fit χ2=19.71, p<0.01.
>>>>>
>>>>> Now here is the problem: A reviewer complains that my results may be
>>>>> sample specific because one of the 5 categories of the dependent variable
>>>>> consists of only 5 observations (subjects), i.e. s/he argues that very few
>>>>> participants (five) are responsible for the observed effects. Is this a
>>>>> valid argument? I thought that if the overall model is significant, I can
>>>>> conclude that there is a significant relationship between the dependent
>>>>> and independent variable for all categories of the dependent variable?
>>>>> That is, the calculations for the overall model are based on all
>>>>> observations (68) and not only on the observations in specific categories
>>>>> (e.g., 5)?
>>>>>
>>>>> I was wondering if someone could provide me with or point me to some
>>>>> arguments for reviewers (ideally including some references)?
>>>>>
>>>>> Many thanks in advance,
>>>>> Stefan
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://spssx-discussion.1045642.n5.nabble.com/Multinomial-Logistic-Regression-Category-Size-tp3286013p3286013.html
>>>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>>>>
>>>>> =====================
>>>>> To manage your subscription to SPSSX-L, send a message to
>>>>> LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except
>>>>> the
>>>>> command. To leave the list, send the command
>>>>> SIGNOFF SPSSX-L
>>>>> For a list of commands to manage subscriptions, send the command
>>>>> INFO REFCARD
>>>>>
>>>
>>>
>>> -----
>>> --
>>> Bruce Weaver
>>> bweaver@lakeheadu.ca
>>> http://sites.google.com/a/lakeheadu.ca/bweaver/
>>>
>>> "When all else fails, RTFM."
>>>
>>> NOTE: My Hotmail account is not monitored regularly.
>>> To send me an e-mail, please use the address shown above.
>>>
>>> --
>>> View this message in context:
>>> http://spssx-discussion.1045642.n5.nabble.com/Multinomial-Logistic-Regression-Category-Size-tp3286013p3288831.html
>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
>
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Multinomial-Logistic-Regression-Category-Size-tp3286013p3289655.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).