SPSSX Discussion - Re: Multinomial Logistic Regression

Re: Multinomial Logistic Regression - Category Size

Posted by Ryan on Dec 02, 2010; 4:33am
URL: http://spssx-discussion.165.s1.nabble.com/Multinomial-Logistic-Regression-Category-Size-tp3286013p3288876.html

Bruce et al.,

I have little doubt that the parameter estimates obtained from a
generalized logits multinomial regression without any predictors are
log(relative risks), and that once predictors are in the model, the
exponentiated slopes are relative risk ratios. Allow me to provide a
couple of simple examples (one without a predictor and one with a
dichotomous predictor) as evidence in support of what I've stated. But
before I do, let's make sure we all agree on some basic definitions
within a logistic regression framework:

Risk_A = probability of event A
Risk_B = probability of event B
Risk_C = probability of event C

Relative Risk_A_B = Risk_A / Risk_B
Relative Risk_A_C = Risk_A / Risk_C
Relative Risk_B_C = Risk_B / Risk_C

Odds_A = probability of event A / probability of not event A
Odds_B = probability of event B / probability of not event B
Odds_C = probability of event C / probability of not event C

Odds Ratio_A_B = Odds_A / Odds_B
Odds Ratio_A_C = Odds_A / Odds_C
Odds Ratio_B_C = Odds_B / Odds_C
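
To make the difference between these definitions concrete, here is a minimal Python sketch (Python is only a convenience for checking arithmetic outside SPSS). The three probabilities are hypothetical values chosen for illustration, and the function names are mine, not anything from SPSS.

```python
# Hypothetical probabilities for three mutually exclusive outcomes
# (an assumption for illustration; they must sum to 1).
risk = {"A": 1 / 6, "B": 1 / 3, "C": 1 / 2}


def relative_risk(p, q):
    """Risk of one category divided by the risk of another."""
    return p / q


def odds(p):
    """Probability of the event divided by the probability of not the event."""
    return p / (1 - p)


rr_a_c = relative_risk(risk["A"], risk["C"])  # (1/6) / (1/2)
or_a_c = odds(risk["A"]) / odds(risk["C"])    # (1/5) / 1

print(f"Relative Risk_A_C = {rr_a_c:.4f}")
print(f"Odds Ratio_A_C    = {or_a_c:.4f}")
```

With these values the relative risk and the odds ratio for A versus C are not the same number, which is why it matters which of the two the parameter estimates correspond to.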

Now, the first example below shows that the parameter estimates
obtained from a generalized logits multinomial regression model with
no predictors are equal to log(Relative Risks).

data list list / Y.
begin data
1
1
1
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
end data.

FREQUENCIES VARIABLES=Y
/ORDER=ANALYSIS.

* Relative risks versus category 3 from the Valid Percent column of FREQUENCIES.
COMPUTE RR_1_3_raw = 16.666666666666664 / 50.
COMPUTE RR_2_3_raw = 33.33333333333333 / 50.
EXECUTE.

NOMREG Y (BASE = LAST ORDER=ASCENDING)
/MODEL
/INTERCEPT=INCLUDE
/PRINT=PARAMETER .

* Exponentiate the NOMREG intercepts for Y = 1 and Y = 2.
COMPUTE RR_1_3 = exp(-1.0986122886681096).
COMPUTE RR_2_3 = exp(-0.4054651081081645).
EXECUTE.
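
For anyone who wants to check this arithmetic without SPSS, here is a minimal Python sketch using only the standard library. It relies on the fact that an intercept-only multinomial logit model reproduces the observed category proportions, so each intercept should equal the log of the observed relative risk versus the reference category. The variable names are mine.

```python
import math
from collections import Counter

# Same 18 observations as in the DATA LIST block: 3 ones, 6 twos, 9 threes.
y = [1] * 3 + [2] * 6 + [3] * 9

counts = Counter(y)
n = len(y)
risk = {cat: counts[cat] / n for cat in counts}

# Category 3 is the reference, matching BASE = LAST in NOMREG.
rr_1_3 = risk[1] / risk[3]
rr_2_3 = risk[2] / risk[3]

# Logs of the observed relative risks, to compare with the NOMREG intercepts.
log_rr_1_3 = math.log(rr_1_3)
log_rr_2_3 = math.log(rr_2_3)

print(log_rr_1_3, log_rr_2_3)
```

The two printed values should agree, up to floating-point rounding, with the intercepts that are exponentiated in the COMPUTE statements above.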

It should be clear from the example above that the intercept estimates
can be interpreted as log(Relative Risks). The relative risks calculated
from the FREQUENCIES percentages using the definitional formulas are
exactly the same as the exponentiated estimates from NOMREG. Now, let's
add a dichotomous predictor to the model to see what happens. I provide
further comments after this code.

data list list / Y X.
begin data
1 1
1 1
1 0
2 1
2 1
2 1
2 0
2 1
2 1
3 1
3 1
3 0
3 1
3 1
3 1
3 0
3 1
3 1
end data.

CROSSTABS
/TABLES=Y BY X
/FORMAT=AVALUE TABLES
/CELLS=COUNT ROW COLUMN
/COUNT ROUND CELL.

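* Relative risk of Y = 1 versus Y = 3 within each level of X, from the CROSSTABS column percentages.
COMPUTE RR_1_3_X0_Raw = (25 / 50).
COMPUTE RR_1_3_X1_Raw = (14.285714285714285 / 50).
COMPUTE RRR_1_3_Raw = RR_1_3_X1_Raw / RR_1_3_X0_Raw.
EXECUTE.

* Relative risk of Y = 2 versus Y = 3 within each level of X, from the CROSSTABS column percentages.
COMPUTE RR_2_3_X0_Raw = (25 / 50).
COMPUTE RR_2_3_X1_Raw = (35.714285714285715 / 50).
COMPUTE RRR_2_3_Raw = RR_2_3_X1_Raw / RR_2_3_X0_Raw.
EXECUTE.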

NOMREG Y (BASE = LAST ORDER=ASCENDING) WITH X
/MODEL X
/INTERCEPT=INCLUDE
/PRINT=PARAMETER .

* Exponentiate the NOMREG coefficients of X for Y = 1 and Y = 2.
COMPUTE RRR_1_3 = exp(-0.5596157879354635).
COMPUTE RRR_2_3 = exp(0.35667494393875165).
EXECUTE.

Again, I calculated the relative risk ratios from the column
percentages reported by CROSSTABS. Then I compared them to the
exponentiated coefficients of X from NOMREG. As expected, the relative
risk ratios are identical.
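
The same check can be sketched in Python straight from the cell counts. This is only an illustration under one assumption worth stating: with a single dichotomous predictor the model is saturated, so its fitted probabilities equal the observed column proportions, and the coefficient of X for each category equals the log of the observed relative risk ratio. The helper names are mine.

```python
import math
from collections import Counter

# (Y, X) pairs copied from the second DATA LIST block.
data = [
    (1, 1), (1, 1), (1, 0),
    (2, 1), (2, 1), (2, 1), (2, 0), (2, 1), (2, 1),
    (3, 1), (3, 1), (3, 0), (3, 1), (3, 1), (3, 1), (3, 0), (3, 1), (3, 1),
]

cell = Counter(data)  # cell[(y, x)] is the count in that crosstab cell


def relative_risk(y, x, ref=3):
    """P(Y=y | X=x) / P(Y=ref | X=x); the column total cancels out."""
    return cell[(y, x)] / cell[(ref, x)]


def relative_risk_ratio(y):
    """Relative risk at X = 1 divided by relative risk at X = 0."""
    return relative_risk(y, 1) / relative_risk(y, 0)


rrr_1_3 = relative_risk_ratio(1)
rrr_2_3 = relative_risk_ratio(2)

# Logs of the observed ratios, to compare with the NOMREG coefficients of X.
print(math.log(rrr_1_3), math.log(rrr_2_3))
```

The logs printed here should match, up to rounding, the two coefficients exponentiated in the COMPUTE statements above.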

Ryan

On Wed, Dec 1, 2010 at 10:07 PM, Bruce Weaver <[hidden email]> wrote:

> I responded to Ryan off-list to ask if he meant to say that the parameter
> estimates are interpreted as the change in the log(odds) relative to a
> reference category. He responded that he did indeed mean log(risk), not
> log(odds); and we have been having a vigorous back and forth discussion
> since, exchanging examples and links. Here is one link I sent to Ryan:
>
>   http://faculty.chass.ncsu.edu/garson/PA765/logistic.htm#estimates
>
> And here's one he just sent to me (which I've not read yet--it's bed time
> here).
>
>   http://www.columbia.edu/~so33/SusDev/Lecture_10.pdf
>
> Just thought I'd post this, in case anyone else was interested. I may have
> some more to say after reading that last document.
>
> Cheers,
> Bruce
>
>
> R B wrote:
>>
>> Stefan,
>>
>> What do you mean by the following statement?: "...if the overall model
>> is significant, I can conclude that there is a significant
>> relationship between the dependent and independent variable for all
>> categories of the dependent variable?"
>>
>> In the typical multinomial logistic regression assuming a single
>> continuous predictor, X, the parameter estimates are interpreted as
>> the change in the log(risk relative to the reference category), given
>> a one-point increase in X. The parameter estimates do NOT reflect a
>> change in the log(risk) of observing each category, given a one-point
>> increase in X. Moreover, it's certainly possible to observe a
>> non-significant log(relative risk) in the presence of an overall
>> significant model effect.
>>
>> Ryan
>>
>> On Tue, Nov 30, 2010 at 7:49 AM, s-volk <[hidden email]> wrote:
>>> Dear all,
>>>
>>> I ran a multinomial logistic regression analysis with one continuous
>>> independent variable. I have a sample size of 68 subjects (psychological
>>> experiment), which end up split into 5 categories ranging in size from 5
>>> to 24 (dependent variable). The MLR model has a Nagelkerke R2 of 0.27 and
>>> Model Fit χ2=19.71, p<0.01.
>>>
>>> Now here is the problem: A reviewer complains that my results may be
>>> sample specific because one of the 5 categories of the dependent variable
>>> consists of only 5 observations (subjects), i.e. s/he argues that very few
>>> participants (five) are responsible for the observed effects. Is this a
>>> valid argument? I thought that if the overall model is significant, I can
>>> conclude that there is a significant relationship between the dependent
>>> and independent variable for all categories of the dependent variable?
>>> That is, the calculations for the overall model are based on all
>>> observations (68) and not only on the observations in specific
>>> categories (e.g., 5)?
>>>
>>> I was wondering if someone could provide me with or point me to some
>>> arguments for reviewers (ideally including some references)?
>>>
>>> Many thanks in advance,
>>> Stefan
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://spssx-discussion.1045642.n5.nabble.com/Multinomial-Logistic-Regression-Category-Size-tp3286013p3286013.html
>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>> [hidden email] (not to SPSSX-L), with no body text except the
>>> command. To leave the list, send the command
>>> SIGNOFF SPSSX-L
>>> For a list of commands to manage subscriptions, send the command
>>> INFO REFCARD
>>>
>>
>>
>>
>
>
> -----
> --
> Bruce Weaver
> [hidden email]
> http://sites.google.com/a/lakeheadu.ca/bweaver/
>
> "When all else fails, RTFM."
>
> NOTE: My Hotmail account is not monitored regularly.
> To send me an e-mail, please use the address shown above.
>
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Multinomial-Logistic-Regression-Category-Size-tp3286013p3288831.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
>
