Dear all,
I ran a multinomial logistic regression analysis with one continuous independent variable. I have a sample of 68 subjects (psychological experiment) who end up split into 5 categories ranging in size from 5 to 24 (dependent variable). The MLR model has a Nagelkerke R2 of 0.27 and Model Fit χ2=19.71, p<0.01.

Now here is the problem: a reviewer complains that my results may be sample specific because one of the 5 categories of the dependent variable consists of only 5 observations (subjects), i.e. s/he argues that very few participants (five) are responsible for the observed effects. Is this a valid argument? I thought that if the overall model is significant, I can conclude that there is a significant relationship between the dependent and independent variable for all categories of the dependent variable? That is, the calculations for the overall model are based on all observations (68) and not only on the observations in specific categories (e.g., 5)?

I was wondering if someone could provide me with, or point me to, some arguments for reviewers (ideally including some references)?

Many thanks in advance,
Stefan
I'll have a kick at this one, more to get some discussion going than to provide any definitive answers. ;-)

I suppose the comment about it being "sample specific" translates to "will not generalize well to other samples". Just thinking out loud here, so forgive me if it ends up being twaddle.

What if you ran the model again, but without the 5 potentially problematic cases? If the predicted probabilities from the two models were very similar for the other N-5 cases, this might reassure the reviewer that the omitted 5 are not overly influential. On the other hand, if the predicted probabilities differ a fair bit, that would confirm the reviewer's fears.

Another possibility: could the outcome category that the 5 problem cases are in reasonably be merged with one of the other categories? Again, if the predicted probabilities from this model didn't differ substantially from those obtained with the original model, you could argue that the 5 cases are not very influential.

Perhaps someone else will have a better idea. Remember, I was just trying to prime the pump here!

HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
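The refit-and-compare idea suggested above can be sketched in code. What follows is a minimal Python sketch, not the SPSS analysis from the thread: the data are made up for illustration (three outcome categories, the smallest with four cases), and `fit_mlogit` is a hypothetical helper that fits a baseline-category logit model by plain gradient ascent. It fits the model with and without the small category and prints the slope for the one contrast that both fits share.

```python
import math

def probs(x, params):
    """Category probabilities at x; the last category is the reference."""
    etas = [a + b * x for a, b in params] + [0.0]
    top = max(etas)
    exps = [math.exp(e - top) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

def fit_mlogit(xs, ys, n_cat, steps=3000, lr=0.5):
    """Maximum likelihood by gradient ascent (hypothetical helper, toy scale only).

    ys are coded 0..n_cat-1 and category n_cat-1 is the reference.
    Returns [intercept, slope] pairs for the non-reference categories.
    """
    params = [[0.0, 0.0] for _ in range(n_cat - 1)]
    n = len(xs)
    for _ in range(steps):
        grad = [[0.0, 0.0] for _ in range(n_cat - 1)]
        for x, y in zip(xs, ys):
            p = probs(x, params)
            for j in range(n_cat - 1):
                resid = (1.0 if y == j else 0.0) - p[j]
                grad[j][0] += resid
                grad[j][1] += resid * x
        for j in range(n_cat - 1):
            params[j][0] += lr * grad[j][0] / n
            params[j][1] += lr * grad[j][1] / n
    return params

# Made-up data: category 0 is the small one (4 cases), category 2 is the reference.
x_small = [-1.5, -0.5, 0.0, 1.0]
x_mid = [-2.0, -1.5, -1.0, -1.0, -0.5, -0.5, 0.0, 0.5, 1.0, 1.5]
x_ref = [-1.5, -1.0, -0.5, 0.0, 0.5, 0.5, 1.0, 1.0, 1.5, 2.0]

xs_full = x_small + x_mid + x_ref
ys_full = [0] * len(x_small) + [1] * len(x_mid) + [2] * len(x_ref)
full = fit_mlogit(xs_full, ys_full, n_cat=3)

# Refit without the small category; relabel so the reference is still last.
xs_red = x_mid + x_ref
ys_red = [0] * len(x_mid) + [1] * len(x_ref)
reduced = fit_mlogit(xs_red, ys_red, n_cat=2)

slope_full = full[1][1]        # category 1 vs reference, all cases
slope_reduced = reduced[0][1]  # same contrast, small category dropped
print(f"slope with all cases:          {slope_full:.3f}")
print(f"slope without small category:  {slope_reduced:.3f}")
```

If the two slopes (and the predicted probabilities built from them) are close, the small category is not driving the estimates for the other contrasts. With real data one would use the statistics package's own fitting routine rather than this hand-rolled optimizer.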
In reply to this post by s-volk
Stefan,
What do you mean by the following statement?: "...if the overall model is significant, I can conclude that there is a significant relationship between the dependent and independent variable for all categories of the dependent variable?"

In the typical multinomial logistic regression assuming a single continuous predictor, X, the parameter estimates are interpreted as the change in the log(risk relative to the reference category), given a one-point increase in X. The parameter estimates do NOT reflect a change in the log(risk) of observing each category, given a one-point increase in X. Moreover, it's certainly possible to observe a non-significant log(relative risk) in the presence of an overall significant model effect.

Ryan
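To make the interpretation above concrete, here is a small Python sketch. The coefficients are invented for illustration (they are not estimates from Stefan's data), and `category_probs` is a hypothetical helper. It shows that a one-point increase in X changes log(P(category) / P(reference)) by exactly the slope, while the change in log(P(category)) itself is a different number.

```python
import math

# Hypothetical (intercept, slope) pairs for categories A and B; C is the reference.
coefs = {"A": (0.2, 0.8), "B": (-0.5, -0.3)}

def category_probs(x, coefs, reference="C"):
    """Probabilities for all categories under a baseline-category logit model."""
    etas = {k: a + b * x for k, (a, b) in coefs.items()}
    etas[reference] = 0.0
    total = sum(math.exp(e) for e in etas.values())
    return {k: math.exp(e) / total for k, e in etas.items()}

p0 = category_probs(0.0, coefs)
p1 = category_probs(1.0, coefs)

# Change in log(risk relative to the reference) for a one-point increase in X.
change_log_ratio = math.log(p1["A"] / p1["C"]) - math.log(p0["A"] / p0["C"])
# Change in log(risk) of category A itself.
change_log_prob = math.log(p1["A"]) - math.log(p0["A"])

print(round(change_log_ratio, 4))  # matches the slope for A
print(round(change_log_prob, 4))   # does not match the slope for A
```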
In reply to this post by Bruce Weaver
Please excuse my confusing question.

What I was trying to say was: the multinomial logistic regression (MLR) model provides a likelihood-ratio test which evaluates the overall relationship between the independent variable and the dependent variable for all categories of the dependent variable. More specifically, it tests whether the population values of the logistic regression coefficients for the independent variable are all zero (i.e., there is no relationship between the dependent and independent variable in the population). This leads me to draw the following conclusions, which I hope are not completely wrong?

1. Since the likelihood-ratio test is based on all observations of the dependent variable, can we assume that the relationship between the dependent and independent variable exists for all categories of the dependent variable (i.e., that not only one category is responsible for the observed effect)?

2. The likelihood-ratio test is comparable to the overall F test in OLS regression and tests whether there is a relationship between the dependent and independent variable in the population. Does it therefore provide evidence that the results will generalize to other samples?

@Bruce: Thanks for the suggestions. I thought about this before as well, but can I just drop some "inconvenient cases"?

Many thanks for the help and best wishes,
Stefan
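As a rough check on the reported model fit statistic, the p-value of the likelihood-ratio chi-square can be computed by hand. This Python sketch assumes 4 degrees of freedom (one predictor times 5 - 1 non-reference categories); the thread does not state the degrees of freedom, so that number is an assumption. Note what the test evaluates: the null hypothesis that all of the slopes are zero at once. Rejecting it indicates that at least one contrast with the reference category depends on X, not that every contrast does.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

lr_chi2 = 19.71   # model fit chi-square reported in the thread
df = 4            # assumed: 1 predictor x (5 - 1) non-reference categories
p_value = chi2_sf_even_df(lr_chi2, df)
print(f"p = {p_value:.5f}")
```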
Hi Stefan. You'd only be dropping them in order to compare that model to one that includes them. That's not the same thing as ignoring them completely. This is in essence what measures like Cook's Distance do, although in that case, it leaves out one observation at a time. HTH.
--
Bruce Weaver
In reply to this post by Ryan
I responded to Ryan off-list to ask if he meant to say that the parameter estimates are interpreted as the change in the log(odds) relative to a reference category. He responded that he did indeed mean log(risk), not log(odds); and we have been having a vigorous back and forth discussion since, exchanging examples and links. Here is one link I sent to Ryan:
http://faculty.chass.ncsu.edu/garson/PA765/logistic.htm#estimates

And here's one he just sent to me (which I've not read yet; it's bedtime here):

http://www.columbia.edu/~so33/SusDev/Lecture_10.pdf

Just thought I'd post this, in case anyone else was interested. I may have some more to say after reading that last document.

Cheers,
Bruce
Bruce et al.,
I have little doubt that the parameter estimates obtained from a generalized logits multinomial regression without any predictors yield log(relative risks) and, when predictors are in the model, log(relative risk ratios). Allow me to provide a couple of simple examples (without a predictor and with a dichotomous predictor) as evidence in support of what I've stated. But before I do, let's make sure we all agree on some basic definitions within a logistic regression framework:

Risk_A = probability of event A
Risk_B = probability of event B
Risk_C = probability of event C

Relative Risk_A_B = Risk_A / Risk_B
Relative Risk_A_C = Risk_A / Risk_C
Relative Risk_B_C = Risk_B / Risk_C

Odds_A = probability of event A / probability of not event A
Odds_B = probability of event B / probability of not event B
Odds_C = probability of event C / probability of not event C

Odds Ratio_A_B = Odds_A / Odds_B
Odds Ratio_A_C = Odds_A / Odds_C
Odds Ratio_B_C = Odds_B / Odds_C

Now, the first example shows that the parameter estimates obtained from the generalized logits multinomial regression model with no predictors are equivalent to log(Relative Risks).

data list list / Y.
begin data
1
1
1
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
end data.

FREQUENCIES VARIABLES=Y
  /ORDER=ANALYSIS.

* Relative risks from the percentages in the FREQUENCIES output.
COMPUTE RR_1_3_raw = 16.666666666666664 / 50.
COMPUTE RR_2_3_raw = 33.33333333333333 / 50.
EXECUTE.

* Intercept-only model with the last category (Y = 3) as reference.
NOMREG Y (BASE = LAST ORDER=ASCENDING)
  /MODEL
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER .

* Exponentiated intercepts from the NOMREG output.
COMPUTE RR_1_3 = exp(-1.0986122886681096).
COMPUTE RR_2_3 = exp(-0.4054651081081645).
EXECUTE.

It should be clear from the example above that the parameter estimates can certainly be interpreted as log(Relative Risks). Those calculated from the FREQUENCIES percentages using the definitional formulas are exactly the same as those output from NOMREG. Now, let's add a dichotomous predictor to the model to see what happens. I provide further comments after this code.

data list list / Y X.
begin data
1 1
1 1
1 0
2 1
2 1
2 1
2 0
2 1
2 1
3 1
3 1
3 0
3 1
3 1
3 1
3 0
3 1
3 1
end data.

CROSSTABS
  /TABLES=Y BY X
  /FORMAT=AVALUE TABLES
  /CELLS=COUNT ROW COLUMN
  /COUNT ROUND CELL.

* Relative risks within X = 0 and X = 1, from the column percentages.
COMPUTE RR_1_3_X0_Raw = (25 / 50) .
COMPUTE RR_1_3_X1_Raw = (14.285714285714285 / 50).
COMPUTE RRR_1_3_Raw = RR_1_3_X1_Raw / RR_1_3_X0_Raw.
EXECUTE.

COMPUTE RR_2_3_X0_Raw = (25 / 50) .
COMPUTE RR_2_3_X1_Raw = (35.714285714285715 / 50).
COMPUTE RRR_2_3_Raw = RR_2_3_X1_Raw / RR_2_3_X0_Raw.
EXECUTE.

NOMREG Y (BASE = LAST ORDER=ASCENDING) WITH X
  /MODEL X
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER .

* Exponentiated slopes for X from the NOMREG output.
COMPUTE RRR_1_3 = exp(-0.5596157879354635).
COMPUTE RRR_2_3 = exp(0.35667494393875165).
EXECUTE.

Again, I calculated the estimates using the probability estimates from CROSSTABS. Then I compared those estimates to the exponentiated estimates from NOMREG. As expected, they [relative risk ratios] are identical.

Ryan
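Ryan's two toy examples can also be checked outside SPSS. The Python sketch below rebuilds the counts from the two data listings in his post and recomputes the quantities by hand. It does not refit any model; it relies on the fact that for an intercept-only model, or a model with one dichotomous predictor, the fitted probabilities equal the observed proportions. It also computes the ratio of marginal odds as defined in the post (Odds_A / Odds_C) to show that this is a different number from the risk ratio. The helper name `rr` is illustrative.

```python
import math
from collections import Counter

# Example 1: outcome only (3 ones, 6 twos, 9 threes).
y = [1] * 3 + [2] * 6 + [3] * 9
risk = {k: c / len(y) for k, c in Counter(y).items()}
log_rr_1_3 = math.log(risk[1] / risk[3])
log_rr_2_3 = math.log(risk[2] / risk[3])

# Ratio of marginal odds for categories 1 and 3, per the definitions in the post.
odds = {k: p / (1 - p) for k, p in risk.items()}
odds_ratio_1_3 = odds[1] / odds[3]

# Example 2: (Y, X) pairs from the second data listing.
pairs = [(1, 1), (1, 1), (1, 0), (2, 1), (2, 1), (2, 1), (2, 0), (2, 1), (2, 1),
         (3, 1), (3, 1), (3, 0), (3, 1), (3, 1), (3, 1), (3, 0), (3, 1), (3, 1)]
cell = Counter(pairs)

def rr(y_cat, x_val, base=3):
    """Risk of y_cat relative to the base category among cases with X == x_val.

    The column total cancels, so the ratio of counts is enough.
    """
    return cell[(y_cat, x_val)] / cell[(base, x_val)]

log_rrr_1_3 = math.log(rr(1, 1) / rr(1, 0))
log_rrr_2_3 = math.log(rr(2, 1) / rr(2, 0))

print(log_rr_1_3, log_rr_2_3)    # compare with the NOMREG intercepts
print(log_rrr_1_3, log_rrr_2_3)  # compare with the NOMREG slopes for X
print(risk[1] / risk[3], odds_ratio_1_3)  # risk ratio vs ratio of marginal odds
```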
Also, for those interested, the UCLA website that describes how to interpret output from a multinomial logistic regression with predictors in Stata refers to the exponentiated parameter estimates as relative risk ratios. Go to the bottom of the page for details.

http://www.ats.ucla.edu/stat/stata/output/stata_mlogit_output.htm

Ryan
In reply to this post by Ryan
Here is an example demonstrating equivalence of Exp(B) from NOMREG with odds ratios computed via CROSSTABS.

Cheers,
Bruce

* Multinomial logistic regression on a 4x3 table.
* The data are from http://www.angelfire.com/wv/bwhomedir/notes/multinomial_log_reg.pdf .

data list list / X Y kount (3f5.0).
begin data
1 1 6
1 2 8
1 3 38
2 1 13
2 2 29
2 3 55
3 1 20
3 2 33
3 3 160
4 1 51
4 2 42
4 3 518
end data.

var lab
  x 'Functional status'
  y 'ICU Code Status'
.
val lab
  x 1 'Unknown'
    2 'Severely limited'
    3 'Somewhat limited'
    4 'Totally independent' /
  y 1 'Explicit: Resuscitate'
    2 'Explicit: DNR'
    3 'Implicit: Resuscitate'
.

weight by kount.
crosstabs x by y .

NOMREG Y (BASE=LAST ORDER=ASCENDING) BY X
  /MODEL = X
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER SUMMARY LRT STEP MFI.

* Notice that the Exp(B) values from this model match exactly
* the odds ratios computed in the document.

* Now compute the same odds ratios via CROSSTABS.

* First OR.
temporary.
select if any(X,1,4) and any(Y,1,3).
crosstabs x by y / stat = risk.

* Second OR.
temporary.
select if any(X,2,4) and any(Y,1,3).
crosstabs x by y / stat = risk.

* Third OR.
temporary.
select if any(X,3,4) and any(Y,1,3).
crosstabs x by y / stat = risk.

* Fourth OR.
temporary.
select if any(X,1,4) and any(Y,2,3).
crosstabs x by y / stat = risk.

* Fifth OR.
temporary.
select if any(X,2,4) and any(Y,2,3).
crosstabs x by y / stat = risk.

* Sixth OR.
temporary.
select if any(X,3,4) and any(Y,2,3).
crosstabs x by y / stat = risk.

* Notice that the odds ratios & 95% confidence intervals
* obtained via CROSSTABS match exactly the values of
* Exp(B) from NOMREG and their 95% confidence intervals.
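The six odds ratios from the 4x3 table can be reproduced by hand as a cross-check. The Python sketch below uses the cell counts from the syntax above, with X = 4 ('Totally independent') and Y = 3 ('Implicit: Resuscitate') as the comparison categories, matching the six CROSSTABS runs. It computes each value twice: as a cross-product odds ratio from the 2x2 sub-table, and as a ratio of two within-row relative risks (the definition Ryan gave earlier in the thread). Because the row totals cancel, the two calculations are algebraically the same quantity, which suggests the disagreement is about the label rather than the number. The function names are illustrative.

```python
# Cell counts: counts[x][y], taken from the data listing in the post.
counts = {
    1: {1: 6, 2: 8, 3: 38},
    2: {1: 13, 2: 29, 3: 55},
    3: {1: 20, 2: 33, 3: 160},
    4: {1: 51, 2: 42, 3: 518},
}
REF_X, REF_Y = 4, 3

def cross_product_or(x, y):
    """Odds ratio from the 2x2 sub-table (rows x and REF_X, columns y and REF_Y)."""
    return (counts[x][y] * counts[REF_X][REF_Y]) / (counts[x][REF_Y] * counts[REF_X][y])

def relative_risk_ratio(x, y):
    """Ratio of two within-row relative risks, each taken against category REF_Y."""
    def rr(row):
        total = sum(counts[row].values())
        return (counts[row][y] / total) / (counts[row][REF_Y] / total)
    return rr(x) / rr(REF_X)

for y in (1, 2):
    for x in (1, 2, 3):
        print(f"Y={y} vs {REF_Y}, X={x} vs {REF_X}: "
              f"OR = {cross_product_or(x, y):.4f}, RRR = {relative_risk_ratio(x, y):.4f}")
```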
Bruce,
I decided to calculate what you call "First OR" using the RRR formula I presented previously. Lo and behold, our estimates are the same. RRR1 = ((6 / (6 + 8 + 38)) / (38 / (6 + 8 + 38))) / ((51 / (51 + 42 + 518)) / (518 / (51 + 42 + 518))) = 1.6037151703 You can see in the formula above that I am calculating two relative risk estimates and then dividing those two estimates. I am comfortable interpreting this estimate as a relative risk ratio because of the way I set up the equation. Interpretation of this estimate as a RRR is common IMO. Having said that, I have also seen places where this estimate is interpreted as an odds ratio. Best wishes, Ryan On Thu, Dec 2, 2010 at 11:37 AM, Bruce Weaver <[hidden email]> wrote: > Here is an example demonstrating equivalence of Exp(B) from NOMREG with odds > ratios computed via CROSSTABS. > > Cheers, > Bruce > > * Multinomial logistic regression on a 4x3 table. > * The data are from > http://www.angelfire.com/wv/bwhomedir/notes/multinomial_log_reg.pdf . > > data list list / X Y kount (3f5.0). > begin data > 1 1 6 > 1 2 8 > 1 3 38 > 2 1 13 > 2 2 29 > 2 3 55 > 3 1 20 > 3 2 33 > 3 3 160 > 4 1 51 > 4 2 42 > 4 3 518 > end data. > > var lab > � x 'Functional status' > � y 'ICU Code Status' > . > val lab > � x 1 'Unknown' > � 2 'Severely limited' > � 3 'Somewhat limited' > � 4 'Totally independent' / > � y 1 'Explicit: Resuscitate' > � 2 'Explicit: DNR' > � 3 'Implicit: Resuscitate' > . > > weight by kount. > crosstabs x by y . > > NOMREG Y (BASE=LAST ORDER=ASCENDING) BY X > � /MODEL = X > � /INTERCEPT=INCLUDE > � /PRINT=PARAMETER SUMMARY LRT STEP MFI. > > * Notice that the Exp(B) values from this model match exactly > * the odds ratios computed in the document. > > * Now compute the same odds ratios via CROSSTABS. > > * First OR. > temporary. > select if any(X,1,4) and any(Y,1,3). > crosstabs x by y / stat = risk. > > * Second OR. > temporary. > select if any(X,2,4) and any(Y,1,3). > crosstabs x by y / stat = risk. > > * Third OR. 
> temporary. > select if any(X,3,4) and any(Y,1,3). > crosstabs x by y / stat = risk. > > * Fourth OR. > temporary. > select if any(X,1,4) and any(Y,2,3). > crosstabs x by y / stat = risk. > > * Fifth OR. > temporary. > select if any(X,2,4) and any(Y,2,3). > crosstabs x by y / stat = risk. > > * Sixth OR. > temporary. > select if any(X,3,4) and any(Y,2,3). > crosstabs x by y / stat = risk. > > * Notice that the odds ratios & 95% confidence intervals > * obtained via CROSSTABS match exactly the values of > * Exp(B) from NOMREG and their 95% confidence intervals. > > > > > R B wrote: >> >> Bruce et al., >> >> I have little doubt that the parameter estimates obtained from a >> generalized logits multinomial regression without any predictors yield >> log(relative risks), and assuming predictors are in the model, >> relative risk ratios. Allow me to provide a couple simple examples >> (without a predictor and with a dichotomous predictor) here to provide >> evidence in support of what I've stated. But before I do, let's make >> sure we all agree on some basic definitions within a logistic >> regression framework: >> >> Risk_A = probability of event A >> Risk_B = probability of event B >> Risk_C = probability of event C >> >> Relative Risk_A_B = Risk A / Risk B >> Relative Risk_A_C = Risk A / Risk C >> Relative Risk_B_C = Risk B / Risk C >> >> Odds_A = probability of event A / probability of not event A >> Odds_B = probability of event B / probability of not event B >> Odds_C = probability of event C / probability of not event C >> >> Odds Ratio_A_B = Odds_A / Odds_B >> Odds Ratio_A_C = Odds_A / Odds_C >> Odds Ratio_B_C = Odds_B / Odds_C >> >> Now, the first example I provide below shows that the parameter >> estimates obtained from the generalized logits multinomial regression >> model with no predictors below are equivalent to log(Relative Risks). >> >> data list list / Y. 
>> begin data >> 1 >> 1 >> 1 >> 2 >> 2 >> 2 >> 2 >> 2 >> 2 >> 3 >> 3 >> 3 >> 3 >> 3 >> 3 >> 3 >> 3 >> 3 >> end data. >> >> FREQUENCIES VARIABLES=Y >> /ORDER=ANALYSIS. >> >> COMPUTE RR_1_3_raw = 16.666666666666664 / 50. >> COMPUTE RR_2_3_raw = 33.33333333333333 � / 50. >> EXECUTE. >> >> NOMREG Y (BASE = LAST ORDER=ASCENDING) >> /MODEL >> /INTERCEPT=INCLUDE >> /PRINT=PARAMETER . >> >> COMPUTE RR_1_3 = exp(-1.0986122886681096). >> COMPUTE RR_2_3 = exp(-0.4054651081081645). >> EXECUTE. >> >> It should be clear from the example above that the parameter estimates >> can certainly be interpreted as log(Relative Risks). Those calculated >> from CROSSTABS using the definitional formulas are exactly the same as >> those output from NOMREG. Now, let's add a dichotomous predictor to >> the model to see what happens. I provide further comments after this >> code. >> >> data list list / Y X. >> begin data >> 1 1 >> 1 1 >> 1 0 >> 2 1 >> 2 1 >> 2 1 >> 2 0 >> 2 1 >> 2 1 >> 3 1 >> 3 1 >> 3 0 >> 3 1 >> 3 1 >> 3 1 >> 3 0 >> 3 1 >> 3 1 >> end data. >> >> CROSSTABS >> � /TABLES=Y BY X >> � /FORMAT=AVALUE TABLES >> � /CELLS=COUNT ROW COLUMN >> � /COUNT ROUND CELL. >> >> COMPUTE RR_1_3_X0_Raw = (25 / 50) . >> COMPUTE RR_1_3_X1_Raw = (14.285714285714285 / 50). >> COMPUTE RRR_1_3_Raw = RR_1_3_X1_Raw / RR_1_3_X0_Raw. >> EXECUTE. >> >> COMPUTE RR_2_3_X0_Raw = (25 / 50) . >> COMPUTE RR_2_3_X1_Raw = (35.714285714285715 / 50). >> COMPUTE RRR_2_3_Raw = RR_2_3_X1_Raw / RR_2_3_X0_Raw. >> EXECUTE. >> >> NOMREG Y (BASE = LAST ORDER=ASCENDING) WITH X >> /MODEL X >> /INTERCEPT=INCLUDE >> /PRINT=PARAMETER . >> >> COMPUTE RRR_1_3 = exp(-0.5596157879354635). >> COMPUTE RRR_2_3 = exp(0.35667494393875165). >> EXECUTE. >> >> Again, I calculated the estimates using the probability estimates from >> CROSSTABS. Then I compared those estimates to exponentiated estimates >> from NOMREG. As expected, they [relative risk ratios] are identical. 
>>
>> Ryan
>>
>> On Wed, Dec 1, 2010 at 10:07 PM, Bruce Weaver <[hidden email]> wrote:
>>> I responded to Ryan off-list to ask if he meant to say that the parameter
>>> estimates are interpreted as the change in the log(odds) relative to a
>>> reference category. He responded that he did indeed mean log(risk), not
>>> log(odds); and we have been having a vigorous back and forth discussion
>>> since, exchanging examples and links. Here is one link I sent to Ryan:
>>>
>>>     http://faculty.chass.ncsu.edu/garson/PA765/logistic.htm#estimates
>>>
>>> And here's one he just sent to me (which I've not read yet--it's bed time
>>> here).
>>>
>>>     http://www.columbia.edu/~so33/SusDev/Lecture_10.pdf
>>>
>>> Just thought I'd post this, in case anyone else was interested. I may have
>>> some more to say after reading that last document.
>>>
>>> Cheers,
>>> Bruce
>>>
>>>
>>> R B wrote:
>>>>
>>>> Stefan,
>>>>
>>>> What do you mean by the following statement?: "...if the overall model
>>>> is significant, I can conclude that there is a significant
>>>> relationship between the dependent and independent variable for all
>>>> categories of the dependent variable?"
>>>>
>>>> In the typical multinomial logistic regression assuming a single
>>>> continuous predictor, X, the parameter estimates are interpreted as
>>>> the change in the log(risk relative to the reference category), given
>>>> a one-point increase in X. The parameter estimates do NOT reflect a
>>>> change in the log(risk) of observing each category, given a one-point
>>>> increase in X. Moreover, it's certainly possible to observe a
>>>> non-significant log(relative risk) in the presence of an overall
>>>> significant model effect.
>>>>
>>>> Ryan
>>>>
>>>> On Tue, Nov 30, 2010 at 7:49 AM, s-volk <[hidden email]> wrote:
>>>>> Dear all,
>>>>>
>>>>> I ran a multinomial logistic regression analysis with one continuous
>>>>> independent variable. I have a sample size of 68 subjects (psychological
>>>>> experiment) which end up split into 5 categories ranging in size from 5
>>>>> to 24 (dependent variable). The MLR-model has a Nagelkerke R2 of 0.27
>>>>> and Model Fit χ2=19.71, p<0.01.
>>>>>
>>>>> Now here is the problem: A reviewer complains that my results may be
>>>>> sample specific because one of the 5 categories of the dependent
>>>>> variable consists of only 5 observations (subjects), i.e. sh/e argues
>>>>> that very few participants (five) are responsible for the observed
>>>>> effects. Is this valid argument? I thought that if the overall model is
>>>>> significant, I can conclude that there is a significant relationship
>>>>> between the dependent and independent variable for all categories of the
>>>>> dependent variable? That is, the calculations for the overall model are
>>>>> based on all observations (68) and not only on the observations in
>>>>> specific categories (e.g., 5)?
>>>>>
>>>>> I was wondering if someone could provide me with or point me to some
>>>>> arguments for reviewers (ideally including some references)?
>>>>>
>>>>> Many thanks in advance,
>>>>> Stefan

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
|
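For anyone who wants to check the arithmetic in the two quoted NOMREG examples without SPSS, here is a minimal sketch in Python (not a language used in this thread). It only recomputes proportions from the 18 data rows listed above and does not fit a model; the variable names simply mirror the COMPUTE names in the syntax, and the helper `risks` is made up for illustration. It also shows that the 1-vs-3 ratio equals the odds ratio from the 2x2 subtable restricted to Y in {1, 3}, which may be why both labels get attached to Exp(B).

```python
import math
from collections import Counter

# (Y, X) pairs copied from the second "begin data ... end data" block.
# The Y column alone is the data of the first (no-predictor) example.
rows = [
    (1, 1), (1, 1), (1, 0),
    (2, 1), (2, 1), (2, 1), (2, 0), (2, 1), (2, 1),
    (3, 1), (3, 1), (3, 0), (3, 1), (3, 1), (3, 1), (3, 0), (3, 1), (3, 1),
]


def risks(ys):
    """Proportion of observations falling in each category of Y."""
    counts = Counter(ys)
    total = sum(counts.values())
    return {y: c / total for y, c in counts.items()}


# Example 1: risk of each category relative to the reference category Y = 3.
p = risks(y for y, _ in rows)
rr_1_3 = p[1] / p[3]
rr_2_3 = p[2] / p[3]

# Example 2: the same relative risks within X = 0 and X = 1, then their ratio.
p_x0 = risks(y for y, x in rows if x == 0)
p_x1 = risks(y for y, x in rows if x == 1)
rrr_1_3 = (p_x1[1] / p_x1[3]) / (p_x0[1] / p_x0[3])
rrr_2_3 = (p_x1[2] / p_x1[3]) / (p_x0[2] / p_x0[3])

# Cross-product odds ratio from the 2x2 table of X by Y, keeping Y in {1, 3}.
n = Counter(rows)
or_1_3 = (n[(1, 1)] * n[(3, 0)]) / (n[(3, 1)] * n[(1, 0)])

# Logs of these ratios, for comparison with the B values quoted from NOMREG.
print(math.log(rr_1_3), math.log(rr_2_3))
print(math.log(rrr_1_3), math.log(rrr_2_3))
print(rrr_1_3, or_1_3)
```

The logs printed here should agree with the quoted NOMREG estimates up to the precision of the iterative fit.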
Administrator
|
Fair enough, Ryan. I've still not had time to look at those lecture notes you gave a link for, but will try to get to them sometime soon. Meanwhile, I'll keep calling it an odds ratio. ;-)
Cheers, Bruce
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).