I am no expert on Cox regression, so this question could turn out to be a silly one. Still, I need some help here.
When you do Cox regression where one of the independent variables is a categorical one with three or more categories, then you get a p-value for the reference category. What does this p-value mean? Robert ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Robert Lundqvist
|
Robert,
If I understand the question, the reference category does not explicitly appear in the Cox Regression output, and therefore there is no p-value for the reference category. The situation in Cox Regression is similar to what you would encounter in multiple regression or logistic regression. That is, with a nominal covariate, you would create a series of 0,1 dichotomies. To make things concrete, consider that you wish to enter AGE in the model, but instead of treating it as continuous you create dummies with the following coding:

AGE GROUP   AGE_2   AGE_3   AGE_4
<60           0       0       0
60-69         1       0       0
70-79         0       1       0
80+           0       0       1

You would enter AGE_2, AGE_3, and AGE_4 in the model. You would obtain regression-like output that shows coefficients, standard errors, p-values, and the like for these 3 variables. For interpretation, the exp(b) column is useful. For example, if AGE_4 has an exp(b) value of 3.54, then subjects 80 or older have a hazard (terminal event rate) about 3.5 times that of subjects younger than 60, the reference group. The p-value on that line is associated with AGE_4, that is, with the contrast of 80+ versus <60. As in other forms of regression, you can obtain a test for the entire block AGE_2, AGE_3, AGE_4.

One difference between Cox Regression and other regressions is that there is no explicit intercept term in the model. This is a consequence of the form of the Cox Regression model.

The Hosmer, Lemeshow, May book on Survival Analysis has a good discussion of nominal covariates.

Tony Babinec

-----Original Message-----
From: SPSSX(r) Discussion <[hidden email]> On Behalf Of Robert Lundqvist
Sent: Tuesday, December 8, 2020 6:04 AM
To: [hidden email]
Subject: Cox regression, p-value for categorical variable?
|
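The indicator coding and the exp(b) reading described in Tony's reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `indicator_code` and `hazard_ratio` are hypothetical helpers, not part of SPSS, and the coefficient is simply backed out from the exp(b) = 3.54 example in the reply rather than estimated from data.

```python
import math

# Age bands from the reply; "<60" is the reference category.
AGE_BANDS = ["<60", "60-69", "70-79", "80+"]


def indicator_code(band, bands=AGE_BANDS, reference="<60"):
    """Return one 0/1 dummy per non-reference band (hypothetical helper).

    The reference band gets all zeros, so it never has a coefficient
    of its own in the model.
    """
    if band not in bands:
        raise ValueError(f"unknown band: {band!r}")
    return {
        f"AGE_{i + 1}": int(band == b)
        for i, b in enumerate(bands)
        if b != reference
    }


def hazard_ratio(b):
    """exp(b): hazard in the coded category relative to the reference."""
    return math.exp(b)


# Coefficient implied by the exp(b) = 3.54 example (an assumption for
# illustration, not an estimate from real data).
b_age_4 = math.log(3.54)

print(indicator_code("70-79"))  # {'AGE_2': 0, 'AGE_3': 1, 'AGE_4': 0}
print(round(hazard_ratio(b_age_4), 2))
```

Because the "<60" row is all zeros, every coefficient is a comparison against that group, which is why no separate line for the reference category is expected in the output.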
In reply to this post by Robert L
Thanks Anthony for the response, I will look into the details. But there actually *is* an explicit p-value for the reference category; it is found in the "Variables in the equation" table.
A silly dataset and syntax:

DATA LIST LIST/time(F2) status(F1) edlevel(F1).
BEGIN DATA
14 0 1
24 1 2
34 0 3
15 1 1
36 0 2
24 0 3
END DATA.
DATASET NAME coxreg WINDOW=FRONT.
COXREG time
  /STATUS=status(1)
  /CONTRAST (edlevel)=Indicator
  /METHOD=ENTER edlevel
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

The p-value for the reference category in my output, more specifically in the "Variables in the equation" table, is 0.767. Some kind of test seems to be calculated here, but what?

Robert
Robert Lundqvist
|
Administrator
|
Hi Robert. Notice that df=2 for that test and p-value. It is an overall test for education level. The other tests in that table have df=1. The first tests level 1 vs 3, and the second level 2 vs 3. (See the Categorical Variable Codings table earlier in the output.)

Note too that the Omnibus Tests of Model Coefficients table shows another overall test for education level, but it is a Likelihood Ratio test rather than a Wald test. Both are Chi-square tests, and both are typically described as "asymptotically equivalent" in textbooks. But many authors say that the LR test is preferred.

HTH.

Robert L wrote
> The p-value for the reference category in my output, more specifically the
> "Variables in the equation" table is 0.767. Some kind of test seems to be
> calculated here, but what?
>
> Robert
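To make the df=2 versus df=1 distinction concrete, here is a minimal Python sketch of how a Wald chi-square turns into a p-value. It relies on two standard closed forms for the chi-square upper tail (exp(-x/2) for 2 df, erfc(sqrt(x/2)) for 1 df). The helper names and the coefficient and covariance values in the usage lines are made up for illustration; only the 0.767 p-value comes from the thread.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variate with 2 df."""
    return math.exp(-x / 2.0)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def wald_2df(b, cov):
    """Overall Wald statistic b' V^{-1} b for two coefficients.

    b   : (b1, b2), the two contrast coefficients
    cov : 2x2 covariance matrix of the estimates, as nested tuples
    """
    (v11, v12), (v21, v22) = cov
    det = v11 * v22 - v12 * v21
    b1, b2 = b
    return (b1 * b1 * v22 - b1 * b2 * (v12 + v21) + b2 * b2 * v11) / det


def wald_1df(b, se):
    """Single-contrast Wald statistic (b / se)^2."""
    return (b / se) ** 2


# Back out the 2-df Wald statistic implied by the reported p = 0.767.
implied_wald = -2.0 * math.log(0.767)
print(round(implied_wald, 3))

# Made-up numbers, purely to show the two kinds of test side by side.
overall = wald_2df((1.0, 2.0), ((1.0, 0.0), (0.0, 4.0)))
print(overall, chi2_sf_df2(overall))
print(chi2_sf_df1(wald_1df(1.0, 1.0)))
```

The overall test pools both contrasts into one statistic, so it can be non-significant even when one contrast looks large, and vice versa.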
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
|
In reply to this post by Robert L
Robert and I had a private exchange...
-----Original Message-----

Robert,

I ran your example. It is a "silly" example in the sense that there is too little data and I get a message about convergence, but I think that I get the results that you do.

Edlevel has 3 categories. The last category (3) is the reference category. You can tell this by looking at the Categorical Variable Codings table. Edlevel(1) is the contrast of category 1 versus 3. Edlevel(2) is the contrast of category 2 versus 3.

The Omnibus Tests of Model Coefficients table shows an overall 2-degree-of-freedom *likelihood ratio* test of the edlevel effect as a whole.

TO YOUR QUESTION: The first line of the Variables in the Equation table shows an overall 2-degree-of-freedom *Wald* test of the edlevel effect as a whole.

In general, in nonlinear regression, there can be 3 tests of the same effect: likelihood ratio, Wald, and score. So, these are 2 tests of the same effect, the overall effect of edlevel. With such a small dataset, they have different p-values.

The edlevel(1) and edlevel(2) lines are tests of the individual contrasts listed above. The analogy could be to ANOVA, where the F test is the overall test of equality of means, while individual contrasts make specific mean comparisons.

Tony Babinec
|
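As a rough illustration of what a likelihood ratio test for edlevel compares, the Cox partial log-likelihood for the toy data in this thread can be written out directly. This is a sketch, not a reproduction of the SPSS output: the function names are hypothetical, the coefficient values passed in are arbitrary rather than fitted estimates, and the formula assumes no tied event times (true for these six rows). A real LR test uses the maximized log-likelihood; note also that level 3 has no events here, so the partial likelihood keeps increasing as the coefficients grow, which fits the convergence message mentioned above.

```python
import math

# (time, status, edlevel) rows from the thread's toy data; status 1 = event.
ROWS = [(14, 0, 1), (24, 1, 2), (34, 0, 3), (15, 1, 1), (36, 0, 2), (24, 0, 3)]


def linear_predictor(edlevel, b1, b2):
    """Indicator coding with level 3 as the reference category."""
    return b1 * (edlevel == 1) + b2 * (edlevel == 2)


def partial_loglik(b1, b2, rows=ROWS):
    """Cox partial log-likelihood, assuming no tied event times.

    For each event, add its linear predictor minus the log of the summed
    exp(linear predictor) over everyone still at risk at that time
    (subjects censored at the event time count as at risk).
    """
    ll = 0.0
    for t_i, status_i, lev_i in rows:
        if status_i != 1:
            continue
        risk_sum = sum(
            math.exp(linear_predictor(lev, b1, b2))
            for t, _, lev in rows
            if t >= t_i
        )
        ll += linear_predictor(lev_i, b1, b2) - math.log(risk_sum)
    return ll


def lr_statistic(b1, b2, rows=ROWS):
    """2 * (log-likelihood at (b1, b2) minus log-likelihood at (0, 0))."""
    return 2.0 * (partial_loglik(b1, b2, rows) - partial_loglik(0.0, 0.0, rows))


null_ll = partial_loglik(0.0, 0.0)   # equals -log(5) - log(4) = -log(20)
stat = lr_statistic(1.0, 1.0)        # arbitrary coefficients, not estimates
p_value = math.exp(-stat / 2.0)      # chi-square upper tail with 2 df
print(null_ll, stat, p_value)
```

The Wald test instead uses only the estimates and their covariance matrix, which is one reason the two overall tests can disagree in a sample this small.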