Administrator

Hello everyone and Happy New Year. I just read a question on ResearchGate about the REPEATED contrast option for COXREG, and must confess that I am rather confused by the output that was attached to the question. You can see it here:
https://www.researchgate.net/post/How_to_interpret_the_output_of_coxregression_using_the_CONTRAST_subcommand_repeated

Here is what the v27 CSR manual entry for COXREG CONTRAST says about Repeated contrasts:

"REPEATED. Comparison of adjacent categories. Each category of the independent variable except the last is compared to the next category."

The categorical variable in question has 4 levels. Therefore, I would expect the following contrast coefficients for REPEATED contrasts:

Contrast 1:   1  -1   0   0
Contrast 2:   0   1  -1   0
Contrast 3:   0   0   1  -1

But the output shows the following values:

Contrast 1:   .750  -.250  -.250  -.250
Contrast 2:   .500   .500  -.500  -.500
Contrast 3:   .250   .250   .250  -.750

How can all levels be included in each contrast when "Each category of the independent variable except the last is compared to the next category"? What am I missing here? Do REPEATED contrasts behave differently for COXREG than they do for UNIANOVA/GLM? Does it have something to do with the fact that COXREG does not include an intercept in the table of coefficients?

Thanks to anyone who can explain what is going on here or point to some good resources.

Cheers,
Bruce

Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an email, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSXL listserv administered by UGA (https://listserv.uga.edu/). 
Administrator

Here is some code to play with for anyone who is interested in the question.
* Edit next line to point to the folder with the sample datasets.
GET FILE='C:\SPSSdata\telco.sav'.
DATASET NAME raw WINDOW=FRONT.

COXREG tenure
  /STATUS=churn(1)
  /CONTRAST (ed)=Repeated
  /METHOD=ENTER ed
  /PRINT=CI(95)
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

* Compute 4 variables with the coding shown above.
FREQUENCIES ed.
RECODE ed (1=.8) (2 3 4 5 = -0.2) INTO ed1.
RECODE ed (1 2 =.6) (3 4 5 = -0.4) INTO ed2.
RECODE ed (1 2 3 =.4) (4 5 = -0.6) INTO ed3.
RECODE ed (1 2 3 4=.2) (5 = -0.8) INTO ed4.

* Estimate model with 4 computed variables treated as covariates.
COXREG tenure
  /STATUS=churn(1)
  /METHOD=ENTER ed1 ed2 ed3 ed4
  /PRINT=CI(95)
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

* The coefficients & SEs for ed1 to ed4 match
* the coefficients & SEs from the previous model.

* I would have expected the following coding for REPEATED contrasts.
RECODE ed (1=1) (2=-1) (ELSE=0) into rep1.
RECODE ed (2=1) (3=-1) (ELSE=0) into rep2.
RECODE ed (3=1) (4=-1) (ELSE=0) into rep3.
RECODE ed (4=1) (5=-1) (ELSE=0) into rep4.
FORMATS rep1 to rep4 (F1).

COXREG tenure
  /STATUS=churn(1)
  /METHOD=ENTER rep1 to rep4
  /PRINT=CI(95)
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

* But clearly, this way of coding the contrasts gives very different results.

* Show that it works as expected using GLM/UNIANOVA.
* First, include a REPEATED contrast and a series of
* LMATRIX subcommands that make the same contrasts.
UNIANOVA income BY ed
  /CONTRAST(ed)=Repeated
  /lmatrix "level 1 versus level 2" ed 1 -1 0 0 0
  /lmatrix "level 2 versus level 3" ed 0 1 -1 0 0
  /lmatrix "level 3 versus level 4" ed 0 0 1 -1 0
  /lmatrix "level 4 versus level 5" ed 0 0 0 1 -1
  /DESIGN=ed.

* Second, use the rep1 to rep4 variables computed above.
UNIANOVA income WITH rep1 to rep4 /PRINT=PARAMETER.
* Hmm. Those results do not match the CONTRAST or LMATRIX results above.

* Try again using ed1 to ed4 variables computed above.
UNIANOVA income WITH ed1 to ed4 /PRINT=PARAMETER.
* The coefficients for ed1 to ed4 do match the results from CONTRAST & LMATRIX.
* I suppose it could have something to do with the unbalanced design.
* I'll have to think about this some more.

Administrator

In reply to this post by Bruce Weaver
I found the answer to my own question.
* Aha. I found the solution. One must distinguish between
* the hypothesis matrix and the contrast matrix.
* "The generalized inverse of the hypothesis matrix yields the desired contrast matrix".
* Source: https://www.theoj.org/josspapers/joss.02134/10.21105.joss.02134.pdf.
* Let h = the hypothesis matrix for "repeated" contrasts with k = 4 levels.
MATRIX.
COMPUTE h = { 1, -1, 0, 0 ; 0, 1, -1, 0; 0, 0, 1, -1 }.
COMPUTE c = GINV(h).
PRINT h /TITLE="h = hypothesis matrix" /FORMAT=F5.0.
PRINT c /TITLE="c = contrast matrix = GINV(h)" /FORMAT=F5.3.
END MATRIX.

OUTPUT:

Run MATRIX procedure:

h = hypothesis matrix
     1    -1     0     0
     0     1    -1     0
     0     0     1    -1

c = contrast matrix = GINV(h)
  .750   .500   .250
 -.250   .500   .250
 -.250  -.500   .250
 -.250  -.500  -.750

------ END MATRIX -----
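A minimal sketch of the same calculation for a 5-level factor such as ed in telco.sav, assuming the adjacent-difference rows below are the intended REPEATED contrasts. The matrix names h5, c5 and chk are arbitrary and appear nowhere else in this thread. If the reasoning above holds, the columns of c5 should reproduce the codes behind ed1 to ed4 (.8 and -.2 for ed1, .6 and -.4 for ed2, and so on), and chk should be the 4 x 4 identity matrix.

```spss
* Sketch only: REPEATED contrasts for k = 5 levels.
* Each row of h5 compares one level with the next.
* Each column of c5 holds the codes for one computed predictor.
* chk = h5 * c5 should be the identity matrix.
MATRIX.
COMPUTE h5 = { 1, -1,  0,  0,  0;
               0,  1, -1,  0,  0;
               0,  0,  1, -1,  0;
               0,  0,  0,  1, -1 }.
COMPUTE c5 = GINV(h5).
COMPUTE chk = h5 * c5.
PRINT c5 /TITLE="c5 = GINV(h5): rows = levels, columns = predictors" /FORMAT=F6.3.
PRINT chk /TITLE="chk = h5 * c5" /FORMAT=F6.3.
END MATRIX.
```

Reading c5 column by column gives the values to use in RECODE for any number of levels, so nothing has to be worked out by hand.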

In reply to this post by Bruce Weaver
Bruce,
The first matrix is the contrast coefficient (L) matrix. The second matrix is the coding (C) matrix, i.e. the values of the design matrix. Each is the ginv() of the other (though this does not hold for all types of contrasts). I discuss this in the description of my macro "Categorical into contrast", !KO_CATCONT, which produces predictor variables for various contrast types. You can download it from my page: https://www.spsstools.net/en/macros/KOspssmacros/
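A short sketch of why the ginv() relation gives the coefficients their "adjacent categories" meaning, assuming L has full row rank and each of its rows sums to zero (both true for REPEATED contrasts). The symbols are introduced here for illustration only: \mu is the vector of category means (in COXREG, the log hazards of the categories up to an additive constant absorbed by the baseline hazard), \beta_0 is an intercept, \beta holds the coefficients of the coded predictors, and k is the number of categories.

```latex
\begin{aligned}
C &= L^{\top}\,(L L^{\top})^{-1}
  &&\Longrightarrow\quad L C = I, \\
\mu &= \mathbf{1}\,\beta_0 + C\beta, \qquad L\,\mathbf{1} = 0
  &&\Longrightarrow\quad L\mu = (L\,\mathbf{1})\,\beta_0 + (L C)\,\beta = \beta, \\
\beta_j &= \mu_j - \mu_{j+1}, \qquad j = 1, \dots, k-1 .
\end{aligned}
```

So every category can carry a nonzero code in each column of C while each coefficient still compares only two adjacent categories. Because L\mathbf{1} = 0, the result does not depend on \beta_0, which is consistent with COXREG giving the same interpretation without reporting an intercept.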
In reply to this post by Bruce Weaver
As I've already remarked, your terminology is not correct (to my mind). What you call the "hypothesis matrix" is the matrix of contrast coefficients, aka the contrast matrix, L for short. What you named the "contrast matrix" is the coding matrix, aka the basis matrix, C, of the contrast variables.

Administrator

In reply to this post by Kirill Orlov
Hello Kirill. I was using the terminology I found in the article I cited. But I see that other resources use notation & terminology more in line with yours. E.g.,
https://bookdown.org/pingapang9/linear_models_bookdown/chapcontrasts.html#connectionbetweencontrastandcodingschemes

Cheers,
Bruce

Contrast matrix vs. hypothesis matrix: this is an example of why it is important to be aware of how vocabulary varies across subdisciplines.
More about variations in stat vocabulary: Kinds_of_variables.pdf
Art Kendall
Social Research Consultants 
In reply to this post by Bruce Weaver
Bruce, more importantly, my terminology is in line with SPSS's. Actually, I borrowed it from David Nichols, who was a senior statistician at SPSS Inc. Here is one of his articles for users:
https://stats.oarc.ucla.edu/spss/library/spsslibraryunderstandingandinterpretingparameterestimatesinregressionandanova/
and this one:
https://stats.oarc.ucla.edu/spss/library/spsslibraryunderstandingcontrasts/