|
I'm attempting to use linear mixed models to analyze trends in several different measures over the course of a treatment (11 sessions), particularly to determine the effect of depression on the trajectories. When I run the LMM with the session number as the repeated measure, the model does not achieve convergence. But when I do not enter anything in the repeated statement and use session number as a fixed effect, convergence is achieved. The resulting intercepts and betas from these models are slightly different, but they're all significant.
Here's some sample code:
With REPEATED:
MIXED ZBeck_sum BY Depressed WITH SessionNum
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1) SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE) LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=Depressed SessionNum Depressed*SessionNum | SSTYPE(3)
  /METHOD=REML
  /PRINT=COVB R SOLUTION TESTCOV
  /RANDOM=INTERCEPT | SUBJECT(Subj_ID) COVTYPE(VC)
  /REPEATED=SessionNum | SUBJECT(Subj_ID) COVTYPE(UN).
Without REPEATED:
MIXED ZBeck_sum BY Depressed WITH SessionNum
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1) SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE) LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=Depressed SessionNum Depressed*SessionNum | SSTYPE(3)
  /METHOD=REML
  /PRINT=COVB R SOLUTION TESTCOV
  /RANDOM=INTERCEPT SessionNum | SUBJECT(Subj_ID) COVTYPE(UN).
I just don't understand the difference between using the REPEATED statement and using FIXED. Would anyone please explain this?
Thanks.
Karen R. Harker, MLS, MPH
Biostatistical Consultant
Adolescent Mood and Addictive Disorders Research Program UT Southwestern Medical Center
5323 Harry Hines Blvd. Dallas, TX 75390- 214-648-5391 Yahoo IM: karenharker |
|
Karen,
I assume your data set looks like this:
Subj_ID  Depressed  SessionNum  ZBeck_sum
1        1          1           22
1        1          2           11
1        1          3           8
.
.
.
1        1          7           34
2        1          1           23
2        1          2           11
2        1          3           4
.
.
.
2        2          7           2
3        2          1           6
3        2          2           6
3        2          3           5
.
.
.
N
ZBeck_sum may be a standardized variable, so the values I've assigned may be out of range.
Let's go over the first set of code you posted (Note that I removed the default convergence criteria):
MIXED ZBeck_sum BY Depressed WITH SessionNum
  /FIXED=Depressed SessionNum Depressed*SessionNum | SSTYPE(3)
  /METHOD=REML
  /PRINT=COVB R SOLUTION TESTCOV
  /RANDOM=INTERCEPT | SUBJECT(Subj_ID) COVTYPE(VC)
  /REPEATED=SessionNum | SUBJECT(Subj_ID) COVTYPE(UN).
This code indicates that your dependent variable is ZBeck_sum, and that you have two independent variables, one of which is categorical (Depressed) while the other is continuous (SessionNum). I have concerns about using SessionNum (which I presume means session number) as a continuous variable, but I'll look past that for now. The FIXED statement indicates that you have assigned fixed effects to Depressed, SessionNum, and their interaction.
Your RANDOM statement and REPEATED statement seem reasonable to me. That said, it is possible that a more parsimonious structure could be specified in the REPEATED statement without losing model fit (e.g. autoregressive), though the autoregressive structure assumes the intervals between adjacent sessions are equal in length. It also, of course, assumes a decaying correlation as time between sessions increases. The REPEATED and RANDOM statements in this model would be essentially redundant under the special circumstance where the correlation between any two observations from the same subject is equal (i.e. compound symmetry). If that's the case, then you can probably just drop the REPEATED statement.
Let's move on to the next chunk of code where you have actually dropped the REPEATED statement:
MIXED ZBeck_sum BY Depressed WITH SessionNum
  /FIXED=Depressed SessionNum Depressed*SessionNum | SSTYPE(3)
  /METHOD=REML
  /PRINT=COVB R SOLUTION TESTCOV
  /RANDOM=INTERCEPT SessionNum | SUBJECT(Subj_ID) COVTYPE(UN).
In this code, everything appears to be the same except that you have removed the REPEATED statement and added a random SessionNum slope. Note that you are also allowing the random effects (intercepts and slopes) to be correlated. You are specifying that SessionNum has both a fixed and a random effect.
Now, to your question about the difference between the FIXED and REPEATED statements: the FIXED statement specifies the fixed effects, while your REPEATED statement specifies the structure of the residual variance. If you treat a variable as having a fixed effect, then you are assuming that the coefficient associated with this fixed effect are *not* random values (they are fixed). If you were to treat a variable as having a random effect, then the coefficient would be considered random. It is certainly possible to assume that a variable has both a fixed effect and a random effect within the same model, as you have done with SessionNum.
Recall that the model variance-covariance matrix is made up of both the random effects variance-covariance matrix (G-matrix) and the residual variance-covariance matrix (R-matrix). Again, the REPEATED statement pertains to the residual (R) matrix, while the RANDOM statement pertains to the G matrix.
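To make the G/R distinction concrete, here is a minimal numeric sketch in Python (not SPSS, and not output from the models above). It assumes the usual decomposition V = Z G Z' + R for one subject; the variance values, the AR(1) correlation of 0.6, and the function names are all made up for illustration. It shows that a random intercept combined with independent residuals already implies equal covariance between every pair of sessions, whereas an AR(1) residual structure makes the covariance shrink as sessions get further apart.

```python
def ar1_matrix(n, sigma2, rho):
    """Residual (R) matrix under AR(1): sigma2 * rho**|i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def marginal_cov(Z, G, R):
    """Marginal covariance V = Z G Z' + R for one subject (nested lists)."""
    n, q = len(Z), len(G)
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            zgz = sum(Z[i][a] * G[a][b] * Z[j][b]
                      for a in range(q) for b in range(q))
            V[i][j] = zgz + R[i][j]
    return V


n_sessions = 4            # shortened from 11 sessions for readability
tau00, sigma2 = 0.5, 1.0  # hypothetical intercept and residual variances

Z = [[1.0] for _ in range(n_sessions)]  # design for a random intercept only
G = [[tau00]]                           # G matrix: one variance component

# R = sigma2 * I: residuals independent within a subject
R_identity = [[sigma2 if i == j else 0.0 for j in range(n_sessions)]
              for i in range(n_sessions)]

V_cs = marginal_cov(Z, G, R_identity)                              # compound symmetry
V_ar1 = marginal_cov(Z, G, ar1_matrix(n_sessions, sigma2, rho=0.6))  # decaying

for name, V in (("random intercept + identity R", V_cs),
                ("random intercept + AR(1) R", V_ar1)):
    print(name)
    for row in V:
        print("  ", [round(v, 3) for v in row])
```

In the first matrix every off-diagonal entry equals tau00, which is the equal-correlation case where RANDOM=INTERCEPT and a compound-symmetric REPEATED structure would describe the same thing.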
It isn't surprising to me that the fixed effects estimates are different when you run both pieces of code. You are fitting different models.
Ryan
On Thu, Jun 10, 2010 at 2:01 PM, Karen Harker <[hidden email]> wrote:
|
|
Grammatical correction to previous post (IN CAPS):
"...you treat a variable as having a fixed effect, then you are assuming that the coefficient associated with this fixed effect IS *not* A random value (IT IS fixed)."
Ryan
On Mon, Jun 14, 2010 at 9:55 AM, Ryan Black <[hidden email]> wrote:
|
|
In reply to this post by Ryan
Ryan,
I wonder if you can help me understand two things that come up in your reply to Karen. The first is the equations implied by the model statements. In Karen's second model, this one,
MIXED ZBeck_sum BY Depressed WITH SessionNum
  /FIXED=Depressed SessionNum Depressed*SessionNum | SSTYPE(3)
  /RANDOM=INTERCEPT SessionNum | SUBJECT(Subj_ID) COVTYPE(UN).
the level 1 model equation is just zbeck_sum(i,j) = b0(i) + b1(i)*sessionnum(i,j) + e(i,j), where j is session and i is person. But in her first model, what are the level 1 and 2 model equations? Can you help me understand that?
MIXED ZBeck_sum BY Depressed WITH SessionNum
  /FIXED=Depressed SessionNum Depressed*SessionNum | SSTYPE(3)
  /RANDOM=INTERCEPT | SUBJECT(Subj_ID) COVTYPE(VC)
  /REPEATED=SessionNum | SUBJECT(Subj_ID) COVTYPE(UN).
Thanks,
Gene Maguin |
|
Gene,
I'm not sure why you removed the other two fixed effects when you wrote out the equation. If you want to include only the SessionNum fixed effect (treating SessionNum as a continuous variable) while accounting for a random intercept and slope, then you would write the code as:
MIXED ZBeck_sum WITH SessionNum
  /FIXED=SessionNum | SSTYPE(3)
  /RANDOM=INTERCEPT SessionNum | SUBJECT(Subj_ID) COVTYPE({Specify COV MATRIX}).
and I would write out the equations as:
Level 1 Equation:
ZBeck_sum_ij = B0j + B1j*SessionNum_ij + eij
where
i indexes the session and j indexes the subject (the reverse of your notation)
B0j is the random intercept for subject j
B1j is the random slope for subject j
eij is the error term for session i of subject j
Level 2 Equations:
B0j = Gamma00 + u0j
B1j = Gamma10 + u1j
where
B0j (random intercept) is a function of the fixed intercept, Gamma00, and error term, u0j
B1j (random slope) is a function of the fixed slope, Gamma10, and error term, u1j
Full Equation (using substitution):
ZBeck_sum_ij = Gamma00 + Gamma10*SessionNum_ij
+ ( u0j + u1j*SessionNum_ij + eij )
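As a rough illustration of what the random part of the full equation implies, the Python sketch below computes the covariance between two sessions s and t of the same subject, assuming Var(u0j) = tau00, Var(u1j) = tau11, Cov(u0j, u1j) = tau01, and independent residuals with variance sigma2. Those symbols, the numeric values, and the function name are assumptions made for this example, not estimates from Karen's data.

```python
def implied_cov(s, t, tau00, tau01, tau11, sigma2):
    """Cov(y_s, y_t) implied by u0j + u1j*SessionNum + eij for one subject.

    Cov = tau00 + (s + t)*tau01 + s*t*tau11, plus sigma2 when s == t.
    """
    cov = tau00 + (s + t) * tau01 + s * t * tau11
    if s == t:
        cov += sigma2  # residual variance only contributes on the diagonal
    return cov


# Hypothetical variance components (chosen so the 2x2 G matrix is valid)
params = dict(tau00=1.0, tau01=-0.05, tau11=0.02, sigma2=0.5)

sessions = range(1, 12)  # 11 weekly sessions
variances = [implied_cov(t, t, **params) for t in sessions]

for t, v in zip(sessions, variances):
    print(f"session {t:2d}: implied variance = {v:.2f}")

# Covariance between session 1 and two later sessions
cov_1_2 = implied_cov(1, 2, **params)
cov_1_11 = implied_cov(1, 11, **params)
print(f"cov(1, 2) = {cov_1_2:.2f}, cov(1, 11) = {cov_1_11:.2f}")
```

Unlike the random-intercept-only case, the implied variances and covariances here change with session number, which is one way to see that the random-slope model and the REPEATED model are different models.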
Ryan
On Mon, Jun 14, 2010 at 11:00 AM, Gene Maguin <[hidden email]> wrote:
Ryan, |
|
In reply to this post by Ryan
Thanks for your reply. Your assumptions about my data were correct and your explanations make sense.
BTW, SessionNum is continuous and ranges from 1 to 11, but your concern about treating it as continuous is noted. It is not normally distributed, so now I am concerned, as well. The time interval between sessions is one week. I ran a model treating SessionNum as a categorical factor, but the model is less parsimonious: more levels and higher information criteria.
The code I listed specifies an unstructured covariance matrix, but only to get a sense of what the structure is really like. For Beck, the correlation does not seem to have a recurring pattern across sessions; for other measures (e.g. # of cigarettes smoked), the correlations do diminish over time.
You noted that you removed the convergence criteria. This has been a problem when I fitted some of my models of different dependent variables (e.g. Beck or # of cigarettes smoked), in that convergence is not always achieved. I end up changing my model in order to achieve convergence. Should I be doing that, based on the assumption that a model that doesn't reach convergence doesn't make sense? Usually, the changes I make are how SessionNum is treated (e.g. as REPEATED or as RANDOM or both).
Thanks, again.
|
|
|
Karen, if the time interval between sessions is always one week, you are better off treating it as a continuous variable IMO. Treating it as categorical eats up too many degrees of freedom needlessly. Furthermore, there is no requirement that Session or any other continuous explanatory variable be normally distributed. In regression models, it is the "errors" (as estimated by the residuals) that are assumed to be (approximately) normal.
If the time interval is not constant, you can still model time as a continuous variable; but you would need the actual time (from baseline) for each session, not session number.
HTH.
Bruce
p.s. - For more on the distinction between "errors" and "residuals", see the Wikipedia page: http://en.wikipedia.org/wiki/Errors_and_residuals_in_statistics
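A small sketch of the point about using actual time from baseline when sessions are not evenly spaced: the Python snippet below converts session dates to weeks since the first session. The dates and the function name are hypothetical, and it assumes a date is recorded for each session.

```python
from datetime import date


def weeks_from_baseline(session_dates):
    """Return elapsed time in weeks from the earliest session date."""
    baseline = min(session_dates)
    return [(d - baseline).days / 7 for d in session_dates]


# Hypothetical subject who skipped one week between sessions 2 and 3
dates = [date(2010, 1, 4), date(2010, 1, 11), date(2010, 1, 25), date(2010, 2, 1)]
weeks = weeks_from_baseline(dates)

for session_num, w in enumerate(weeks, start=1):
    print(f"session {session_num}: {w:.1f} weeks from baseline")
```

Here the session numbers would be 1, 2, 3, 4, but the elapsed times are 0, 1, 3, and 4 weeks, so the continuous time covariate would carry the skipped week that session number hides.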
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
