SPSSX Discussion

Problem with Linear Mixed in Nested QED


William Dudley-2

Problem with Linear Mixed in Nested QED

Hello

I need some help from Multilevel Modelers out there.

Apologies in advance for this lengthy post.

I am trying to model outcomes in a quasi-experimental design in which schools (SchoolID) are assigned to Treatment or Control.

Students (RIC) are nested within schools and all students in a given school receive the same level of the treatment (or absence thereof).

The data collection design is a bit complex in that we have both 11th and 12th graders and data come from two years of the study (study year 2 and study year 3 ).

Some 12th graders are in the study just in year 2 (Cell 2A) and some 11th graders are in the study just in year 3 (Cell 3C).

However, most 11th graders in year 2 (Cell 2B) are then 12th graders in year 3 (Cell 3B).

Cell 2A (Year 2): 12th graders, in the study for one year
Cell 2B (Year 2): 11th graders, who may have advanced to 12th grade
Cell 3B (Year 3): 12th graders, many of whom were in the study the previous year (Cell 2B)
Cell 3C (Year 3): 11th graders, who may be 12th graders next year

There are about 6,000 observations in each cell, and all but about 1,000 students in Cell 2B advance to Cell 3B.

I would like to use all of the data; the alternative is to jettison Cell 2B, or at least the students in 2B who are also in 3B.

The total number of observations is about 24,000; the total number of students is about 18,000.

I have three dummy codes:

TX (Treatment = 1, Control = 0)
Year_3 (Year 3 = 1, Year 2 = 0)
Grade_12 (Grade 12 = 1, Grade 11 = 0)

I have tried to model this (excluding any covariates ) to examine accrual of college credits as follows:

MODEL 1: This model produces a warning that the residual term for RIC*SchoolID is redundant and cannot be estimated.

MIXED collcred WITH Tx Year_3 grade_12
/FIXED= tx Year_3 grade_12
/PRINT = SOLUTION TESTCOV
/RANDOM = INTERCEPT | SUBJECT(RIC)
/RANDOM = INTERCEPT | SUBJECT(RIC*Schoolid)
/METHOD=REML.

MODEL 2: This model crashes and gives an error saying that memory has been exceeded.

MIXED collcred WITH Tx Year_3 grade_12
/FIXED= tx Year_3 grade_12
/PRINT = SOLUTION TESTCOV
/RANDOM = INTERCEPT | SUBJECT(Schoolid)
/RANDOM = INTERCEPT | SUBJECT(RIC *Schoolid)
/METHOD=REML.

MODEL 3: This works in that the model runs without warnings and the estimates are reasonable, and students are nested within school (which is critical), BUT it fails to take into account the dependency between observations of the same students in Cells 2B and 3B.

MIXED collcred WITH Tx Year_3 grade_12
/FIXED= tx Year_3 grade_12
/PRINT = SOLUTION TESTCOV
/RANDOM = INTERCEPT | SUBJECT(Schoolid)
/METHOD=REML.

One option is to go with Model 3, BUT I'd rather NOT ignore the dependency in observations between Cells 2B and 3B.

Another option is to drop most of Cell 2B, keeping only those students in Cell 2B who are not in Cell 3B.
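For reference, the option of dropping Cell 2B rows for students who reappear in Cell 3B could be sketched as follows. This is only an illustration, assuming each student has at most one row per study year; the variables n_obs and keep_row are made-up names, not part of the original data.

```spss
* Count the rows per student and attach the count to every row.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=RIC
  /n_obs=N.
* Drop the Year 2 row of any student who has more than one row.
COMPUTE keep_row = NOT (n_obs > 1 AND Year_3 = 0).
FILTER BY keep_row.
* Model 3 can then be rerun with one row per student.
MIXED collcred WITH Tx Year_3 grade_12
  /FIXED = Tx Year_3 grade_12
  /PRINT = SOLUTION TESTCOV
  /RANDOM = INTERCEPT | SUBJECT(Schoolid)
  /METHOD = REML.
FILTER OFF.
```

With one row per student there is no within-student dependency left to model, at the cost of discarding most of the Year 2 data for 11th graders.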

Any thoughts would be greatly appreciated.

Bill

William N. Dudley, PhD
Professor - Public Health Education
The School of Health and Human Sciences
The University of North Carolina at Greensboro

437-L Coleman Building

Greensboro, NC 27402-6170
See my research on

GoogleScholar

ResearchGate
VOICE 336.256 2475

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Ryan

Re: Problem with Linear Mixed in Nested QED

Bill,

The two RANDOM statements in Model 1 are redundant. The first statement estimates student random effects, and the second statement also specifies student random effects. (I assume student ID is not nested in school ID; in other words, each student ID value is unique across school IDs.)
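The assumption about student IDs can be checked by counting how many schools each RIC value appears in. A minimal sketch using the variable names from the original post; the dataset name ric_school and the variables n_obs and n_schools are made up for illustration.

```spss
* Build a dataset with one row per student-by-school combination.
DATASET DECLARE ric_school.
AGGREGATE
  /OUTFILE='ric_school'
  /BREAK=RIC Schoolid
  /n_obs=N.
DATASET ACTIVATE ric_school.
* Count the distinct schools in which each student ID occurs.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=RIC
  /n_schools=N.
FREQUENCIES VARIABLES=n_schools.
```

If n_schools equals 1 for every RIC, then SUBJECT(RIC) and SUBJECT(RIC*Schoolid) define the same set of subjects, which is consistent with the redundancy warning.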

Model 2 correctly specifies student and school random effects. You might be able to reduce the computational burden of fitting Model 2 by nesting student IDs within school IDs. In other words, number the students in school 1 with integers from 1 to the number of students in that school, do the same for school 2, and so on.
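The renumbering might look something like the sketch below. It assumes the original data file is the active dataset and that a student's two rows (Year 2 and Year 3) should share one within-school ID; RIC_in_school is a made-up variable name. Whether this actually avoids the memory error is not confirmed anywhere in this thread.

```spss
* Sort so that rows of the same school and student are adjacent.
SORT CASES BY Schoolid RIC.
DO IF ($CASENUM = 1 OR Schoolid <> LAG(Schoolid)).
* First row of a school, so restart the counter at 1.
  COMPUTE RIC_in_school = 1.
ELSE IF (RIC = LAG(RIC)).
* Same student as the previous row, so reuse the number.
  COMPUTE RIC_in_school = LAG(RIC_in_school).
ELSE.
* New student in the same school, so take the next integer.
  COMPUTE RIC_in_school = LAG(RIC_in_school) + 1.
END IF.
EXECUTE.
* Model 2 refit with the within-school student ID.
MIXED collcred WITH Tx Year_3 grade_12
  /FIXED = Tx Year_3 grade_12
  /PRINT = SOLUTION TESTCOV
  /RANDOM = INTERCEPT | SUBJECT(Schoolid)
  /RANDOM = INTERCEPT | SUBJECT(RIC_in_school*Schoolid)
  /METHOD = REML.
```

Because RIC_in_school repeats across schools, it only identifies a student in combination with Schoolid, which is why the second RANDOM statement keeps the RIC_in_school*Schoolid form.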

If you have to choose one random effect term to account for, it’s generally preferable to account for the lower level random effect term (student) rather than the upper level (school).
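As a sketch of the single-term fallback, the model below keeps only the student-level random intercept, using the variable names from the original post. It assumes RIC values are unique across schools; it accounts for the repeated observations in Cells 2B and 3B but not for clustering of students within schools.

```spss
* Random intercept for students only, with no school-level term.
MIXED collcred WITH Tx Year_3 grade_12
  /FIXED = Tx Year_3 grade_12
  /PRINT = SOLUTION TESTCOV
  /RANDOM = INTERCEPT | SUBJECT(RIC)
  /METHOD = REML.
```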

Ryan


On Jun 26, 2018, at 8:53 AM, William Dudley <[hidden email]> wrote:
