SPSSX Discussion - Re: Missing values in MIXED

Re: Missing values in MIXED

Posted by Ryan on Mar 17, 2013; 12:40pm
URL: http://spssx-discussion.165.s1.nabble.com/Missing-values-in-MIXED-tp5718714p5718751.html

Diana,

See my comments below.

On Mar 17, 2013, at 4:43 AM, "Kornbrot, Diana" <[hidden email]> wrote:

Ryan

Thanks
Done all that. Converting horizontal to vertical is straightforward using the data structuring wizard [don’t need syntax], once one gets the hang of it

My ACTUAL question was:
MIXED with data in long form can cope with missing data, with correction for denominator df
GLM REPEATED insists on NO missing data
So what is the difference?

With the help of Bruce Weaver, I have NOW worked out that the difference lies in the covariance matrix used for estimation of parameters
REPEATED applies listwise deletion and so discards any subjects that do not have values for all variables;
MIXED applies pairwise deletion.

That is exactly what I showed in the illustration.

Suspect the reduced df is harmonic mean of df for relevant groups, but do not know

No need to suspect. I provided a link to the formula for df error. I don't know what you mean by reduced.

Bruce provides following useful refs that suggest that using MIXED may actually be less biased than any of a whole slew of complicated imputation procedures:
Twisk & de Vente (2002): http://europepmc.org/abstract/MED/11927199
Twisk (2003):
http://books.google.ca/books?hl=en&lr=&id=TCg02e-tI_cC&oi=fnd&pg=PR15&dq=Twisk+2003&ots=2GfodRIiu9&sig=z8BSBQoRaZNavIzj_QOeATBP_nw#v=onepage&q=Twisk%202003&f=false
Singer & Willett (/Applied Longitudinal Data Analysis/, Chapter 5).

I NOW recommend MIXED with UNSTRUCTURED covariance matrix across the board.

That is a poor recommendation. The goal should be to find the optimal residual variance-covariance structure. You could reduce statistical power by employing an unstructured matrix if a more parsimonious structure fits the data equally well (e.g., AR1, TOEP). There may be other aspects of your data to model as well (e.g., G-side random effects that should be incorporated).
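To make the parsimony point concrete, here is a minimal Python sketch (not SPSS syntax) that counts the free residual covariance parameters for a few common structures with k repeated measures, and builds an AR(1) matrix. The function names `n_cov_params` and `ar1_matrix` are hypothetical, introduced only for this illustration; the counts assume the homogeneous-variance forms of TOEP, AR1 and CS.

```python
def n_cov_params(structure: str, k: int) -> int:
    """Number of free residual covariance parameters for k repeated measures.

    Assumes homogeneous-variance versions of TOEP, AR1 and CS.
    """
    counts = {
        "UN": k * (k + 1) // 2,  # every variance and covariance is free
        "TOEP": k,               # one variance plus k-1 lag-specific covariances
        "AR1": 2,                # one variance and one autocorrelation
        "CS": 2,                 # one variance and one common covariance
    }
    return counts[structure]


def ar1_matrix(k: int, variance: float, rho: float) -> list:
    """AR(1) covariance matrix: cov(i, j) = variance * rho ** |i - j|."""
    return [[variance * rho ** abs(i - j) for j in range(k)] for i in range(k)]


if __name__ == "__main__":
    # With k = 4 time points, compare how many parameters each structure costs.
    for name in ("UN", "TOEP", "AR1", "CS"):
        print(name, n_cov_params(name, 4))
```

With four time points the unstructured matrix estimates ten parameters where AR(1) estimates two. If the simpler structure fits equally well, those extra parameters buy nothing.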

No doubt it will take time to ‘filter down’ to all users
Output much simpler as all inferential tests in 1 table
Can do appropriate post hoc or planned comparisons with standard errors correctly estimated from unstructured covariance matrix.

That is not only true for the unstructured matrix.

MIXED has limitation of not supplying effect sizes.
Jason Becksted points out that one can calculate partial eta squared = F*df1/(F*df1+df2), where df1 is the hypothesis df and df2 is the error df.

So did I, publicly, when you asked. And I pointed out that one would have to employ ML to use that same formula to obtain partial eta squared from a fully balanced fixed effects only design. But, I would question the validity of using that formula under all circumstances, which is why I provided the alternative. For example, what if you are trying to determine the effect size of a random effect? What if your fixed effect predictor is at a higher level? There have been plenty of discussions on this matter on the multilevel listserve and in multilevel textbooks. I would not simply apply that formula to all circumstances. In fact, I would generally recommend using the second approach I showed.
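For reference, the formula quoted above can be sketched in a few lines of Python. The function name `partial_eta_squared` and the example values are hypothetical, chosen only for illustration. The caveats above still apply: this is a sketch of the arithmetic, not a claim that the result is a valid effect size for every mixed model.

```python
def partial_eta_squared(f: float, df1: float, df2: float) -> float:
    """Partial eta squared from an F statistic.

    f   : F statistic for the fixed effect
    df1 : hypothesis (numerator) df
    df2 : error (denominator) df; may be non-integer in MIXED output
    """
    return (f * df1) / (f * df1 + df2)


if __name__ == "__main__":
    # Hypothetical values, not taken from any real output.
    print(partial_eta_squared(4.0, 2, 20))
```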

REPEATED, no doubt ground breaking in its time [distant past], is fiddly & potentially misleading. Although the multivariate option uses correct unstructured covariance matrix, the post hocs use SEs based on inappropriate diagonal covariance matrix, with GG corrections. Personally, have never seen a covariance matrix with all pairwise covariances equal – seems improbable in the real world.

Again, there are alternatives to both extremes. It is not one versus the other.

Best

Diana

On 16/03/2013 16:49, "R B" <ryan.andrew.black@...> wrote:

Diana,

In order to employ a linear mixed model in SPSS, one must construct the dataset in vertical format, such that there are "k" cases per subject, with an identification variable that takes the same value for all cases associated with a particular subject. Assuming the within-subjects variable is either nominal, ordinal, or is composed of equally-spaced intervals, it is common practice for the within-subjects variable to be a numeric integer variable with sequential values from 1 through "k" levels of the within-subjects variable. Finally, the response variable must be concatenated vertically with each measurement linked to the appropriate ID and level of the within-subjects variable.

Here is an illustration:

ID Time y
1   1    34
1   2    22
1   3    12
1   4    11
2   1    33
2   2    32
2   3    .
2   4    22
3   1    38
3   2    37
3   3    34
3   4    30
.
.
.
.

As you can see above, the second subject was not measured at time 3. As a result, that case will be excluded from the linear mixed model analysis. However, data obtained from other time points for that particular subject will be included in the analysis. The assumption we must make in order to obtain unbiased estimates derived from a linear mixed model is that the data are missing at random. With that said, the MIXED procedure in SPSS calculates degrees of freedom using Satterthwaite's Approximation:

http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Falg_mixed_custom-tests_satterthwaite.htm

This approximation has been shown to be valid for balanced and unbalanced designs.
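To give a feel for why the error df comes out non-integer, here is a minimal Python sketch of the simplest special case of this idea: the Welch-Satterthwaite df for comparing two independent group means. This is an assumption-laden illustration, not the formula MIXED uses for general linear mixed models (see the link above for that). The function name `welch_satterthwaite_df` and the input values are hypothetical.

```python
def welch_satterthwaite_df(var1: float, n1: int, var2: float, n2: int) -> float:
    """Approximate df for the difference of two independent group means.

    var1, var2 : sample variances of the two groups
    n1, n2     : group sizes
    """
    a = var1 / n1  # squared standard error contribution of group 1
    b = var2 / n2  # squared standard error contribution of group 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


if __name__ == "__main__":
    # Equal variances and equal n recover the usual n1 + n2 - 2.
    print(welch_satterthwaite_df(4.0, 10, 4.0, 10))
    # Unequal variances and unequal n give a fractional df.
    print(welch_satterthwaite_df(4.0, 10, 16.0, 5))
```

The df is a function of the estimated variance components and the amount of data behind each one, so with unbalanced data it will generally not be a whole number.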

In addition to the benefits of not having to exclude all data from subjects who happen to have data which are missing randomly for parameter estimation, the MIXED procedure allows for modeling of continuous response variables using various hierarchical designs and residual covariance structures.
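The difference in how much data each approach retains can be sketched in plain Python using the three subjects from the illustration above. This only counts rows; it does not fit any model. The helper names `available_rows` and `listwise_rows` are hypothetical.

```python
# Long-format rows from the illustration: (ID, Time, y); None marks the missing value.
rows = [
    (1, 1, 34), (1, 2, 22), (1, 3, 12), (1, 4, 11),
    (2, 1, 33), (2, 2, 32), (2, 3, None), (2, 4, 22),
    (3, 1, 38), (3, 2, 37), (3, 3, 34), (3, 4, 30),
]


def available_rows(data):
    """Long-format handling: drop only the rows whose response is missing."""
    return [r for r in data if r[2] is not None]


def listwise_rows(data):
    """Listwise deletion: drop every row of any subject with a missing response."""
    incomplete = {sid for sid, _, y in data if y is None}
    return [r for r in data if r[0] not in incomplete]


if __name__ == "__main__":
    print(len(available_rows(rows)), "observations kept in long format")
    print(len(listwise_rows(rows)), "observations kept under listwise deletion")
```

For these three subjects, the long-format approach keeps three of subject 2's four measurements, whereas listwise deletion discards that subject entirely.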

Ryan
On Fri, Mar 15, 2013 at 11:46 AM, Kornbrot, Diana <d.e.kornbrot@...> wrote:

If one uses REPEATED in procedure GLM then it appears that all subjects must have values for all combinations of the repeated measures
BUT using MIXED, there is then a non-integer error df
How is SPSS actually handling the missing values?
Nb Am using unstructured covariance matrix

Thanks for help
Best
Diana

Emeritus Professor Diana Kornbrot
email: d.e.kornbrot@...
web:    http://dianakornbrot.wordpress.com/
Work
Department of Psychology
School of Life and Medical Sciences
University of Hertfordshire
College Lane, Hatfield, Hertfordshire AL10 9AB, UK
voice:   +44 (0) 170 728 4626
Home
19 Elmhurst Avenue
London N2 0LT, UK
voice:   +44 (0) 208 444 2081
mobile: +44 (0) 740 318 1612
