Options for Dealing with Missing Data in Mixed ANOVA


Ecushla
Hi,

I am looking to assess the effect of an intervention (vs control) on a number of outcomes at 4 time points. I have one between-groups IV (group), one within-groups IV (time), and many continuous outcomes (DVs), and am therefore looking to do a mixed ANOVA.

I know a linear mixed effects model is better at dealing with missing data, but I think this is beyond my ability at this point in time.  I have also read a lot that suggests carrying forward the last observation is never a good idea and can introduce more bias. I am also going to check that the missing data points are not correlated with demographic variables.

Is carrying forward the last observation (LOCF) for missing time points suitable in mixed ANOVA or is there another way I could be dealing with this - other than perhaps running the mixed ANOVA on both the original dataset and the LOCF dataset?

Thank you for your help
Brooklyn

Re: Options for Dealing with Missing Data in Mixed ANOVA

Joost van Ginkel
Dear Brooklyn,
 
The best way to deal with missing data is multiple imputation (in SPSS: Analyze, Multiple Imputation, Impute Missing Data Values). The procedure creates several plausible complete versions of the incomplete dataset. Each of these complete versions is then analyzed with the statistical analysis of interest (a mixed ANOVA in your case), yielding several different outcomes of your analysis. In the final step these analyses are pooled into one overall analysis. If you analyze a multiply imputed dataset in SPSS, SPSS automatically recognizes it as a multiple imputation dataset and pools the results for you, but only for some statistical procedures. Unfortunately, SPSS does no pooling for ANOVA. However, the following paper deals with this problem:
 
Van Ginkel, J. R., & Kroonenberg, P. M. (2014). Analysis of variance of multiply imputed data. Multivariate Behavioral Research, 49, 78–91.
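
For a single scalar estimate (say, one regression coefficient), the pooling step is commonly done with Rubin's rules. The Python sketch below only illustrates that general idea: it is not the MI-MUL2 macro and does not pool ANOVA F tests, which is the harder problem the paper addresses. The function name and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one scalar estimate across m imputed datasets (Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b      # total variance of the pooled estimate
    return q_bar, t


# Hypothetical results from m = 5 imputed datasets:
# one estimate and its squared standard error per dataset.
estimates = [2.0, 2.4, 2.2, 1.8, 2.1]
variances = [0.25, 0.30, 0.28, 0.26, 0.31]

pooled_estimate, total_variance = pool_rubin(estimates, variances)
print(pooled_estimate, total_variance)
```

The between-imputation term is what carries the extra uncertainty due to the missing data; analyzing a single imputed dataset would leave it out.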
 
The paper comes with an SPSS macro, called MI-MUL2.sps, which can be downloaded from:
 
 
If you need any help with this, please feel free to contact me.
 
Best regards,
 
Joost van Ginkel
 
--
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
 
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
 

Re: Options for Dealing with Missing Data in Mixed ANOVA

Rich Ulrich
In reply to this post by Ecushla

What you /properly/ do about missing data has to take into account several factors, with the easiest being, "Why is it missing?" and "How much are we talking about?" (compared to how much total data).  In a design with four time points, you /might/ be talking -- partly or entirely -- about people who dropped out ... which could be "at random" or due to "treatment failure."   Also, "Do Intervention and Control show the same amount and have the same reasons?"

  - A small amount of "missing", distributed equally between Intervention and Control, seemingly "missing at random", is the easy case to cope with, since the choice of solution will not make much difference.

  - Carrying forward the last observation is only reasonable when that score /does/ represent what you would expect their status to be at that time.  And when there is not very much of it.
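
To make concrete what carrying the last observation forward does to one subject's record, here is a minimal Python sketch. The function name and the scores are hypothetical; None marks a missed assessment.

```python
def locf(scores):
    """Fill each missing value (None) with the most recent observed value.

    A missing value before any observation stays None, since there is
    nothing to carry forward.
    """
    filled = []
    last = None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


# Hypothetical subject who dropped out after the second of four assessments.
dropout = [30, 24, None, None]
print(locf(dropout))  # [30, 24, 24, 24]
```

The filled record states that the subject's score stayed frozen at 24 for the last two time points, which is exactly the assumption that has to be defensible for LOCF to be reasonable.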


Hope this helps.

--

Rich Ulrich


View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Options-for-Dealing-with-Missing-Data-in-Mixed-ANOVA-tp5733847.html

Re: Options for Dealing with Missing Data in Mixed ANOVA

Bruce Weaver
Administrator
In reply to this post by Ecushla
You say that using the MIXED procedure is beyond your ability at this time.  I think you are giving up far too easily on that approach.  It sounds like your analysis is structurally identical to one of the examples on this UCLA page:

  http://www.ats.ucla.edu/stat/spss/seminars/Repeated_Measures/

Scroll down to "Exercise example, model 2 using MIXED Command".  In that example, Exercise Type is a between-Ss factor and Time a repeated measures factor.  So you have a good model of what your MIXED syntax should look like.  

Re LOCF, you might find David Streiner's short note on it useful.

  http://ebmh.bmj.com/content/11/1/3.2

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Options for Dealing with Missing Data in Mixed ANOVA

Ecushla
In reply to this post by Rich Ulrich
Rich Ulrich wrote
> What you /properly/ do about missing data has to take into account several factors, with the easiest being, "Why is it missing?" and "How much are we talking about?" (compared to how much total data).
>
>  - Carrying forward the last observation is only reasonable when that score /does/ represent what you would expect their status to be at that time.  And when there is not very much of it.

Thanks Rich, I am looking into the reasons for the missing data at the moment.  Most is due to loss of patient contact, so I am unable to get a reason; however, there are a couple of cases where the reason for dropout is that they were not happy with the intervention.

I think you have confirmed that carrying forward the last observation is not the best approach.
Kind regards
Ecushla

Re: Options for Dealing with Missing Data in Mixed ANOVA

Ecushla
In reply to this post by Bruce Weaver
Bruce Weaver wrote
> You say that using the MIXED procedure is beyond your ability at this time.  I think you are giving up far too easily on that approach.  It sounds like your analysis is structurally identical to one of the examples on this UCLA page:
>
>   http://www.ats.ucla.edu/stat/spss/seminars/Repeated_Measures/

Thanks Bruce,

The Uni statistician strongly advised me against trying to learn the linear mixed effects model without attending a proper course or having solid regression knowledge. It sounds like it is the best statistical approach, though. I might look into some online videos etc. and see if I can work it out!  Thanks for the heads-up that it might be doable!

I thought the mixed effects model handles missing data by using all available data (i.e. not dropping cases), and therefore I would not need to add, substitute, or impute missing data. However, I was told: "While a linear mixed effects model handles missing data better than other statistical methods, it cannot be used as a substitute for imputation. With longitudinal data, you would first carry the last observation forward and then reshape the data (into long format) and run your mixed models." This seems contradictory.  Am I missing a basic point?

Re: Options for Dealing with Missing Data in Mixed ANOVA

Bruce Weaver
Administrator
MIXED requires a long file structure (i.e., one row per repeated measure, so multiple rows per ID), and does indeed use all available data for each subject.  If I remember correctly, it assumes that the missing data are at worst "missing at random" (MAR).
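
As an illustration of the long structure (not of SPSS itself), the Python sketch below reshapes hypothetical wide records into one row per observed measurement. The variable names (id, group, t1..t4) and the scores are made up; None marks a missing assessment.

```python
def wide_to_long(wide_rows, time_keys):
    """Turn one-row-per-subject records into one row per observed measurement."""
    long_rows = []
    for row in wide_rows:
        for time_index, key in enumerate(time_keys, start=1):
            if row[key] is None:
                continue  # a missing assessment simply contributes no row
            long_rows.append({
                "id": row["id"],
                "group": row["group"],
                "time": time_index,
                "score": row[key],
            })
    return long_rows


# Hypothetical data: subject 2 dropped out after the second assessment.
time_keys = ["t1", "t2", "t3", "t4"]
wide = [
    {"id": 1, "group": "control", "t1": 20, "t2": 19, "t3": 21, "t4": 20},
    {"id": 2, "group": "intervention", "t1": 22, "t2": 18, "t3": None, "t4": None},
]

long_rows = wide_to_long(wide, time_keys)
complete_cases = [r for r in wide if all(r[k] is not None for k in time_keys)]

print(len(long_rows))       # 6 rows: subject 2 still contributes two observations
print(len(complete_cases))  # 1 subject: what a complete-case analysis would keep
```

The contrast in the last two lines is the point: in long format the dropout's two observed scores stay in the analysis, whereas an analysis that needs complete wide records loses that subject entirely.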

In my view, LOCF is a very poor way to deal with missing data--see the Streiner article I posted earlier.  If you are going to use some kind of imputation, use multiple imputation instead.  

HTH.



Re: Options for Dealing with Missing Data in Mixed ANOVA

Ecushla
Yes, I have come to the conclusion NOT to do LOCF. 

So am I correct in my understanding that I could 

1) do a mixed ANOVA with some sort of multiple imputation (to impute values for the missing data, as per J.R. Van Ginkel's post and publication)

OR

2) run the linear mixed effects model on my raw data (I do not need to do any imputation to account for the missing data; the mixed effects model deals with this).

Are both suitable options?




Re: Options for Dealing with Missing Data in Mixed ANOVA

Rich Ulrich
In reply to this post by Ecushla


Oh, I prefer that my answer be read as, "There is no answer where
one-size-fits-all."  I don't know enough about your data to know that
LOCF is not the best choice for it.

By the way, as to another question:  Your respondent does not seem
to be distinguishing clearly between "mixed model" and "SPSS MIXED"
procedure.  Fixed effects.  Random effects.  Some of each gives you
Mixed model.  An analysis that ignores Missing is an "unbalanced
blocks" design -- to give one technical name for it.

I have had data where there were varying lengths of followup where
I found it instructive to look separately at cases according to how
many followups existed -- this was out of a dozen periods.  We
noticed, for instance, that cases who Relapsed (thus, "dropping out")
showed an increase of symptoms at both the last period and the
one just before it.  That was true for both drug-treated and control.
 - That would be impossible to spot with any analysis I've seen
mentioned in this discussion... which has generally assumed few
dropouts.  And you have only four assessments.
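
One way to look at cases separately by how many assessments they completed is sketched below in Python. It is only an illustration with made-up scores, not the analysis used on the data described above; None marks a missing assessment.

```python
from collections import defaultdict
from statistics import mean


def means_by_followup_count(records):
    """Group subjects by number of observed assessments, then average per time."""
    groups = defaultdict(list)
    for scores in records:
        n_observed = sum(score is not None for score in scores)
        groups[n_observed].append(scores)

    summary = {}
    for n_observed, members in sorted(groups.items()):
        per_time = []
        for t in range(len(members[0])):
            observed = [m[t] for m in members if m[t] is not None]
            per_time.append(mean(observed) if observed else None)
        summary[n_observed] = per_time
    return summary


# Hypothetical symptom scores at four time points.
records = [
    [10, 9, 8, 7],
    [12, 11, 10, 9],
    [11, 14, None, None],  # dropped out after time 2
    [9, 12, None, None],   # dropped out after time 2
]

summary = means_by_followup_count(records)
print(summary[4])  # completers: [11, 10, 9, 8]
print(summary[2])  # early dropouts: [10, 13, None, None]
```

In this invented example the completers improve steadily while the early dropouts worsen just before leaving, a pattern that a single pooled analysis would blur.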

I will repeat something I mentioned before:  "How much Missing is
there?"  If there is much of it, you will not have a satisfactory
analysis until you do quite a bit to characterize the Missing; it is
possible that the Missing could rule out any definitive conclusion
/despite/ however-large the apparent effects may be. ("Really effective
treatment of the target-effects by the end, but half the Treated group
withdrew owing to serious side effects.")
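
A first pass at "How much Missing is there?" can be a simple tally of the proportion missing per group at each time point. The Python sketch below assumes hypothetical wide records with keys group and t1..t4, with None marking a missing assessment.

```python
def missing_rate_by_group(rows, time_keys):
    """Return, per group, the proportion of subjects missing at each time point."""
    counts = {}
    for row in rows:
        entry = counts.setdefault(
            row["group"], {"n": 0, "missing": [0] * len(time_keys)}
        )
        entry["n"] += 1
        for i, key in enumerate(time_keys):
            if row[key] is None:
                entry["missing"][i] += 1
    return {
        group: [m / entry["n"] for m in entry["missing"]]
        for group, entry in counts.items()
    }


# Hypothetical data: one of two intervention subjects drops out after time 2.
time_keys = ["t1", "t2", "t3", "t4"]
rows = [
    {"group": "intervention", "t1": 22, "t2": 18, "t3": None, "t4": None},
    {"group": "intervention", "t1": 25, "t2": 21, "t3": 19, "t4": 17},
    {"group": "control", "t1": 20, "t2": 19, "t3": 21, "t4": 20},
    {"group": "control", "t1": 23, "t2": 24, "t3": 22, "t4": 23},
]

rates = missing_rate_by_group(rows, time_keys)
print(rates["intervention"])  # [0.0, 0.0, 0.5, 0.5]
print(rates["control"])       # [0.0, 0.0, 0.0, 0.0]
```

Clearly unequal rates between the groups, as in this made-up example, are a signal to look at the reasons for dropout before trusting any of the analyses discussed here.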

--
Rich Ulrich



Re: Options for Dealing with Missing Data in Mixed ANOVA

Joost van Ginkel
In reply to this post by Bruce Weaver
I don't know the specific examples, but an advantage of multiple imputation over multilevel/MIXED is that multiple imputation can also take into account background variables in handling the missing data, and these don't have to be included in the subsequent analysis. My 2014 paper also addresses this issue. By the way, should you want to do a repeated measures ANOVA on multiply imputed datasets, you end up in the MIXED procedure anyway, because the only way to pool the results of repeated measures ANOVA using my SPSS macro is to carry out the analysis in MIXED. The reason is that GLM doesn't provide covariance matrices of parameter estimates, which are necessary for pooling the results.


Re: Options for Dealing with Missing Data in Mixed ANOVA

Ryan
In reply to this post by Ecushla
While it is true that the way the data set is structured to perform a linear mixed model (LMM) allows one to use "all possible data", that is certainly not the only reason to use LMM. Moreover, linear mixed modeling can be used in conjunction with multiple imputation of missing data. 

Generally speaking, LMM has more flexibility in the specification of random and residual effects structures than general linear models. It is also worth noting that LMM estimates parameters using maximum likelihood methods. 

Ryan
