Estimation maximisation - How to preserve missing values designated as "99"

ryanf1
I have a large dataset containing longitudinal data, and I have
designated "dropouts" using the discrete missing value code "99".  I
want to use expectation maximisation to impute the other missing
values - these cells are empty at present.  However, when I run the
process, I find that all missing values have been imputed (including
those coded "99").  How can I prevent this?

Thanks,
Margo
Re: Estimation maximisation - How to preserve missing values designated as "99"

Bruce Weaver
Administrator
If I follow, one way would be to not treat 99 as missing while you are doing the imputation.  But the problem with that, I think, is that the 99s would then be used in the imputation models, and I doubt you want that.  

How about this?

1. After you have done the imputations, restructure the data file that contains the original data and the imputed data sets from LONG to WIDE (via CASESTOVARS).

2. Loop through the variables from the imputed data sets, changing the value to 99 whenever there is a 99 in the original data set.

3. Restructure from WIDE to LONG to get back to the file format you need.

If this doesn't do what you want, it might at least inspire someone to propose a better solution.  ;-)
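
For readers who want to try step 2 in syntax, here is a minimal, untested sketch. It swaps in a different mechanism from the steps above: instead of restructuring with CASESTOVARS, it uses AGGREGATE to carry a flag from the original rows to the imputed rows of the stacked file. Assumptions: the file is the stacked output of MULTIPLE IMPUTATION, where Imputation_ = 0 marks the original data; id and score are hypothetical variable names; each id has exactly one row per imputation; was99 and any99 are hypothetical helper variables.

```spss
* Assumed names: id (case identifier), score (an imputed variable).
* Imputed files are usually split by Imputation_, so switch that off first.
SPLIT FILE OFF.
* Treat 99 as an ordinary value so that it can be compared and copied.
MISSING VALUES score ().
* Flag the original rows (Imputation_ = 0) that hold the dropout code.
COMPUTE was99 = (Imputation_ = 0 AND score = 99).
* Copy the flag to every row that shares the same id.
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=id /any99 = MAX(was99).
* Put 99 back wherever the original value was 99.
IF (any99 = 1) score = 99.
* Declare 99 as user-missing again.
MISSING VALUES score (99).
EXECUTE.
* Remove the helper variables.
DELETE VARIABLES was99 any99.
```

The block handles one variable. For several variables, the COMPUTE, AGGREGATE and IF steps would be repeated for each one, with its own pair of helper variables.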



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: Estimation maximisation - How to preserve missing values designated as "99"

David Marso
Administrator
In reply to this post by Bruce Weaver
Along similar lines, but with a different mechanism.
a.  You definitely would NOT want the 99s to be used in the imputation.
Rather than restructure L->W, loop, and restructure W->L,
I suspect UPDATE is another way to go, and it is intrinsically a simpler approach.

* Create a unique ID in the file.
COMPUTE Unique_ID=$CASENUM.
* Save a duplicate of the data set.
SAVE OUTFILE='datacopy.sav'.
* This needs TLC if any variable has a legitimate 99 that is
  NOT a missing value; also adapt it for string variables etc.
* Turn off the missing-value definitions.
MISSING VALUES ALL ().
* NUKE everything except the 99s, to build the UPDATE transaction file ;-).
RECODE ALL (99=99)(ELSE=SYSMIS).
* RECODE ALL also wipes Unique_ID, so rebuild it (the case order is unchanged).
COMPUTE Unique_ID=$CASENUM.
* Save the '99 file' for the UPDATE transaction.
SAVE OUTFILE='datacopy99.sav'.
GET FILE='datacopy.sav'.
* Placeholder: put your own imputation command here.
RMV blah blah...
* Non-missing values in the transaction file (the 99s) overwrite the master file.
UPDATE FILE=* /FILE='datacopy99.sav' /BY Unique_ID.
MISSING VALUES ALL (99).
* Unique_ID is an identifier, so 99 is a valid value there.
MISSING VALUES Unique_ID ().

You can adapt this to use the newer multiple-dataset features in
recent versions (I leave that to you, as I don't have a version that supports them).
I have not tested this, as I am not on an OS with SPSS, but it should do the trick.
HTH, David

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Estimation maximisation - How to preserve missing values designated as "99"

ryanf1
Thanks, Bruce and David. Bruce, I went with your more labour-intensive method, since I do not know syntax well enough to modify David's code.
All is well now - Margaret

In reply to David Marso's message of 29 June 2011, 23:00.

--
Margaret Ryan,
School of Psychology,
Trinity College Dublin.
Tel: Office   8963083  / 8963913
Mobile:  087 2818090