I have a large dataset containing longitudinal data, and I have designated "dropouts" using the discrete missing value label "99". I want to use expectation maximisation to impute the other missing values; these cells are empty at present. However, when I run the process, I find that all missing values have been imputed (including those labelled "99"). How can I prevent this?
Thanks, Margo
Administrator
If I follow, one way would be to not treat 99 as missing while you are doing the imputation. But the problem with that, I think, is that the 99s would then be used in the imputation models, and I doubt you want that.
How about this?
1. After you have done the imputations, restructure the data file that contains the original data and the imputed data sets from LONG to WIDE (via CASESTOVARS).
2. Loop through the variables from the imputed data sets, changing the value to 99 whenever there is a 99 in the original data set.
3. Restructure from WIDE to LONG to get back to the file format you need.
If this doesn't do what you want, it might at least inspire someone to propose a better solution. ;-)
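The three steps above might be sketched in syntax roughly as follows. This is an untested sketch resting on several assumptions that are not in the original post: the stacked file comes from MULTIPLE IMPUTATION with two imputations (Imputation_ = 0 marks the original data), there is one row per case within each imputation, and id, y1, y2 and y3 are hypothetical variable names. For data that are also long by time point, the time variable would need to be added to /ID. CASESTOVARS only splits variables whose values differ across imputations, so each of y1 to y3 is assumed to have had at least one value imputed.

```spss
* Hypothetical names: id is the case identifier, y1 y2 y3 are the imputed variables.
SPLIT FILE OFF.
* Undeclare 99 so that it can be compared as an ordinary value.
MISSING VALUES y1 y2 y3 ().
SORT CASES BY id Imputation_.

* Step 1: LONG to WIDE, giving y1.0 (original), y1.1 and y1.2, and likewise for y2 and y3.
CASESTOVARS
  /ID=id
  /INDEX=Imputation_
  /GROUPBY=VARIABLE.

* Step 2: wherever the original value is 99, put 99 back into each imputed copy.
DO REPEAT orig = y1.0 y2.0 y3.0
  /imp1 = y1.1 y2.1 y3.1
  /imp2 = y1.2 y2.2 y3.2.
  IF (orig = 99) imp1 = 99.
  IF (orig = 99) imp2 = 99.
END REPEAT.

* Step 3: WIDE to LONG again.
VARSTOCASES
  /MAKE y1 FROM y1.0 y1.1 y1.2
  /MAKE y2 FROM y2.0 y2.1 y2.2
  /MAKE y3 FROM y3.0 y3.1 y3.2
  /INDEX=Imputation_
  /NULL=KEEP.
* VARSTOCASES numbers the index from 1, so shift it back to 0, 1, 2.
COMPUTE Imputation_ = Imputation_ - 1.
EXECUTE.
* Declare 99 as missing again.
MISSING VALUES y1 y2 y3 (99).
```

With more imputations, add one stand-in list and one IF line per imputation, and extend each /MAKE list to match.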
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Administrator
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Administrator
In reply to this post by Bruce Weaver
Along similar lines, but with a different approach.
You would definitely NOT want the 99s to be used in the imputation. Rather than restructuring LONG to WIDE, iterating, and restructuring WIDE to LONG again, I suspect UPDATE might be another way to go, and it is intrinsically a simpler approach.

* Create unique ID in file.
COMPUTE Unique_ID=$CASENUM.
* Create a duplicate data set.
SAVE OUTFILE='datacopy.sav'.
* This needs TLC if any variable has a legitimate 99 that is NOT a missing value.
* Also adapt for string variables etc.
* Turn off missing values.
MISSING VALUES ALL ().
* NUKE everything except the 99s from the UPDATE transaction file ;-).
RECODE ALL (99=99) (ELSE=SYSMIS).
* RECODE ALL also wipes Unique_ID, so rebuild it (the case order is unchanged).
COMPUTE Unique_ID=$CASENUM.
* Save the '99 file' for the UPDATE transaction.
SAVE OUTFILE='datacopy99.sav'.
GET FILE='datacopy.sav'.
RMV blah blah...
UPDATE FILE=* /FILE='datacopy99.sav' /BY Unique_ID.
MISSING VALUES ALL (99).

You can adapt this to use the newer multiple data set features in recent versions (I leave that to you, as I don't have a version that supports them). I have not tested this, as I am not on an OS with SPSS, but it should do the trick.
HTH, David
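The multiple-dataset adaptation mentioned above might look roughly like this. It is an untested sketch, not David's own code: master and nines are hypothetical dataset names, y1 TO y3 stands in for the real list of numeric variables, and it assumes the imputation step replaces values in the active dataset without reordering cases or dropping Unique_ID. Naming the variables on RECODE, instead of using ALL, leaves Unique_ID intact for the match.

```spss
* Hypothetical names: master, nines, and the variable list y1 TO y3.
COMPUTE Unique_ID = $CASENUM.
EXECUTE.
DATASET NAME master.

* Copy the data and keep only the 99s in the copy.
DATASET COPY nines.
DATASET ACTIVATE nines.
MISSING VALUES y1 TO y3 ().
RECODE y1 TO y3 (99=99) (ELSE=SYSMIS).
EXECUTE.

* Impute in the original data, then overlay the 99s from the copy.
DATASET ACTIVATE master.
* The imputation commands go here.
UPDATE FILE=* /FILE=nines /BY Unique_ID.
EXECUTE.
MISSING VALUES y1 TO y3 (99).
```

UPDATE leaves a value in the master file alone when the transaction file holds system-missing for it, which is why everything except the 99s is recoded to SYSMIS in the copy.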
Thanks Bruce and David. Bruce, I went with your more labour intensive method, since I do not have enough knowledge of syntax to modify David's code.
All is well now - Margaret
On 29 June 2011 23:00, David Marso [via SPSSX Discussion] wrote: In a similar way of thinking but different.
--
Margaret Ryan, School of Psychology, Trinity College Dublin. Tel: Office 8963083 / 8963913 Mobile: 087 2818090