Hello - a couple questions about MI
1) When creating a database to impute missing values, is it redundant to include subscales as well as the summary score (of the same measure), if these are just auxiliary variables to inform the imputation? I am worried about multicollinearity, but am not sure if that is really an issue with imputation.
2) What is the best way to determine how many datasets to impute? People in our lab have used Rubin's table of relative efficiencies; others have used the general rule of imputing as many datasets as the percentage of missing values (e.g., 25% missing values, impute 25 datasets).
Jana H. Chaudhuri, Ph.D.
Lecturer, Child Development Data Manager, Massachusetts Healthy Families Evaluation Tufts University 177 College Avenue Medford, MA 02155
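For readers comparing the two approaches in question 2: the entries in Rubin's table come from the relative efficiency formula RE = 1 / (1 + γ/m), where m is the number of imputations and γ is the fraction of missing information. A minimal sketch of that calculation follows. The function name and the γ and m values are chosen here for illustration only, and γ is not the same quantity as the percentage of missing values, so treating 25% missing as γ = 0.25 is an assumption rather than a rule.

```python
def relative_efficiency(gamma: float, m: int) -> float:
    """Rubin's relative efficiency (variance scale) of m imputations
    compared with an infinite number of imputations.

    gamma: fraction of missing information, between 0 and 1.
    m:     number of imputed datasets, at least 1.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be between 0 and 1")
    return 1.0 / (1.0 + gamma / m)


# Illustrative values only (not taken from the thread).
gammas = (0.1, 0.25, 0.5, 0.9)
ms = (3, 5, 10, 25, 40)

# Print a small table: one row per gamma, one column per m.
print("gamma " + " ".join(f"m={m:<5d}" for m in ms))
for gamma in gammas:
    cells = " ".join(f"{relative_efficiency(gamma, m):<7.3f}" for m in ms)
    print(f"{gamma:<5.2f} {cells}")
```

Because efficiency is already high at small m by this measure, the table alone tends to suggest few imputations. It says nothing about statistical power, which is a separate criterion.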
1) Remember that MI involves a regression equation estimation step, so multicollinearity can occur. Whether MI programs/routines will test for it and issue warning messages, I don't know.
2) I haven't ever seen the general rule you refer to. Since you are at a university, do a search on John Graham or, maybe, Todd Little or Scott Maxwell. One of them (I think it was John) did a study of the number of imputations and power (and the answer was not 5) that appeared in either Psych Methods or SEM.
Gene Maguin
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of jchaudhu
Sent: Thursday, October 13, 2011 2:10 PM
To: [hidden email]
Subject: Multiple Imputation model and questions
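To make the multicollinearity point in 1) concrete, here is a small sketch under a stated assumption: the summary score is the exact sum of its subscales. In that case the summary column adds no new information to the imputation model's predictor matrix, so the matrix is rank deficient. The data, the three-subscale layout, and the helper name `matrix_rank` are all hypothetical and exist only for this illustration; this is not what any particular MI routine does internally.

```python
from fractions import Fraction
import random


def matrix_rank(rows):
    """Exact rank via Gaussian elimination on Fractions (no rounding error)."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column depends on earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


random.seed(0)
# Hypothetical data: 50 respondents, three integer subscale scores each.
subscales = [[random.randint(0, 20) for _ in range(3)] for _ in range(50)]

# Predictor matrices with an intercept column, with and without the total.
without_total = [[1] + s for s in subscales]
with_total = [[1] + s + [sum(s)] for s in subscales]

rank_without_total = matrix_rank(without_total)
rank_with_total = matrix_rank(with_total)

print("columns without total:", len(without_total[0]), "rank:", rank_without_total)
print("columns with total:   ", len(with_total[0]), "rank:", rank_with_total)
```

Adding the total increases the column count but not the rank. If the summary score is not an exact function of the subscales (for example, a weighted or rescaled score, or one computed from a different set of items), the dependence is approximate rather than exact, and whether it causes trouble depends on the software.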
Thank you for your response. Another person I asked has referred me to this article regarding number of datasets:
Graham, Olchowski, and Gilreath 2007
On Thu, Oct 13, 2011 at 2:40 PM, Gene Maguin [via SPSSX Discussion] <[hidden email]> wrote:
1) Remember that MI involves a regression equation estimation step. So,