Re: Multiple Imputation

Posted by Art Kendall on
URL: http://spssx-discussion.165.s1.nabble.com/Multiple-Imputation-tp4994372p5723127.html

Your items are very coarse, i.e., they are dichotomies.  A mean or a total would have more possible values. A total, for example, would range from zero to the number of items on the scale.
If you want to try imputation, use only items that are in the same scoring key group as contributors.  That way you preserve discriminative validity.  You certainly want to avoid using IVs as contributors to DVs and vice versa.

You can find out whether or not there is any bias in your estimates by getting scores different ways and comparing the results. Just include the items and scores from the different approaches in the data set.  Look at the difference of each imputed value from the mean value.  Look at the differences in the means, regression coefficients, correlations, etc.
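
As a rough illustration of getting scores different ways and comparing them, here is a minimal Python sketch (not SPSS syntax). The data, the function names, and the `min_answered` cutoff are all made up for the example, and the item-mean fill is simple single imputation, not MI. It only shows the idea of keeping the scores from each approach side by side for every case.

```python
# Hypothetical data: rows are respondents, columns are 0/1 items from ONE
# scoring key group; None marks a missing response.
responses = [
    [1, 0, 1, 1],
    [1, None, 1, 0],
    [0, 0, None, None],
    [1, 1, 1, 1],
]

def listwise_total(row):
    """Total score, or None when any item is missing."""
    return None if any(v is None for v in row) else sum(row)

def prorated_total(row, min_answered=2):
    """Mean of the answered items, scaled up to the full item count."""
    answered = [v for v in row if v is not None]
    if len(answered) < min_answered:  # assumed cutoff, purely illustrative
        return None
    return len(row) * sum(answered) / len(answered)

def item_mean_total(row, item_means):
    """Total after replacing each missing item with that item's mean."""
    return sum(m if v is None else v for v, m in zip(row, item_means))

# Item means computed over the respondents who answered each item.
item_means = []
for col in zip(*responses):
    present = [v for v in col if v is not None]
    item_means.append(sum(present) / len(present))

# One record per respondent holding the score from every approach.
scores = [
    {
        "listwise": listwise_total(r),
        "prorated": prorated_total(r),
        "item_mean": item_mean_total(r, item_means),
    }
    for r in responses
]

for case, s in enumerate(scores):
    print(case, s)
```

With the scores side by side you can look at how far each approach moves individual cases, and then carry each version into the same means, correlations, and regressions.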

You are probably thinking that it is tedious to try the different approaches as a learning exercise.  I guess that is mostly because you are envisioning going back to the menus for each variation on the method of handling missing data and on which set of items to use (original vs. with imputed values).  That would be tedious, and it would also be very poor analysis handling.

If you are so far along in SPSS that you are considering multiple imputation then you should be using the menus to draft syntax but exiting them via <paste>.  Then you should look at the possibilities for any procedure by looking at the <help> for the syntax.  Then you can develop your syntax by copying and pasting blocks of syntax. E.g., run FACTOR with one method of handling missing data and the original items.  Copy and paste that block of syntax in the syntax window. Change the specification on how missing data are handled. Copy it again and substitute the variable list that includes the items that had imputed values.  As you learn about your data and about analysis you will want to redraft your syntax.
Also, in many disciplines there is an ethical obligation to make your data and syntax available to those who request them.
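
The copy-and-vary idea above (same analysis, only the missing-data specification changes) can be sketched outside SPSS as well. The following is a minimal Python illustration, not SPSS syntax. It builds the correlation matrix that a factor analysis would start from under three rules that I believe correspond to the LISTWISE, PAIRWISE, and MEANSUB options of FACTOR; check the syntax <help> for the exact keywords. The data and the names `pearson` and `corr_matrix` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def corr_matrix(rows, missing="listwise"):
    """Correlation matrix under one of three missing-data rules."""
    k = len(rows[0])
    if missing == "listwise":
        # Drop every case that has any missing item.
        rows = [r for r in rows if None not in r]
    elif missing == "meansub":
        # Replace each missing value with the mean of its item.
        means = []
        for col in zip(*rows):
            present = [v for v in col if v is not None]
            means.append(sum(present) / len(present))
        rows = [[m if v is None else v for v, m in zip(r, means)] for r in rows]
    elif missing != "pairwise":
        raise ValueError(f"unknown missing rule: {missing}")
    matrix = [[1.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            # Pairwise rule: use the cases that have both items of this pair.
            # After listwise or meansub, no values are missing any more.
            pairs = [(r[i], r[j]) for r in rows
                     if r[i] is not None and r[j] is not None]
            r_ij = pearson([p[0] for p in pairs], [p[1] for p in pairs])
            matrix[i][j] = matrix[j][i] = r_ij
    return matrix

# Hypothetical item responses; None marks a missing value.
data = [
    [1, 2, 3],
    [2, 4, None],
    [3, 5, 4],
    [4, 4, 6],
    [5, None, 7],
    [2, 1, 2],
]

# Same analysis, three specifications, results kept together for comparison.
results = {rule: corr_matrix(data, rule)
           for rule in ("listwise", "pairwise", "meansub")}
for rule, m in results.items():
    print(rule, [round(m[0][1], 3), round(m[0][2], 3), round(m[1][2], 3)])
```

Keeping the variations in one script, like keeping the pasted blocks in one syntax file, means the whole comparison can be rerun and handed to anyone who asks for it.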

Any method of deriving scores (operationalizing a construct) will include some uncertainty (i.e., bias + noise). The purpose of trying different ways of handling missing data is to get information you can use to see whether you are (a) "straining at gnats", "polishing pig iron", or "trying to make a silk purse from a sow's ear", or (b) making meaningful improvements that result in making the narrative you are putting together more complex.

If the situation is (a), then you can stay with the old-fashioned approaches and just add a statement to your write-up that you tried other ways of forming the scores, including MI, but they did not make meaningful differences.  This strengthens your conclusions.
If the situation is (b), where variations in dealing with missing data made substantive differences, then you have a longer write-up and weaker conclusions.

In today's parlance we would say not to "put too fine a point on things" or "use too many sig figs".
Aristotle in his Nicomachean Ethics put the idea well: "for it is the mark of an educated man to look for precision in each class of things just so far as the nature of the subject admits".
Art Kendall
Social Research Consultants
On 11/18/2013 7:22 AM, therp [via SPSSX Discussion] wrote:
Big thank you to both of you Art and Rich. I'm implementing your
advice/ideas right now.

I have one question left though:

If my items are not categorical, meaning I have (unfortunately only) one
interval and am measuring a continuous construct (prejudice), can I use
Expectation Maximization, which is supposedly less biased than mean
substitution? In SPSS, I could drag my items into the window for
"continuous" variables and get estimates that look fine (of course
separately for the 39 items, the numerical count and the Likert scale
item).
Also, in that way I could compute Little's MCAR test.

What do you think? I would like to use a less biased imputation method.
I understand that MI is the best, but I don't feel comfortable with it
yet (running many many analyses with many data sets until they can be
finally pooled in the last step, the regression).



