Re: Multiple Imputation
Posted by
Art Kendall on
URL: http://spssx-discussion.165.s1.nabble.com/Multiple-Imputation-tp4994372p5723127.html
Your items are very
coarse, i.e., they are dichotomies. A mean or a total would
have more possible values. The possible values of a total would
range from zero to the number of items on the scale.
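To make the point about coarseness concrete, here is a minimal Python sketch (not SPSS syntax). It assumes the items are scored 0/1; the function name and the item counts are made up for illustration.

```python
from itertools import product

def possible_totals(n_items):
    """Return the sorted set of totals reachable from n_items items scored 0/1."""
    return sorted({sum(pattern) for pattern in product((0, 1), repeat=n_items)})

# A single dichotomous item can take only two values ...
print(possible_totals(1))
# ... while a total over five such items can take six (0 through 5).
print(possible_totals(5))
```

A scale total over k dichotomous items has k + 1 possible values, so it is far less coarse than any single item.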
If you want to try imputation, use only items that are in
the same scoring key group as contributors. That way you
preserve discriminative validity. You certainly want to avoid
using IVs as contributors to DVs and vice versa.
You can find out whether or not there is any bias in your
estimates by getting scores different ways and comparing the
results. Just include the items and scores from the different
approaches in the data set. Look at the difference of each
imputed value from the mean value. Look at the difference in
the means, regression coefficients, correlations, etc.
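As a rough illustration of getting scores different ways and comparing them, here is a small Python sketch. The responses are invented, and the two scoring rules (mean of the answered items vs. item-mean substitution) are only assumed examples of "different approaches", not a recommendation.

```python
from statistics import mean

# Hypothetical data: rows are respondents, columns are 0/1 items,
# None marks a missing answer. All values are made up.
responses = [
    [1, 0, 1, 1],
    [1, None, 1, 0],
    [0, 0, None, 0],
    [1, 1, 1, None],
]

def score_available(row):
    """Scale score as the mean of the items the respondent answered."""
    return mean(v for v in row if v is not None)

def item_means(rows):
    """Mean of each item (column) over the respondents who answered it."""
    return [mean(v for v in col if v is not None) for col in zip(*rows)]

def score_item_mean_substitution(row, col_means):
    """Scale score after replacing each missing item with that item's mean."""
    return mean(m if v is None else v for v, m in zip(row, col_means))

col_means = item_means(responses)
scores_a = [score_available(r) for r in responses]
scores_b = [score_item_mean_substitution(r, col_means) for r in responses]

# Keep both sets of scores side by side and look at the differences.
diffs = [b - a for a, b in zip(scores_a, scores_b)]
for a, b, d in zip(scores_a, scores_b, diffs):
    print(f"available={a:.3f}  substituted={b:.3f}  diff={d:+.3f}")
print("difference in means:", mean(scores_b) - mean(scores_a))
```

Respondents with complete data get identical scores under both rules, so any differences come only from the cases with missing items.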
You are probably thinking that it is tedious to try the
different approaches as a learning exercise. I guess that is
mostly because you are envisioning going back to the menus for
variations on the method of handling missing data and on which set
of items (original vs. with imputed values). That would be
tedious, and it would also be very poor analysis practice.
If you are so far along in SPSS that you are considering
multiple imputation, then you should be using the menus to draft
syntax but exiting them via <paste>. Then you should look
at the possibilities for any procedure by looking at the
<help> for the syntax. Then you can develop your syntax by
copying and pasting blocks of syntax, e.g., run FACTOR with one
method of handling missing data and the original items. Copy
and paste that block of syntax in the syntax window. Change the
specification on how missing data is handled. Copy it again and
substitute the variable list that includes the items that had
imputed values. As you learn about your data and about analysis
you will want to redraft your syntax.
Also, in many disciplines there is an ethical obligation to make
your data and syntax available to those who request it.
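The same copy-and-vary idea can be sketched outside SPSS. The Python below is only an analogy, not SPSS syntax: it runs one analysis (a Pearson correlation) under two ways of handling missing data. The scores, the variable names, and the choice of listwise deletion vs. mean substitution are all assumptions made for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def listwise(xs, ys):
    """Drop every case that is missing on either variable."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def mean_substitute(xs, ys):
    """Replace each missing value with the mean of the observed values."""
    def fill(vs):
        observed = [v for v in vs if v is not None]
        m = sum(observed) / len(observed)
        return [m if v is None else v for v in vs]
    return fill(xs), fill(ys)

# Hypothetical scores; None marks a missing value.
score_a = [2, 4, None, 8, 10]
score_b = [1, None, 3, 4, 5]

# One block of "analysis", repeated with a different missing-data rule.
handlers = {"listwise": listwise, "mean substitution": mean_substitute}
results = {name: pearson(*handler(score_a, score_b))
           for name, handler in handlers.items()}
for name, r in results.items():
    print(f"{name}: r = {r:.3f}")
```

Because the whole comparison lives in one script, it can be rerun, redrafted, and handed to anyone who asks for it.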
Any method of deriving
scores (operationalizing a construct) will include some
uncertainty (i.e., bias + noise). The purpose of trying
different ways of handling missing data is to get
information you can use to see whether you are (a) "straining at
gnats", "polishing pig iron" or "trying to make a silk purse
from a sow's ear" or (b) making meaningful improvements that
make the narrative you are putting together more complex.
If the situation is (a), you can stay with the old-fashioned approaches
and just add a statement to your write-up that you tried other
ways of forming the scores, including MI, but they did not make
meaningful differences. This strengthens your conclusions.
If the situation is (b), i.e., variations in dealing with missing data
made substantive differences, then you have a longer write-up
and weaker conclusions.
In today's parlance we would say not to "put too fine a point on things" or
"use too many sig figs".
Aristotle in his Nicomachean Ethics put the idea well, "for
it is the mark of an educated man to look for precision in each
class of things just so far as the nature of the subject admits".
Art Kendall
Social Research Consultants
On 11/18/2013 7:22 AM, therp [via SPSSX Discussion] wrote:
Big thank you to both of you, Art and Rich. I'm implementing your
advice/ideas right now.
I have one question left though:
If my items are not categorical, meaning I have (unfortunately only) one
interval and am measuring a continuous construct (prejudice), can I use
Expectation Maximization, which is supposedly less biased than mean
substitution? In SPSS, I could drag my items into the window for
"continuous" variables and get estimates that look fine (of course
separately for the 39 items, the numerical count and the Likert scale
item).
Also in that way, I could compute Little's MCAR test.
What do you think? I would like to use a less biased imputation method.
I understand that MI is the best, but I don't feel comfortable with it
yet (running many, many analyses with many data sets until they can be
finally pooled in the last step, the regression).