SPSSX Discussion

Some question on the (implementation of the) Multiple Imputation Technique


Hoogendoorn, Adriaan

Some question on the (implementation of the) Multiple Imputation Technique

Dear SPSS Listserv,

A group of people at this institute are trying to get a grip on the
Multiple Imputation technique and its implementation in SPSS 17 (PASW
17.0.2). Our first impression is that SPSS did a good job in implementing
this technique. We are trying to understand why the amount of information
(parameter estimates, std errors, etc.) of the pooled estimators differs
from the amount of information for the original and separate imputed data
sets. We have a few questions to understand the technique and its
implementation better.

1. ONEWAY
No results are shown for the pooled data set. To me it seems that an
F-statistic is 'just' a multivariate version of a t-statistic and therefore
(obviously) more complicated. My question is: is it just 'seriously more
complicated' to pool F-statistics (so we may expect it in a later version),
or is it theoretically impossible?

2. Linear regression (OLS).
In linear regression the Model Summary statistics and the ANOVA tables are
unavailable for the pooled dataset - possibly for the same reason as stated
in the previous paragraph. In the Parameter Estimates table no pooled
estimates for the Standardized Coefficients (what SPSS calls beta) are
given. Is it theoretically impossible to provide this information, or did
SPSS forget to implement it?

3. Binary Logistic regression
We asked for confidence intervals for the odds ratios (Options - CI for
exp(B)), but found that this info was only shown for the original data and
the separate imputed data sets - not for the Pooled data set. This does seem
to be an omission by SPSS, since the Multinomial Logistic regression applied
to the dichotomous dependent variable does provide this information.

Any help is greatly appreciated.

Kind regards,

Adriaan Hoogendoorn
GGZ inGeest, Amsterdam

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

SPSS Support

Re: Some question on the (implementation of the) Multiple Imputation Technique

Dear Adriaan,

You may want to look at the pooling algorithms for MI data, available via Help>Algorithms>Multiple Imputation: Pooling Algorithms, to see how we're doing things. We use Rubin's rules to pool point estimates and their variances, and we compute test statistics from those pooled results according to Rubin's methods. We do not pool test statistics or p values directly, so no t statistics are pooled either. This is why no pooled results are presented for ANOVA tables or summary model tests in any procedure. In many places only naive pooling is done, which gives a single pooled value with no error estimate. In other places, such as regression coefficients, univariate pooling is done, which does produce inferential results.
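
To make the univariate pooling step concrete, here is a minimal sketch of Rubin's rules for a single parameter, written in Python rather than SPSS syntax. It is not SPSS's code: the function name pool_rubin and the example numbers are made up for illustration, and the degrees of freedom use Rubin's original large-sample formula, which may differ from what a given SPSS version reports.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    Hypothetical helper for illustration; not part of SPSS or any library.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from at least 2 imputations")

    q_bar = mean(estimates)                      # pooled point estimate
    u_bar = mean(se ** 2 for se in std_errors)   # within-imputation variance
    b = variance(estimates)                      # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b              # total variance of the pooled estimate
    se = math.sqrt(total)

    # Rubin's large-sample degrees of freedom; infinite if the estimates do not vary.
    if b == 0:
        df = math.inf
    else:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2

    # The test statistic is computed from the pooled results, not pooled itself.
    return {"estimate": q_bar, "se": se, "t": q_bar / se, "df": df}


# Made-up coefficients and standard errors from five imputed data sets.
result = pool_rubin([0.50, 0.60, 0.55, 0.45, 0.65], [0.10] * 5)
print(result)
```

Note that the t statistic comes out of the pooled estimate and pooled standard error. An F statistic from an ANOVA table has no single point estimate and variance to feed into this calculation, which is why it is not covered by this kind of pooling.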

In the Help system, if you look on the Contents tab, you can find Missing Values Option>Multiple Imputation>Analyzing Multiple Imputation Data. On that Help page, there is a link to open an item labeled Levels of Pooling, which discusses the two levels of pooling, and another link to open an item labeled Procedures That Support Pooling, which gives a list of which tables and statistics in which procedures support pooling and at what levels of combination.

1. In ONEWAY, you can get pooled contrast results, based on the point estimates and their standard errors (univariate pooling), and naive pooling of group means and standard errors in the Descriptives table.

2. Yes, for the same reason, there are no pooled results for the ANOVA table or Model Summary results. For the standardized coefficients, by default there are no standard errors available, but you can ask for them. In either case, no pooled results are available. I'm going to have to look into this further to see whether we had a theoretical concern about it, or just overlooked it. We were supposed to present naive pooling of correlations (zero order, part and partial), and these are not included, so I'll file a bug report on that, and check further into the standardized coefficient status.

3. This seems to be just a bug. This should have been included. Our apologies. I'll file a report.
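
Until that is fixed, the interval for item 3 can be worked out by hand from the Pooled row, since a confidence interval for exp(B) is conventionally the exponentiated interval for B. The sketch below is in Python, with a hypothetical function name and made-up input values. As an assumption, it defaults to a normal critical value, which is only a large-df approximation. For a closer match to pooled output you would pass the t quantile for the pooled degrees of freedom as crit.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(pooled_b, pooled_se, level=0.95, crit=None):
    """Return (odds ratio, lower, upper) from a pooled logit coefficient.

    Hypothetical helper for illustration; not SPSS's implementation.
    """
    if crit is None:
        # Assumption: normal approximation, reasonable only when the pooled df is large.
        crit = NormalDist().inv_cdf(0.5 + level / 2)

    # Build the interval on the log-odds scale, then exponentiate the endpoints.
    lower = pooled_b - crit * pooled_se
    upper = pooled_b + crit * pooled_se
    return math.exp(pooled_b), math.exp(lower), math.exp(upper)


# Made-up pooled coefficient and standard error.
print(odds_ratio_ci(0.55, 0.13))
```

The interval is asymmetric around the odds ratio because it is symmetric on the log-odds scale, where the pooling is done.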

David Nichols

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Adriaan Hoogendoorn
Sent: Wednesday, May 27, 2009 7:57 AM
To: [hidden email]
Subject: [SPSSX-L] Some question on the (implementation of the) Multiple Imputation Technique

