Dear All: I just ran a dependent t-test on a multiple imputation data set and obtained some strange results (so I think). The original data results for the dependent t-test have a p value of .264 (n=369). The multiple imputation results for the p value are:

1. .009 (n=496)
2. .009 (n=490)
3. .005 (n=496)
4. .013 (n=494)
5. .014 (n=494)
Pooled: .010 (n=11066)

Is this reasonable? Can it be attributed to the increase in sample size? But why would the sample size (in each of the five imputed data sets) increase so much from the original data set? What am I missing? Thoughts?

Thanks,
Martin F. Sherman, Ph.D.
Professor of Psychology
Director of Master's Education: Thesis Track
Department of Psychology, 222 B Beatty Hall
4501 North Charles Street, Baltimore, MD 21210
410-617-2417 tel / 410-617-5341 fax
You show five results. Can we assume only five imputations? If so, why is the pooled N = 11066? Why aren't the imputation Ns the same? How similar were the means, SDs, and the correlation for each imputation to each other and to the original dataset? This set of results seems odd to me and would encourage me to dig into the similarity of the values for the dependent t-test components.

Gene Maguin
Gene: Yes, I noticed that too. Let me contact my graduate student (who ran the MI) and see what is going on. Those Ns should be the same. The means and SDs for the MI data sets are very similar to each other but quite different from the original data set. The SEs get smaller for the MI results, given the increase in sample size.
In reply to this post by msherman
I have no experience doing imputation, but I have spent much time considering the "paired t test" model with extra data at Pre and/or Post.

If it is Pre-Post, there is often an important reason for absence at Post: failure or dropout. Look into this by comparing the paired Pre cases to the unpaired ones, since Missing-not-at-random probably undermines imputation. Do the same for Post, if there are enough cases. If it is some other pairing, like Left-Right, the relative number of missing cases may deserve comment. If unequal, did you expect that?

Consider what you have for testing your hypothesis without imputation:
- a paired t-test where data are complete: means, SDs, and correlation;
- an unpaired t-test between the other scores: means and SDs.

Either these are consistent (means and tests), or they are not. If not, ask why not.

Now, if you want to look further into the impact of imputation, you have three groups of cases that you might compare on the variables used in the imputation: Missing at Pre, Missing at Post, None Missing. If these groups differ on the imputing variables, then imputation will tend to create differences in the paired t, to whatever extent the imputation is stronger than mean-replacement.

--
Rich Ulrich
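Rich's first check — comparing the cases paired at both occasions to those missing at Post — could be sketched in SPSS syntax roughly as follows. This is a minimal sketch, not from the thread; PRE and POST are hypothetical names for the two paired variables.

```spss
* Flag cases that are missing at Post, then compare their Pre scores
* to the completers' Pre scores with an independent-groups t-test.
* PRE and POST are placeholder variable names.
COMPUTE post_missing = MISSING(post).
EXECUTE.
T-TEST GROUPS=post_missing(0 1)
  /VARIABLES=pre.
```

A clear difference here would be a warning sign that the missingness is not random, which is exactly the situation Rich notes can undermine imputation.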
In reply to this post by msherman
I suspect that what happened is that the multiply imputed dataset was analyzed with the splits turned off, so it appeared that you had five times as much data as you actually have. The MI procedure produces one big dataset split by the imputation number (with a variable named Imputation_ that defines the split). Split file by that variable must be turned on — which it is automatically after the imputation — in order to get any valid results.
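Restoring the split that Jon describes could be sketched in SPSS syntax as below. This is a minimal sketch, assuming the stacked MI dataset is active; PRE and POST are hypothetical names for the paired variables.

```spss
* Turn split-file processing by imputation back on,
* then rerun the paired t-test so each imputation is
* analyzed separately and the results can be pooled.
* PRE and POST are placeholder variable names.
SORT CASES BY Imputation_.
SPLIT FILE LAYERED BY Imputation_.
T-TEST PAIRS=pre WITH post (PAIRED).
```

With the split off, the t-test runs once on all five stacked imputations as if they were one sample, which is why the apparent n balloons and the p value shrinks.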
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

=====================
To manage your subscription to SPSSX-L, send a message to LISTSERV@... (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.