Re: SPSS vs. R vs. others for multiple imputation


Re: SPSS vs. R vs. others for multiple imputation

Poes, Matthew Joseph
It's been so long since I last looked into this that I'm not sure what has changed in this regard.  In the past, SPSS had a problem in how it implemented MI, because its algorithm creates estimates for Z prime and Sigma prime.  Allison had argued that a better approach, as incorporated in NORM, fixed this with a Bayesian approach.  I forget the specifics now, but as I understood it, some were cautious enough to suggest not using SPSS, feeling its MI implementation was wrong.  I know some of these same people also did not like the SAS implementation, and preferred to use many of the other programs, including the common freeware R packages made available by some of the authors.  I know that many of the concerns were confirmed in simulations, so maybe they were legitimate.  We obviously can never know with actual data, since the data is missing and, if MI is used, assumed to be missing at random.  If SPSS updated their MI program, then I see no reason not to consider it.  As I understand it, many of the other programs have arguably better and more complete diagnostics.  I have used SPSS and SAS, and recently R, as well as various standalone packages made available free from the Penn State group, and often don't find enough differences to indicate a bias in my own analysis.  Usually when I see clear bias, there is also evidence for NMAR, and then no MI is suitable.

In terms of ease of use, there is no denying (IMO) that SPSS is among the easiest packages to learn and use, and its MI package is congruent with that.  So much so that I think many people use it incorrectly (as has been evident from some recent disturbing questions posted about aggregating the MI files).  As long as people remember that MI is an entire change in the analysis process, and not just a missing data augmentation tool, the packages should all work pretty well.  It's worth noting that with certain kinds of data, such as very large-scale surveys, it's not uncommon for hot decking to produce as little or less bias than an MI approach.  Even leaving the data missing and adding a missing data covariate has been shown at times to offer as little or less bias than MI under certain circumstances.
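
To illustrate the point about aggregating: the usual MI workflow is to run the same analysis separately in each imputed dataset and then pool the per-dataset results, commonly with Rubin's rules, rather than averaging or stacking the imputed files themselves. Below is a minimal Python sketch of that pooling step for a single coefficient. The function name pool_rubin and the example numbers are hypothetical, made up for illustration; this is not how SPSS, R, or any other package implements it internally.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates  -- the point estimate from each imputed dataset
    std_errors -- the matching standard error from each imputed dataset
    Returns (pooled_estimate, pooled_std_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two estimates with matching standard errors")

    pooled_estimate = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the estimates (m - 1 denominator).
    between = variance(estimates)
    # Total variance adds the between-imputation part, inflated by (1 + 1/m).
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Hypothetical results for one regression coefficient from five imputed datasets.
coefs = [0.50, 0.55, 0.45, 0.52, 0.48]
ses = [0.10, 0.10, 0.10, 0.10, 0.10]
pooled_coef, pooled_se = pool_rubin(coefs, ses)
print(pooled_coef, pooled_se)
```

In this sketch the pooled standard error comes out larger than any single-dataset standard error, because it also carries the uncertainty due to the missing data. That extra uncertainty is what gets lost when the imputed files are simply merged or averaged before analysis.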

My advice is not to put too much time and energy into choosing an MI package.  I happen to think SPSS is a great package, but R is free.  Why not get both?  If you only need MI ability, SPSS is a waste of money, and MI shouldn't be the reason someone buys the package.  To me, buying SPSS simply to do MI would be akin to buying a Bentley because it has nice seats.  If you just want a comfortable chair, there are cheaper, better ways to get it.  If you need a hand-built British car, then SPSS might be a good bet.

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Adam
Sent: Wednesday, May 16, 2012 1:58 PM
To: [hidden email]
Subject: SPSS vs R vs others for multiple imputation

Hello,

I am helping my company buy something that we can use for multiple imputation. SPSS is easier, but R is more powerful. Can anyone expand the pros and cons of using SPSS and R for multiple imputation, as well as any other statistical package that can do the job well? We want something that can do a lot but can be learned quickly. Thanks!

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


Re: SPSS vs. R vs. others for multiple imputation

Bruce Weaver
Administrator
Before SPSS had multiple imputation, I once or twice used Stata for MI.  It has a procedure for MICE -- Multiple Imputation using Chained Equations.  IIRC, Patrick Royston contributed the procedure (the user-written ice command).  As I recall, it was fairly straightforward.  For more info search on <multiple imputation stata mice>.

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).