Hello,
I have a data set that has some categorical variables (both binary outcome variables and variables with more than two categories) and some continuous variables. I can use SPSS to impute missing values for the continuous variables with the EM algorithm. But how do I impute missing values for both types of categorical variables? Is there a macro for doing that? Note that I will use the completed data set for a factor analysis, so multiple imputation may create problems when combining the results from each imputed data set. What is your suggestion in this context? Someone has suggested hot-deck imputation instead. But how do I do this in SPSS? Thank you.
Blain,
I'm not familiar with how imputation works in SPSS. I assume that people working on imputation have written on the problem of categorical variables. I'm not familiar with the literature; perhaps others who are will comment and give citations. Doing it is not too hard, but you do have to know some serious syntax. Look at Ray Levesque's web site, spsstools.net; at the bottom of the page, under missing data, he has a hot deck routine. It may or may not need to be adapted to your situation.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Blain Waan
Sent: Friday, July 27, 2012 7:39 AM
To: [hidden email]
Subject: Imputation of categorical missing values

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Imputation-of-categorical-missing-values-tp5714496.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
In reply to this post by Blain Waan
Are you using the EM algorithm through
the MVA command? If so, the command syntax documentation (http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/topic/com.ibm.spss.statistics.help/syn_mva.htm)
explains how to specify categorical variables to be imputed; it's a /CATEGORICAL
subcommand with the list of categorical variables.
Alex
In reply to this post by Blain Waan
I know this post is about two years old now. But I have some experience with PMM (predictive mean matching), and for those who have both categorical/binary and continuous data, I would never recommend the multiple regression method. Normally, you should go to Multiple Imputation -> Impute Missing Data Values -> Custom (MCMC) and then select PMM. It is like hot deck imputation in that it uses real values from your data. The difference is that if you use the regression method, you may get values like 1.35547226 or 2.38446341, even though SPSS will round them to 1 and 2, respectively, because your categorical variables obviously do not normally contain any decimals. But when you plot a histogram, you see that it looks ugly. If your variable can only take on the values 0, 1, 2, 3, 4, and 5, PMM is excellent because it gives you exactly those values, according to the matching.

One caveat with both PMM and hot deck concerns small data sets: it is difficult to do the matching (or to find a "donor", if you're familiar with the literature) when you have few cases. See this article for further discussion: A Review of Hot Deck Imputation for Survey Non-response, Rebecca R. Andridge and Roderick J. A. Little <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3130338/>.

Now that I am at it, are there any expert users here?

I want to work with hot deck imputation. SPSS doesn't normally have it, but Teresa Myers, in the article Goodbye, Listwise Deletion: Presenting Hot Deck Imputation as an Easy and Effective Tool for Handling Missing Data <http://www.afhayes.com/public/hotdeck.pdf>, presents hot deck imputation and, at the end of the paper (i.e., the Appendix), gives the macro for creating the hot deck command in SPSS.

It says "Execute the command set below in an SPSS syntax window exactly as is", and yet I can't get it to work (I use SPSS 20). After executing the syntax, nothing happens. (Note that the entire syntax spans two pages.)

Any ideas? Thanks.
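To make the matching idea in the post above concrete, here is a minimal Python sketch of predictive mean matching with a single predictor. It is an illustration only, not what SPSS runs internally: the function name `pmm_impute`, the toy data, and the choice of k = 3 donors are all assumptions, and a full PMM implementation would also draw the regression parameters at random for each imputation. The point it shows is that every imputed value is copied from an observed case, so only values that already occur in the data can appear.

```python
import random

def pmm_impute(x, y, k=3, seed=0):
    """Simplified predictive mean matching for one predictor x.

    Missing y values are marked with None. Each one is replaced by the
    observed y of a donor drawn at random from the k complete cases whose
    predicted values are closest to the prediction for the missing case.
    """
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)

    # Ordinary least squares fit on the complete cases only.
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(xi):
        return intercept + slope * xi

    # Pair each donor's predicted value with its real observed value.
    donors = [(predict(xi), yi) for xi, yi in observed]

    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            target = predict(xi)
            nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
            yi = rng.choice(nearest)[1]  # copy a real value, never 1.355...
        filled.append(yi)
    return filled

# Hypothetical toy data: y is a 0-5 item with two missing answers.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 2.5, 5.5]
y = [0, 1, 2, 3, 4, 5, None, None]
filled = pmm_impute(x, y)
print(filled)
```

With few complete cases the k nearest donors can be far from the missing case, which is the small-sample caveat mentioned above.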
In reply to this post by Blain Waan
I have successfully used her macro many times. After executing the macro, are you providing a HOTDECK command? For example:

HOTDECK y = variables with missing data /deck = variables defining the decks .

On Thu, 23 Jan 2014 14:06:06 -0800, M.H. <[hidden email]> wrote:
>It is written "Execute the command set below in an SPSS syntax window
>exactly as is". And yet I can't get it worked (I use spss 20). After
>executing the syntax, nothing happens. (note there is two pages for the
>entire syntax).
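For readers unsure what "variables defining the decks" means, the following Python sketch shows the general hot deck idea under stated assumptions. It is not the Myers macro and makes no claim about how that macro picks donors; the function name `hot_deck_impute`, the variable names, and the toy data are hypothetical. Cases are grouped into decks by their values on the deck variables, and each missing value is filled with the observed value of a randomly chosen donor from the same deck.

```python
import random
from collections import defaultdict

def hot_deck_impute(rows, target, deck_vars, seed=0):
    """Fill missing `target` values (None) with the value of a randomly
    chosen complete case from the same deck, where a deck is the set of
    cases sharing the same values on all `deck_vars`."""
    rng = random.Random(seed)

    # Collect the observed target values available in each deck.
    donors = defaultdict(list)
    for row in rows:
        if row[target] is not None:
            donors[tuple(row[v] for v in deck_vars)].append(row[target])

    imputed = []
    for row in rows:
        new_row = dict(row)  # leave the input rows unchanged
        if new_row[target] is None:
            pool = donors.get(tuple(row[v] for v in deck_vars))
            if pool:  # a deck with no donor leaves the value missing
                new_row[target] = rng.choice(pool)
        imputed.append(new_row)
    return imputed

# Hypothetical toy data: impute `smoker` within decks of sex by age group.
rows = [
    {"sex": "f", "agegrp": 1, "smoker": 0},
    {"sex": "f", "agegrp": 1, "smoker": None},
    {"sex": "m", "agegrp": 2, "smoker": 1},
    {"sex": "m", "agegrp": 2, "smoker": None},
    {"sex": "m", "agegrp": 3, "smoker": None},
]
result = hot_deck_impute(rows, "smoker", ["sex", "agegrp"])
print([r["smoker"] for r in result])
```

The last case stays missing because its deck contains no complete case. More deck variables give closer matches but smaller decks, so this happens more often in small data sets.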
In reply to this post by M.H.
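Maybe you should read up on how to RUN macros and follow the instructions in the resources! Executing the syntax from the appendix only defines the macro; nothing happens until you then call it:
HOTDECK y = variables with missing data /deck = variables defining the decks.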
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |