Hi,
I am performing a hierarchical clustering followed by a k-means cluster procedure. The 10 variables I am using as input variables for the clustering have a lot of missing data. I used multiple imputation (MI) to impute the missing values, but on getting to know the procedure I learned that it is not aimed at providing a single completed dataset; instead it generates multiple datasets, for which pooled results are delivered by the analyses that support MI. However, the clustering procedures do not seem to support MI. Does anyone have an idea for a workaround that would let me use the imputed data in the clustering?
Thanks in advance,
Kind regards,
Maaike
Why necessarily use MI? Clustering is exploratory analysis, not
significance testing. SPSS has a decent single-imputation regression
method and, better, EM imputation in the Missing Value Analysis
procedure. Both are for quantitative features.
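To make the idea of single regression imputation concrete, here is a minimal sketch in plain Python. It is not what SPSS does internally; it only assumes one predictor `x` and one target `y` (both hypothetical names), fits a least-squares line on the complete pairs, and fills each missing `y` with its predicted value.

```python
from statistics import mean

def regression_impute(x, y):
    """Fill None entries of y with predictions from a line fitted on complete pairs."""
    # Use only the cases where both variables are observed.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    mx = mean(a for a, _ in pairs)
    my = mean(b for _, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Replace a missing y only when its predictor is available.
    return [intercept + slope * a if b is None and a is not None else b
            for a, b in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical predictor
y = [2.0, 4.0, None, 8.0, 10.0]        # hypothetical target with one gap
completed = regression_impute(x, y)
print(completed)
```

Because every imputed value sits exactly on the fitted line, this kind of imputation makes the data look more regular than they are, which is worth keeping in mind before clustering on the result.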
If some of the features to impute are categorical, you may use hot-deck imputation (find two macros for it at http://www.spsstools.net/en/KO-spssmacros).

But the most important thing is to decide whether to impute at all. Imputations are forgery, whatever they say. How much missing data do you have? You say "a lot". If it is above 20%, forget about imputation and analyse just the complete cases.

(Reply to the message posted by MaaikeSmits on 21.10.2015 at 18:32.)

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Clustering-procedure-tp5730814.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
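The 20% rule of thumb in the reply above can be checked before deciding anything. The sketch below is only an illustration in plain Python: the data are made up, missing values are coded as `None`, and applying the threshold to the share of incomplete cases (rather than to each variable) is my assumption, since the reply does not say which is meant.

```python
def missing_report(rows, threshold=0.20):
    """Return per-variable missing rates, the complete cases, and a threshold flag."""
    n = len(rows)
    n_vars = len(rows[0])
    # Share of missing values in each variable (column).
    per_variable = [sum(r[j] is None for r in rows) / n for j in range(n_vars)]
    # Listwise deletion: keep only cases with no missing value at all.
    complete = [r for r in rows if all(v is not None for v in r)]
    incomplete_share = 1 - len(complete) / n
    return per_variable, complete, incomplete_share > threshold

rows = [  # hypothetical data: 5 cases, 3 clustering variables
    (1.0, 2.0, 3.0),
    (None, 2.5, 3.1),
    (1.2, None, 2.9),
    (0.9, 2.1, 3.3),
    (1.1, 2.2, None),
]
per_variable, complete_cases, above_threshold = missing_report(rows)
print(per_variable, len(complete_cases), above_threshold)
```

Note how a modest missing rate per variable can still leave few complete cases once the gaps fall in different rows.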
In reply to this post by MaaikeSmits
--> If some of your features to impute are categorical
I meant to say: if the background variables (those the imputed variable depends on) are categorical. The variable being imputed can be categorical or quantitative.
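For readers unfamiliar with hot-deck imputation, here is a minimal sketch of the idea in plain Python. It is not the SPSS macro mentioned earlier in the thread; the field names `region` (categorical background variable) and `income` (variable being imputed) are hypothetical, and the random choice of donor is one of several possible donor rules.

```python
import random

def hot_deck(records, target, by, seed=0):
    """Fill missing `target` values with an observed value from the same `by` class."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    # Collect observed donor values within each class of the background variable.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            donors.setdefault(rec[by], []).append(rec[target])
    completed = []
    for rec in records:
        new = dict(rec)
        # Impute only when the class has at least one donor; otherwise leave missing.
        if new[target] is None and donors.get(new[by]):
            new[target] = rng.choice(donors[new[by]])
        completed.append(new)
    return completed

records = [  # hypothetical cases
    {"region": "north", "income": 30},
    {"region": "north", "income": None},
    {"region": "south", "income": 50},
    {"region": "south", "income": 55},
    {"region": "south", "income": None},
]
filled = hot_deck(records, target="income", by="region")
print(filled)
```

The imputed value is always a value actually observed in the same class, which is why the method works for categorical as well as quantitative targets.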
In reply to this post by Kirill Orlov
Also - I don't know how other people figure it, but I prefer imputation when the data are Missing At Random, and each missing value is unrelated to the fact of some other item being missing. That is mainly true when the missingness is accidental rather than meaningful in any sense. (And that is why I have never been enthusiastic about imputation... folks do it far too casually.)

If there is "a lot" of missing data, perhaps the first clustering ought to be done on Missing / Not missing. If there are one or two big reasons for the missingness, that should be sorted out at the start.

--
Rich Ulrich

(Reply to the message of Wed, 21 Oct 2015 20:56:37 +0300, subject "Re: Clustering procedure".)
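As a simpler first look than clustering on Missing / Not missing, one can tabulate the missingness patterns directly. This is a rough sketch in plain Python with made-up data and `None` as the missing code; each case is recoded to a 0/1 indicator per variable, and identical patterns are counted.

```python
from collections import Counter

def missingness_patterns(rows):
    """Count how often each missing (1) / not missing (0) pattern occurs."""
    return Counter(tuple(int(v is None) for v in r) for r in rows)

rows = [  # hypothetical data: 6 cases, 3 variables
    (1.0, 2.0, 3.0),
    (None, None, 3.1),
    (1.2, 2.4, 2.9),
    (None, None, 3.3),
    (1.1, 2.2, None),
    (None, None, 3.0),
]
patterns = missingness_patterns(rows)
for pattern, count in patterns.most_common():
    print(pattern, count)
```

If one or two patterns dominate, as the first two variables going missing together do here, that points to a shared reason for the missingness that is worth sorting out before any imputation.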