
Re: Question on INCLUDE instruction when managing missing data in a FACTOR ANALYSIS

Posted by Rich Ulrich on Dec 06, 2018; 6:14pm
URL: http://spssx-discussion.165.s1.nabble.com/Question-on-INCLUDE-instruction-when-managing-missing-data-in-a-FACTOR-ANALYSIS-tp5737094p5737108.html

Bruce,
 - those are worthwhile comments.

I wish I had said
> Mean-substituting is not /terrible/ for MAR... for factor analysis.

The choice may be "conservative results" versus "results based on artifacts."

And I did say, Do the factoring two ways and compare the results.  A problem
with the other replacement methods, for factor analysis, is that the
replacement algorithm itself shapes the factor results.

MISSINGs create other problems for inference and testing, even when you
meet the assumptions of Missing at Random. And I don't like to trust that
Missings are at random.

If you replace a large number of Missings, the d.f. for tests may be
over-estimated by too much, because the replaced values are counted as if
they had been observed.

Also, in a sample of size N with k replacements, you are messing with an
(expected) k/N share of the variance (though, you hope, the "mess" is small
for each case). That k/N suggests that, in an ANOVA setting, an R-squared
near 1.0 is much more disrupted than an R-squared near zero.  Compare the
fraction k/N to an error variance of the underlying data that is 5% of the
total, versus the case where it is 95%. Roughly speaking.
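
One rough way to put numbers on that comparison is sketched below. It is
only an illustration of the arithmetic, not a formal result: the values
k = 5 and N = 100 are hypothetical, and the helper name
share_relative_to_error is made up for this sketch. It divides the replaced
share k/N by the error share of the variance, taken here as 1 - R-squared.

```python
def share_relative_to_error(k, n, r_squared):
    """Replaced share k/n expressed as a multiple of the error share (1 - R^2).

    A rough yardstick only: it compares the fraction of cases that were
    replaced with the fraction of variance that is error.
    """
    error_share = 1.0 - r_squared
    return (k / n) / error_share


# Hypothetical numbers: 5 replaced values in a sample of 100.
k, n = 5, 100

# Error variance is 5% of the total (R-squared near 1.0).
strong = share_relative_to_error(k, n, r_squared=0.95)

# Error variance is 95% of the total (R-squared near zero).
weak = share_relative_to_error(k, n, r_squared=0.05)

print(f"k/N = {k / n:.2f}")
print(f"relative to a  5% error share: {strong:.3f}")
print(f"relative to a 95% error share: {weak:.3f}")
```

With these made-up numbers the replaced share is about as large as the whole
error share in the first case, and only about a twentieth of it in the
second.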

When there is a bunch missing, you really need to be careful, and I don't
think there's a single answer that fits all cases for multivariate data.

--
Rich Ulrich


From: SPSSX(r) Discussion <[hidden email]> on behalf of Bruce Weaver <[hidden email]>
Sent: Thursday, December 6, 2018 10:19 AM
To: [hidden email]
Subject: Re: Question on INCLUDE instruction when managing missing data in a FACTOR ANALYSIS
 
Rich Ulrich wrote
> --- snip ---
> Mean-substituting is not /terrible/ for MAR. 
> --- snip ---

Rich, John Graham (well known author on missing data) would not agree with
you.  This is an excerpt from his book, Missing Data - Analysis and Design
(p. 51).

--- start of excerpt ---
Mean substitution is a strategy in which the mean is calculated for the
variable based on all cases that have data for that variable.  This mean is
then used in place of any missing value on that variable.

This is the worst of all possible strategies. Inserting the mean in place of
the missing value reduces variance on the variable and plays havoc with
covariances and correlations. Also, there is no straightforward way to
estimate standard errors.  Because of all the problems with this strategy, I
believe that using it amounts to nothing more than pretending that no data
are missing.  I recommend that people should NEVER use this procedure.  If
you absolutely must pretend that you have no missing data, a much better
strategy, and one that is almost as easy to implement, is to impute a single
data set from EM parameters (see Chaps. 3 and 7) and use that.
--- end of excerpt ---
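
The two effects named in the excerpt (reduced variance, disturbed
correlations) can be seen in a small simulation. This is a minimal sketch
and not anything from Graham's book: the data are simulated, the seed, the
sample size, the 30% missing rate and the 0.6 population correlation are
arbitrary assumptions, and values are deleted completely at random, which
is the most favourable case for any replacement method.

```python
import random


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(2018)  # arbitrary seed, only for reproducibility
n_cases = 500              # assumed sample size

# Simulated pair of variables with a population correlation of 0.6.
y = [rng.gauss(0, 1) for _ in range(n_cases)]
x_full = [0.6 * yi + 0.8 * rng.gauss(0, 1) for yi in y]

# Delete about 30% of x completely at random (None marks a missing value).
x = [None if rng.random() < 0.30 else xi for xi in x_full]

# Cases where x was actually observed.
observed = [(xi, yi) for xi, yi in zip(x, y) if xi is not None]
x_obs = [xi for xi, _ in observed]
y_obs = [yi for _, yi in observed]

# Mean substitution: every missing x gets the mean of the observed x values.
x_mean = mean(x_obs)
x_meansub = [x_mean if xi is None else xi for xi in x]

var_obs = sample_variance(x_obs)
var_meansub = sample_variance(x_meansub)
r_obs = pearson_r(x_obs, y_obs)
r_meansub = pearson_r(x_meansub, y)

print(f"variance of x, observed cases only: {var_obs:.3f}")
print(f"variance of x, mean-substituted:    {var_meansub:.3f}")
print(f"r(x, y), observed cases only:       {r_obs:.3f}")
print(f"r(x, y), mean-substituted:          {r_meansub:.3f}")
```

The substituted values sit exactly at the mean, so they add cases without
adding any sum of squares or cross-products for x. That is why the variance
of x shrinks and the correlation with y is pulled toward zero relative to
the estimate from the observed cases.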

Here is a PDF of the book--it is on the Springer website, so would seem not
to be violating copyright.

https://link.springer.com/content/pdf/10.1007%2F978-1-4614-4018-5.pdf

Another nice (shorter) resource by Graham is his 2009 Annual Review of
Psychology chapter:

https://www.personal.psu.edu/jxb14/M554/articles/Graham2009.pdf

HTH.




-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD