Urgent : Factor Analysis - Data reduction

Deepanshu Bhalla
CONTENTS DELETED
The author has deleted this message.

Re: Urgent : Factor Analysis - Data reduction

Art Kendall


In a perfect world, each item would load cleanly, i.e., highly on one factor and trivially on the others.

See the recent discussion on this list, which can be found at
 http://spssx-discussion.1045642.n5.nabble.com/Factor-Analysis-tp5707166.html
If you then have further questions, feel free to post again on this list with a more detailed description of your situation.

Art Kendall
Social Research Consultants

On 5/17/2012 2:07 PM, Deepanshu Bhalla wrote:
Hi Team

I ran a factor analysis on 48 variables. The scree plot suggests that 3-4 factors
are worth retaining.

I want to know the criteria for removing redundant variables.

If the loading for a variable comes up low, should we remove the variable?

In general, the loading comes up low for one factor but high for another
factor.

Can anyone please tell me what points we should keep in mind when making the
final decision to eliminate variables?

I am highly confused about whether a low or a high loading should be considered for
data reduction.

Thanks in advance !


--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Urgent-Factor-Analysis-Data-reduction-tp5711524.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Re: Urgent : Factor Analysis - Data reduction

Poes, Matthew Joseph
In reply to this post by Deepanshu Bhalla
See my answers below:

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Deepanshu Bhalla
Sent: Thursday, May 17, 2012 1:08 PM
To: [hidden email]
Subject: Urgent : Factor Analysis - Data reduction

Hi Team

I ran a factor analysis on 48 variables. The scree plot suggests that 3-4 factors are worth retaining.

I want to know the criteria for removing redundant variables.
*MP:  Redundant variables load highly on a factor but are also highly correlated with another variable on the same factor.  For me, a variable would need to be very highly correlated with another and have an equally high factor loading before I would consider removing it (unless I really needed to shorten a measure).  The reason is that you can think of this as triangulating a concept whose location is hard to pinpoint.  A fully redundant variable would be like two perfectly overlapping circles.  Since redundancy is rarely perfect, imagine two circles that mostly overlap (say 85%).  That slight difference doesn't hurt you, because the final analysis uses the factor rather than the individual variables (i.e., no loss of DF), and it helps a little: not a lot, but enough to retain the variable if you have it.
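
A minimal sketch of the redundancy check described above, in plain Python rather than SPSS syntax. The item names, responses, loadings, and both cutoffs (0.85 for the correlation, 0.7 for the loading) are made up for illustration and are not values recommended in this thread. It flags a pair only when both items load highly on the same factor and are highly correlated with each other.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def redundant_pairs(data, loadings, min_r=0.85, min_loading=0.7):
    """Pairs of items that both load highly and correlate highly with each other."""
    flagged = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if (abs(r) >= min_r
                and abs(loadings[a]) >= min_loading
                and abs(loadings[b]) >= min_loading):
            flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical responses for three items assumed to sit on one factor.
data = {
    "q1": [1, 2, 3, 4, 5, 6],
    "q2": [1, 2, 3, 4, 6, 5],  # nearly a copy of q1
    "q3": [2, 1, 4, 3, 6, 2],
}
# Hypothetical loadings of those items on that factor.
loadings = {"q1": 0.82, "q2": 0.80, "q3": 0.55}

pairs = redundant_pairs(data, loadings)
print(pairs)
```

A flagged pair is only a candidate for removal; as noted above, keeping a mostly overlapping item costs little when the factor, not the item, goes into the final analysis.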

If the loading for a variable comes up low, should we remove the variable?
*MP:  This is up to you, but I would remove it.  People often use .2 or .3, and some even use .5, as the minimum loading for keeping a variable on a factor.
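
As a rough illustration of applying such a minimum loading, the Python sketch below lists items whose largest absolute loading on any retained factor falls under a chosen cutoff. The loading matrix, factor names, and item names are hypothetical; in practice the numbers would come from the rotated loading table in the FACTOR output.

```python
def weak_items(loadings, retained, cutoff=0.3):
    """Items whose largest absolute loading on the retained factors is below cutoff."""
    weak = []
    for item, row in loadings.items():
        best = max(abs(row[f]) for f in retained)
        if best < cutoff:
            weak.append(item)
    return weak

# Hypothetical loading table: item -> {factor: loading}.
loadings = {
    "q1": {"F1": 0.72, "F2": 0.10, "F3": 0.05},
    "q2": {"F1": 0.15, "F2": 0.64, "F3": -0.08},
    "q3": {"F1": 0.21, "F2": 0.18, "F3": 0.12},
    "q4": {"F1": 0.05, "F2": -0.02, "F3": -0.45},
}
retained = ["F1", "F2", "F3"]

lenient = weak_items(loadings, retained, cutoff=0.3)
strict = weak_items(loadings, retained, cutoff=0.5)
print(lenient)  # candidates for removal at a .3 cutoff
print(strict)   # candidates for removal at a .5 cutoff
```

The absolute value matters because a strong negative loading (q4 here) is as informative as a strong positive one; the stricter cutoff removes more items.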

In general, the loading comes up low for one factor but high for another.
*MP:  If there are in fact unique dimensions that reflect different latent constructs, then the variables associated with each construct should be associated with one factor and not the others.  This means a variable will have a high loading on the factor it belongs to and low loadings on the other factors.
*MP:  If I retain 3 factors, and a variable loads low on those 3 but high on a 4th that I am not retaining, then I would consider dropping that variable and ignore its high loading, since that construct is not being kept.  Remember, the reason to drop the factor is that it could not account for more variance than its individual variables could on their own, so a variable loading well on it does not make the factor useful.  Just include that single variable in a model on its own.
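
A small Python sketch of the two ideas above, under stated assumptions. For an unrotated principal components solution on standardized variables, the variance a factor accounts for is the sum of its squared loadings, and each variable contributes a variance of 1. Reading "more variance than its individual variables" as the common eigenvalue-greater-than-1 rule is my assumption, not something the thread states, and the loadings are invented.

```python
def factor_variance(loadings):
    """Sum of squared loadings per factor (variance accounted for by that factor)."""
    factors = list(next(iter(loadings.values())))
    return {f: sum(row[f] ** 2 for row in loadings.values()) for f in factors}

def items_on_dropped_factors(loadings, retained):
    """Items whose strongest (absolute) loading is on a factor that was not retained."""
    flagged = []
    for item, row in loadings.items():
        best = max(row, key=lambda f: abs(row[f]))
        if best not in retained:
            flagged.append(item)
    return flagged

# Hypothetical unrotated loadings: item -> {factor: loading}.
loadings = {
    "q1": {"F1": 0.80, "F2": 0.10},
    "q2": {"F1": 0.70, "F2": 0.20},
    "q3": {"F1": 0.75, "F2": 0.10},
    "q4": {"F1": 0.10, "F2": 0.90},
}

variance = factor_variance(loadings)
# Assumed retention rule: keep factors that account for more than one variable's worth of variance.
retained = [f for f, v in variance.items() if v > 1.0]
orphans = items_on_dropped_factors(loadings, retained)

print({f: round(v, 4) for f, v in variance.items()})
print(retained, orphans)
```

Here F2 falls below the cutoff even though q4 loads strongly on it, so q4 is the kind of item that would be dropped from the factor solution and, if it matters substantively, entered into a model on its own.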

Can anyone please tell me what points we should keep in mind when making the final decision to eliminate variables?
*MP:  While factor analysis is a statistical approach to data reduction, you must keep in mind that it's of little use if the factors it creates are nonsense.  Some people will argue that the factors should fit with theory.  This may be true, but it assumes that the questions used to create the factors fit with theory, and that may not be the case.  My recommendation in this exploratory stage of construct development is to seriously consider what the constructs reflect, based on their component variables and loading values.  Variables with higher loading values are more dominant in the factor; they reflect the centroid more explicitly, if you will.  If two or three variables each ask something similar and have very high loadings, this may tell you that the latent construct best reflects what these variables are getting at.

I am highly confused about whether a low or a high loading should be considered for data reduction.
*MP:  Think of them as correlations with the latent construct, as that's what they are.  The latent construct reflects the homogeneous concept that the variables reflect as a whole.  The correlation is to a central point that the variables collectively create (think of triangulation in geometry here).  Once this point is established, a distribution can be assigned and its correlation with each component variable tested.  This becomes the variable's loading.
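
If loadings are read as correlations, then squaring one gives the share of a variable's variance that it has in common with that factor, and, assuming the factors are orthogonal (uncorrelated), summing the squares across factors gives the variable's communality. The Python sketch below only illustrates that arithmetic; the loadings and item names are hypothetical.

```python
def shared_variance(loadings):
    """Squared loading per item and factor: proportion of item variance shared with the factor."""
    return {item: {f: v ** 2 for f, v in row.items()}
            for item, row in loadings.items()}

def communalities(loadings):
    """Sum of squared loadings per item (valid as a communality for orthogonal factors)."""
    return {item: sum(v ** 2 for v in row.values())
            for item, row in loadings.items()}

# Hypothetical loadings on two orthogonal factors.
loadings = {
    "q1": {"F1": 0.80, "F2": 0.30},
    "q2": {"F1": 0.20, "F2": 0.10},
}

shared = shared_variance(loadings)
comm = communalities(loadings)
for item in loadings:
    print(item, {f: round(v, 2) for f, v in shared[item].items()}, round(comm[item], 2))
```

A loading of .8 means roughly two thirds of that item's variance is shared with the factor, while a loading of .2 means only about 4%, which is why low-loading items add so little to a factor.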

Thanks in advance !

