I have run a factor analysis (PCA, to be precise) for a new scale I am creating, and the scree plot appears to suggest I retain 4-5 components on the two scales. However, the final scales don't make coherent conceptual sense. Given that there are many items on the scale - 40 items, two to three items make sense... My main question is, is it possible in the circumstance to arbitrarily move items to the components where they make the most conceptual sense?
|
Administrator
|
"My main question is, is it possible in the circumstance to arbitrarily move items to the components where they make the most conceptual sense? "
Why even bother running a factor analysis in the first place if you are going to contemplate doing this? What are the reliabilities of the 4-5 scales created from the items loading on the components? --
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Thanks David. The reliabilities of all 4-5 sub-scales are between .84 and .9...
The question is how best to get rid of the superfluous items... |
Administrator
|
In reply to this post by Promises
Disclaimer: I am not an expert on factor analysis.
Having said that, I'll now rush in where angels fear to tread. The following statements are excerpts from the article I've attached to this message. (Those who are using the UGA mailing list will have to view it in the Nabble archive to get the attachment. The journal is now defunct, which is why I don't have any qualms about uploading the article here--it's very difficult to get hold of articles from that journal these days.)

"[Exploratory factor analysis] is a method of identifying unobservable [latent variables] that account for the (co)variances among [measured variables]."

"The utility of PCA, on the other hand, lies in data reduction. PCA yields observable composite variables (components), which account for a mixture of common and unique sources of variance (including random error)."

Q. What are you trying to do? Are you looking for latent variables, or are you interested in data reduction? My suspicion is that you are looking for latent variables. If so, you should not be using PCA.

You've not said what rotation method you're using. Here are some more excerpts to consider if you've gone with an orthogonal rotation of some sort.

"One of Thurstone’s (1935, 1947) major contributions to factor analysis methodology was the recognition that factor solutions should be rotated to reflect what he called simple structure to be interpreted meaningfully.[7] Simply put, given one factor loading matrix, there are an infinite number of factor loading matrices that could account for the variances and covariances among the MVs [i.e., measured variables] equally as well. Rotation methods are designed to find an easily interpretable solution from among this infinitely large set of alternatives by finding a solution that exhibits the best simple structure. Simple structure, according to Thurstone, is a way to describe a factor solution characterized by high loadings for distinct (non-overlapping) subsets of MVs and low loadings otherwise."

"In general, if the researcher does not know how the factors are related to each other, there is no reason to assume that they are completely independent. It is almost always safer to assume that there is not perfect independence, and to use oblique rotation instead of orthogonal rotation.[8] Moreover, if optimal simple structure is exhibited by orthogonal factors, an obliquely rotated factor solution will resemble an orthogonal one anyway (Floyd & Widaman, 1995), so nothing is lost by using oblique rotation. Oblique rotation offers the further advantage of allowing estimation of factor correlations, which is surely a more informative approach than assuming that the factors are completely independent."

HTH.

Preacher_2003_Repairing_Tom_Swifts_electric_factor_analysis_machine.pdf
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
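For anyone who wants to see what this advice looks like in syntax, here is a minimal sketch of a principal axis factoring run with an oblique rotation, along the lines of the excerpts quoted above. It is only a sketch: the variable names item01 TO item40 are placeholders rather than the actual items, and the extraction criteria are left at fairly generic settings.

* Sketch only: principal axis factoring with an oblique (oblimin) rotation.
* item01 TO item40 are hypothetical placeholders for the real item names.
FACTOR
  /VARIABLES item01 TO item40
  /MISSING LISTWISE
  /ANALYSIS item01 TO item40
  /PRINT INITIAL EXTRACTION ROTATION
  /FORMAT SORT BLANK(.30)
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1) ITERATE(50)
  /EXTRACTION PAF
  /ROTATION OBLIMIN.

With an oblique rotation, FACTOR reports a pattern matrix, a structure matrix, and a factor correlation matrix; the last of these is what tells you how far the factors actually depart from independence.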
In reply to this post by Promises
Questions. Are these scales of items that were rated?
What is the range of the scores? What is the typical intercorrelation? What is the total N?

For rating scales, PFA is more the standard than PCA when you want to find what variance is held in common. PCA gives too many separate factors. Dichotomous variables give too many factors, with a bias for joining variables with similar skew.

The loadings you look at are the ones in the factor structure matrix, which has the item-to-factor correlations. The matrix before rotation will yield the largest factor as the overall total score, reflecting the generally positive correlations among items on a scale. The second factor will be "bipolar" in the sense of tending to have one set of high correlations that are positive, and another set that are negative.

Almost everyone prefers to obtain scales from a rotated solution. A varimax rotation usually does a pretty good job of assigning items to scales, using a cutoff like 0.35 or 0.45. An item that loads on more than one factor is usually placed in one or the other, not both. State clearly whether you use the rule of "higher loading" or "going to the shorter list".

When you construct scores in the usual way, by averaging the items with high loadings, the factors you score up *will* be correlated, even though varimax is an "orthogonal rotation" of the raw solution. In my experience, an oblique solution like Promax or Oblimin presents many more loadings of higher value, without making the structure any "simpler" or more obvious.

-- Rich Ulrich |
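A sketch of the kind of run Rich describes - principal axis factoring with a varimax rotation, loadings sorted and blanked below a .35 cutoff so the item-to-scale assignment is easy to read. The item names and the forced FACTORS(4) are placeholders, not values taken from this thread.

* Sketch only: PAF with varimax; loadings below .35 are blanked in the display.
* Replace item01 TO item40 and FACTORS(4) with your own items and retained number.
FACTOR
  /VARIABLES item01 TO item40
  /MISSING LISTWISE
  /PRINT INITIAL EXTRACTION ROTATION
  /FORMAT SORT BLANK(.35)
  /CRITERIA FACTORS(4) ITERATE(50)
  /EXTRACTION PAF
  /ROTATION VARIMAX.

Note that BLANK() affects only the display, not the solution; it simply suppresses small loadings so the simple structure (or the lack of it) is easier to see.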
In reply to this post by Promises
Once you choose variables, it's confirmatory FA.
Try the Amos add-in.

Best, Diana |
In reply to this post by Promises
If you are creating scales, you would want to use principal axis factoring.

The idea behind summative scales is that the mean (or total) of a set of imperfect repeated measures is a better operationalization of the underlying construct than any of the individual items would be. The variance of items has three parts: (a) the part that is in common with what the other items are measuring, (b) the part that represents information specific to the item, and (c) noise (a.k.a. error). PCA tries to account for all three; PAF tries to account for only (a).

When you are creating scales you typically want to use varimax rotation in order to maximize discriminant validity. You want to look at the rotated factor loadings.

These days it is good practice to use parallel analysis as the basis for deciding, at least roughly, how many factors to retain. https://people.ok.ubc.ca/brioconn/nfactors/nfactors.html

YMMV, but over the years since parallel analysis became available, and in going back to factor analyses I did in the early 70's, I find that the final number of factors retained is about where the obtained eigenvalues are at least 1.00 (one item's worth of variance) more than what would occur in random data. [It would be interesting to hear from other members of this list about what the difference was between the obtained eigenvalues and those from the parallel analysis for the number of factors they finally retained.]

-----

Please explain your situation in more detail. What constructs are you trying to create scales to represent? You say 2 scales but mention 4-5 components. What is your syntax? How many cases do you have? How were they selected? How many items do you have? What is their response scale? Did you put items that were supposed to represent 2 constructs in the same factor analysis? What do you mean by "Given that there are many items on the scale - 40 items, two to three items make sense"?

-----

"possible in the circumstance to arbitrarily move items to the components where they make the most conceptual sense?" NO. Moving items between factors is directly contradictory to the purpose of using factor analysis in scale construction.

You start out with a pool of items that you write to measure some constructs. Optimally, items are balanced so they represent each end of bipolar constructs, or low versus high on unipolar constructs. You are guessing (hoping) that in the responses the items will hang together (group, clump, etc.) in a way that represents the constructs. Some items will not "work", i.e., they will not load (hang with the other items) on a factor. Some will split. Some will load with too few items to make a scale.

In the factor analysis, you find groups of items that hang together (i.e., seem to be measuring something in common). If you have a solid analysis and those groupings are not consistent with what you anticipated, you need to reconsider your theorizing. Of course, first be sure that you have a solid analysis.
Art Kendall
Social Research Consultants |
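The O'Connor programs at the link Art gives generate eigenvalues from random data with the same numbers of cases and variables. The obtained eigenvalues to set beside them are the initial eigenvalues from a run such as this sketch (item names are placeholders):

* Sketch only: print the initial eigenvalues and scree plot so they can be
* compared with the random-data eigenvalues from a parallel analysis.
FACTOR
  /VARIABLES item01 TO item40
  /MISSING LISTWISE
  /PRINT INITIAL
  /PLOT EIGEN
  /EXTRACTION PC
  /ROTATION NOROTATE.

Factors whose obtained eigenvalues clearly exceed the corresponding random-data eigenvalues - by at least 1.00, in Art's experience - are the ones worth carrying forward to a rotated solution.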
In reply to this post by Rich Ulrich
Thanks guys.
My interest is item reduction, so I'm using PCA. Items are rated on a 5-point Likert scale; there are 40 items on the instrument, with a sample size of 267 (256-267 across items in the PCA). In the first iteration I used varimax rotation but later used oblimin... the correlations are quite good, with many above .3.

With a forced 3 components (using oblimin, on the assumption that the items should correlate - but one component was inversely correlated at the end of the analysis, which suggests I could maintain the use of a varimax rotation?) the scale appears to be clearly 3 components, but still the items don't bunch together in a way that makes conceptual sense.

What's your advice on what should be done? |
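Art asks below what syntax was used. A run matching the description above (principal components, three forced components, oblimin rotation) would look roughly like this sketch; the item names are placeholders, and PRINT UNIVARIATE is included only so the per-item valid N (256-267) shows up in the output.

* Sketch only: forced 3-component PCA with oblimin, roughly matching the
* analysis described above; item01 TO item40 are hypothetical names.
FACTOR
  /VARIABLES item01 TO item40
  /MISSING LISTWISE
  /PRINT UNIVARIATE INITIAL EXTRACTION ROTATION
  /FORMAT SORT BLANK(.30)
  /CRITERIA FACTORS(3) ITERATE(50)
  /EXTRACTION PC
  /ROTATION OBLIMIN.

The component correlation matrix at the end of the output gives the sign and size of the correlations among the rotated components; a component that comes out "inversely correlated" usually just has its loadings reflected, which by itself is not a reason to abandon the oblique rotation.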
What do you mean by "item reduction"? Are you trying to shorten the scoring key of a well-established instrument? I.e., are you trying to measure a construct with fewer items? Or do you have a new pool of items being considered as candidates to measure some constructs? How many constructs? What is the response scale of your items? (There is variability in the use of the term "Likert".) Are the item stems balanced for direction?

If you are dealing with a scale, why would you want to account for the item-specific variance in addition to the common variance of the items? If I just wanted to shorten the scoring key, I would consider using "alpha if item deleted" in RELIABILITY, while trying to maintain the balance of items from each end of the construct. No matter what rotation you use, you do not want to include any item on more than one construct in the scoring key.

Is the bunching you describe derived from the rotated factor loadings? How do your eigenvalues compare to those from a parallel analysis? Since you have a relatively small set of cases, be sure to keep that in mind when you look at the conceptual interpretation.

After you consider the posts on this list, what syntax do you end up with?
Art Kendall
Social Research Consultants |
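Art's "alpha if item deleted" suggestion corresponds to RELIABILITY with /SUMMARY=TOTAL. The item list below is purely illustrative - it stands for whichever items load on one factor and are candidates for a shortened scoring key.

* Sketch only: item-total statistics, including Cronbach's alpha if item
* deleted, for one hypothetical subscale.
RELIABILITY
  /VARIABLES=item01 item04 item09 item15 item22 item31
  /SCALE('Subscale A') ALL
  /MODEL=ALPHA
  /SUMMARY=TOTAL.

Items whose removal leaves alpha essentially unchanged (or improves it) are natural candidates to drop, subject to keeping the content coverage and the direction of the item stems balanced.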
In reply to this post by Promises
Yes, trying to measure a construct using fewer items...
|
But is it a new instrument for which you have a pool of candidate items? Or are you trying to use fewer items on a well-established scale?
Art Kendall
Social Research Consultants |
In reply to this post by Promises
it is a new instrument...
|
Administrator
|
In reply to this post by Art Kendall
AK: "When you are creating scales you typically want to use varimax rotation in order to maximize discriminative validity. You want to look at the rotated factor loadings."
I'm intrigued by Art's statement. First, I repeat my earlier disclaimer: I am not an expert on factor analysis. However, the Preacher and MacCallum article I uploaded earlier in this thread (http://spssx-discussion.1045642.n5.nabble.com/file/n5723166/Preacher_2003_Repairing_Tom_Swifts_electric_factor_analysis_machine.pdf) is very clear in saying that one should at least start with oblique rotation:

"In general, if the researcher does not know how the factors are related to each other, there is no reason to assume that they are completely independent. It is almost always safer to assume that there is not perfect independence, and to use oblique rotation instead of orthogonal rotation.[8] Moreover, if optimal simple structure is exhibited by orthogonal factors, an obliquely rotated factor solution will resemble an orthogonal one anyway (Floyd & Widaman, 1995), so nothing is lost by using oblique rotation. Oblique rotation offers the further advantage of allowing estimation of factor correlations, which is surely a more informative approach than assuming that the factors are completely independent."

Presumably, if all of the correlations among the factors are insubstantial, one could then simplify the model a bit by reverting to an orthogonal rotation. It seems to me that using an orthogonal rotation method when oblique rotation reveals substantial correlations among the factors is similar in spirit to running an ANCOVA model that forces the lines to be parallel when in fact they clearly are not parallel. In that case, one ought to include the Group x Covariate product term in the model, thus allowing the lines to depart from parallel.

Cheers, Bruce
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ |
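In practical terms, Bruce's suggestion amounts to a two-step check, sketched below with placeholder item names and an arbitrary factor count: run the oblique rotation first, inspect the factor correlation matrix, and fall back to varimax only if those correlations are trivially small.

* Sketch only, step 1: oblique rotation; inspect the factor correlation
* matrix in the output.
FACTOR
  /VARIABLES item01 TO item40
  /MISSING LISTWISE
  /PRINT EXTRACTION ROTATION
  /FORMAT SORT BLANK(.30)
  /CRITERIA FACTORS(4) ITERATE(50)
  /EXTRACTION PAF
  /ROTATION OBLIMIN.
* Step 2 (only if the factor correlations are all near zero): rerun the same
* command with /ROTATION VARIMAX for the simpler orthogonal solution.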
It is true that a set of items may have underlying dimensions that are correlated: self-esteem, self-image, self-efficacy, etc. may be correlated in the self-perception of the population. If one is developing theory, the fact that different interpreted constructs are correlated is not a problem.

Note the third word ("if"): "In general, if the researcher does not know how the factors are related to each other..."

However, in developing scales to measure constructs, one wants an item to be used in only 1 scale on the scoring key. One specifically wants items to measure only one thing. In writing items, the idea is to exclude "double-barreled" items. If an item is included in two or more places on the scoring key, that induces an artefactual correlation between measures. It can be particularly problematic when designing experiments to tease out theory by manipulations, when the same item is on measures on both the independent-variable side and the dependent-variable side.

For example, in the general area of liberalism-conservatism, different scales were giving conflicting or weak results. Lorr found that the construct was overly broad. There were three parts: general liberal-conservatism, egalitarianism, and sexual freedom. Developing scales that measured these separately led to better formulation of hypotheses on social issues. When I used the scales in examining correlates of lib-con with elite perceptions of presidential candidates in the 1976 US Presidential election, I added items about then-current issues at the request of colleagues. It turned out that an item, "Equal rights for gay people", was not a good candidate item for an updated scale because it split. Positions on social issues might be influenced by any of the dimensions separately.

In my view it is sometimes important not to reduce a construct to a single variable. So for the general concept of lib-con I would use three variables. For the construct of physical distance, for many purposes I might need different sets of multiple variables depending on the application. To look at a map I might use lat-long; to aim a communication laser I might need direction, distance, and declination. For planning health facilities, it might be more important to look at travel time, travel cost, and subjective distance.

For exploratory studies on social issues, I would use all three measures in the lib-con set. In a stepped (hierarchical steps, not the nefarious stepwise) regression they might be entered as a set. For an experiment in specific areas I might use only 1 of the measures, depending on the substantive context. For an exploratory study on health resource use, I might want to include multiple measures of "distance": public transit time and expense, driving time, driving distance, subjective impressions of neighborhood safety, Euclidean distance, Minkowski distance, etc.
Art Kendall
Social Research Consultants |
In reply to this post by Bruce Weaver
In addition, when many items in a candidate pool are splitting in an orthogonal solution, that can often be a clue that the general construct is in need of being re-thought. In many ways that is much the same as finding that a non-orthogonal solution fits a set of items.

Having a large candidate pool is a form of humility. It shows that we acknowledge that each of the items is an imperfect measure, for two reasons: (1) its substantive nature and (2) the writer's less-than-perfect writing ability. Sometimes the solution to an item that splits or otherwise does not work out is to write it more clearly in the next draft of the data-gathering instrument.

Gee, developing theory, developing scales, and writing items have something in common with writing syntax! All four benefit from input from other people and a series of drafts.
Art Kendall
Social Research Consultants |
In reply to this post by Promises
I figure that I am a better expert than the one that Bruce is citing,
and I never liked oblique rotations for rating scales. My earlier remarks still hold. If you are using oblique, set the parameter to allow a smaller amount of correlation if you see "too much overlap."

However, your chief problem is that you are using PCA instead of PFA. PCA breaks things up too much in order to preserve "unique" variance. The sample size of 267 would be large enough if there were a well-defined structure, but "too few cases" could also contribute to the lack of apparent structure. The default will be that every case with a missing item is DROPPED, so the N in your analysis might be smaller than you think. (Does FACTOR still fail to tell us the N?)

If PFA (iterating on the communalities) with varimax does not give any semblance of distinct factors, the problem might be that the items were poorly worded or the response choices don't fit very well. Or the universe of items is not especially relevant to this sample: correlations reflect relationships /in a sample/.

You do want to look at shared variance, not unique variance -- unless you know that you have some unique, important items that have little to do with the other items. But if that were the case, you probably should pull out those items (and figure out a special way to use them) and factor the rest.

If you are bothered by the "direction" of one factor, simply change the sign of all the loadings. That non-issue was the topic of a few posts here a couple of weeks ago.

-- Rich Ulrich |
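On Rich's point about missing data: FACTOR uses listwise deletion unless told otherwise, so a quick way to see how many of the 267 cases actually enter the analysis is to count complete cases first. The item names are, again, placeholders.

* Sketch only: how many cases have no missing values on any of the 40 items?
* The number of cases with nmiss = 0 is the listwise N that FACTOR will use.
COUNT nmiss = item01 TO item40 (MISSING).
FREQUENCIES VARIABLES=nmiss.

If too many cases are being lost, /MISSING PAIRWISE or /MISSING MEANSUB on the FACTOR command are possible (if imperfect) alternatives to listwise deletion.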