|
I am an undergraduate working on my senior research project. I am studying the construct validity of a critical thinking test using factor analysis. The test is a 52-item multiple-choice test (a, b, or c) with 6 subtests. According to the developer, the test measures 6 aspects of critical thinking. As the developer has a background in philosophy, it seems relevant to validate the test using quantitative methodology and determine whether the test actually measures 6 distinct constructs.
Here’s where I’m at. I have a large sample (N=1000) and have coded all of the individual responses of each case (a=1, b=2, c=3). I originally thought that examining the variances in the actual responses might reveal a factor structure, but I now realize that since the test is not uniform (“a” does not mean the same thing throughout the test), that would be inappropriate. Thus, I have transformed the responses to correct or incorrect (0=incorrect, 1=correct). It seems logical that if individuals were deficient in a particular area of the hypothesized critical thinking construct, then their incorrect response variances would emerge in a factor structure (and vice versa). Now, however, I am having reservations about the appropriateness of conducting a factor analysis on binary data. I have searched many forums and sought much advice, and I seem to get a lot of contradictory information. My advisor thinks that factor analysis is still the way to go, but I am not so sure. Any advice on this matter would be greatly appreciated. |
|
You may want to see a previous exchange in this same forum, "RE: Factor Analysis and dichotomous data", on October 31. Below I transcribe an answer I gave in that exchange, where a colleague asked whether factor analysis can be applied to dichotomous items, and whether categorical factor analysis is better than classical factor analysis.

Hector Maletta wrote on October 30, 2008:

"First of all, for dichotomous data CATPCA and classical FA give the same results. This is because CATPCA works by assigning optimum numerical values to each category of categorical variables, but for a dichotomy any pair of numerical values is equivalent to any other pair, because the variable has only two possible values and thus only one interval will ever be observed." [I add now, to further clarify this point, that Factor Analysis works on standardized variables, i.e. on Z-scores, and thus the actual absolute values are irrelevant. November 19, 2008].

"Second, you can compute linear correlation coefficients between dichotomous variables, which can rigorously be treated as (discrete) interval-scale variables. The phi association coefficient is equivalent to the linear correlation coefficient when both variables are dichotomous. A matrix of linear correlation coefficients is enough to compute a factor analysis solution."

"Third, linear REGRESSION is not entirely appropriate for dichotomous data, since predicted values can be fractional, and fall either inside or outside the interval [0, 1], while observed values can only be 0 or 1 [OR WHICHEVER OTHER ABSOLUTE VALUES ARE USED. NOTE ADDED NOV 19, 2008]. Moreover, residuals (actual values minus predicted values) would not usually have a normal distribution around predicted values: if the predicted value is a fraction between 0 and 1, such as 0.40, all observed values will be at the extremes or "tails" of the residual distribution, at values 0 and 1, and no observed value will be in the vicinity of the predicted value, i.e. the residual distribution will be almost the exact opposite of a normal distribution. Since a normal distribution of residuals is an assumption of linear regression, you may not sustain certain consequences or inferences of linear regression if you use dichotomous predictors. By the way, this affects the very common habit of using dummies, e.g. in econometrics. But few people give this problem a second thought. Dummies are used everywhere in regression, objectionable as this might be to purists. If you ever used a split infinitive, you are daring enough to use dummies in regression."

Hope this helps.
Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of nswillie
Sent: 20 November 2008 15:53
To: [hidden email]
Subject: Factor Analysis--Senior Thesis Help

View this message in context: http://www.nabble.com/Factor-Analysis--Senior-Thesis-Help-tp20606710p20606710.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
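The second point in the answer above (phi equals the linear correlation for two dichotomous items, and the particular numeric codes do not matter) can be checked numerically. What follows is only a minimal sketch in Python, not part of the original exchange: the two response vectors are made-up illustrative data, and the function names `pearson_r` and `phi_coefficient` are hypothetical helpers written for this example.

```python
import math


def pearson_r(x, y):
    """Ordinary Pearson (linear) correlation between two numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def phi_coefficient(x, y):
    """Phi computed from the 2x2 table of two items coded 0/1."""
    n11 = sum(1 for pair in zip(x, y) if pair == (1, 1))
    n10 = sum(1 for pair in zip(x, y) if pair == (1, 0))
    n01 = sum(1 for pair in zip(x, y) if pair == (0, 1))
    n00 = sum(1 for pair in zip(x, y) if pair == (0, 0))
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    return numerator / denominator


# Made-up responses of ten examinees to two items (1 = correct, 0 = incorrect).
item_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
item_b = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]

# The same responses under an arbitrary alternative coding
# (incorrect = -5, correct = 7), to show that only the dichotomy matters.
recoded_a = [7 if v == 1 else -5 for v in item_a]
recoded_b = [7 if v == 1 else -5 for v in item_b]

print("phi:", phi_coefficient(item_a, item_b))
print("Pearson r on 0/1 codes:", pearson_r(item_a, item_b))
print("Pearson r on -5/7 codes:", pearson_r(recoded_a, recoded_b))
```

All three printed values should agree up to floating-point rounding, which is why a matrix of ordinary correlations on 0/1 items is a matrix of phi coefficients. Note that the agreement under recoding assumes both items keep the same direction (the larger code means "correct" for both); reversing one item would flip the sign.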
|
In reply to this post by nswillie
If you have the AMOS module, you would do better to do a confirmatory factor analysis. If you do not have AMOS, scales are typically factored only for the common variance. Most likely, you will do well doing the very conventional FA, i.e., with varimax rotation. If items load cleanly, then the constructs have divergent validity. You will most likely need your advisor's help in choosing the number of items to retain.

If you want to delve into FA a little deeper, look up "parallel analysis" in the archives of this list. YMMV, but in my experience I typically retain no more than the number of factors where the obtained eigenvalue is 1.0 more than the corresponding eigenvalue from the random data.

If you anticipate continuing in psychology or education, save your data, so that later, when you get further into item analysis, you can look at the scales in other ways, such as quality of distractors, item difficulty, Rasch modeling, etc.

Art Kendall
Social Research Consultants |
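For readers unfamiliar with the "parallel analysis" Art mentions, the idea can be sketched as follows: factor many random data sets of the same size as the real one, average their eigenvalues, and keep only the leading factors whose observed eigenvalues beat that random benchmark. The Python below is a rough illustration under stated assumptions, not Art's procedure or an SPSS feature: it uses normally distributed random data, only the standard library (a numerical library would be far faster for 52 items and 1000 cases), a hand-written Jacobi routine for the eigenvalues, and hypothetical function names. The `margin` argument is one possible reading of Art's "1.0 more than" rule; `margin=0.0` gives the plain comparison.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlations between the columns of a list-of-rows data set."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    norms = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows))
             for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def eigenvalues(sym, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in sym]
    p = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if abs(a[i][j]) < 1e-15:
                    continue
                # Angle that zeroes the (i, j) off-diagonal element.
                theta = 0.5 * math.atan2(2 * a[i][j], a[i][i] - a[j][j])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki + s * akj, -s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik + s * ajk, -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def random_eigenvalue_means(n_cases, n_items, n_sims=20, seed=0):
    """Mean eigenvalues of correlation matrices of random normal data."""
    rng = random.Random(seed)
    totals = [0.0] * n_items
    for _ in range(n_sims):
        data = [[rng.gauss(0, 1) for _ in range(n_items)]
                for _ in range(n_cases)]
        for k, value in enumerate(eigenvalues(correlation_matrix(data))):
            totals[k] += value
    return [t / n_sims for t in totals]


def factors_to_retain(observed, benchmark, margin=0.0):
    """Count leading factors whose eigenvalue beats the benchmark by at least `margin`."""
    count = 0
    for obs, rnd in zip(observed, benchmark):
        if obs - rnd < margin:
            break
        count += 1
    return count


# Small demo; sizes are kept tiny so that pure Python stays fast.
benchmark = random_eigenvalue_means(n_cases=200, n_items=6)
observed = [2.9, 1.2, 0.7, 0.5, 0.4, 0.3]  # hypothetical eigenvalues, not real results
print("random-data benchmark:", [round(v, 2) for v in benchmark])
print("retain (margin 0.0):", factors_to_retain(observed, benchmark))
print("retain (margin 1.0):", factors_to_retain(observed, benchmark, margin=1.0))
```

In practice the observed eigenvalues would come from the FACTOR output for the real items, and the random data sets would match the real sample in both number of cases and number of items.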
|
Hi all,
In SPSS 17, if I use the TREE command, I can save the resulting model in the form of an SQL/SPSS program, for example:

    TREE dependent BY independents ...
      /RULES NODES=TERMINAL SYNTAX=SQL TYPE=SCORING OUTFILE='C:\model.sql' ...

I need this in order to score fresh data later in a database on a main server. (I build models locally but deploy them in other software.) TREE works well; and with "normal" regression models, I am able to reconstruct the rules easily from the table of regression coefficients.

The problem is with other types of models, for example the Radial Basis Function network. The only thing I can do is:

    RBF dependent WITH independents ...
      /OUTFILE MODEL='C:\model.xml' ...

But how can I easily create SQL/SPSS rules from the *.xml file? Does anybody have a solution, that is, a program that creates rules from XML model files? Does SPSS Inc. plan to enhance this functionality in future releases?

Best,
Jan |
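For the linear regression case Jan describes (rebuilding scoring rules from the coefficient table), the step can be sketched in a few lines of Python. This is only an illustration with assumed names: `linear_scoring_sql`, the table name, and the column names are all hypothetical, and it does not address the RBF XML file, whose structure is not described in this thread.

```python
def linear_scoring_sql(table, intercept, coefficients, alias="predicted"):
    """Build a SQL SELECT that scores rows with a linear regression equation.

    `coefficients` maps column names to regression weights, as copied by
    hand from a coefficient table. Column and table names are inserted
    verbatim, so they must be trusted identifiers.
    """
    terms = [repr(float(intercept))]
    for column, weight in coefficients.items():
        terms.append(f"({float(weight)!r} * {column})")
    return f"SELECT {' + '.join(terms)} AS {alias} FROM {table};"


# Hypothetical model: predicted = 1.5 + 0.25 * age - 0.5 * income
statement = linear_scoring_sql("customers", 1.5, {"age": 0.25, "income": -0.5})
print(statement)
```

The generated statement can then be run on the database server, which is the same deployment pattern that the SQL rules exported by TREE support.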
