Factor Analysis--Senior Thesis Help

nswillie
I am an undergraduate working on my senior research project. I am studying the construct validity of a critical thinking test using factor analysis. The test is a 52-item multiple-choice test (a, b, or c) with 6 subtests. According to the developer, the test measures 6 aspects of critical thinking. As the developer has a background in philosophy, it seems relevant to validate the test using quantitative methodology and determine whether the test actually measures 6 distinct constructs.

Here’s where I’m at. I have a large sample size (N=1000) and have coded all of the individual responses of each case (a=1, b=2, c=3). I originally thought that examining the variances in the actual responses might reveal a factor structure, but now realize that since the test is not uniform (“a” does not mean the same thing throughout the test) that would be inappropriate. Thus, I have transformed the responses to correct or incorrect (0=incorrect, 1=correct). It seems logical that if individuals were deficient in a particular area of the hypothesized critical thinking construct, then their incorrect response variances would emerge in a factor structure (and vice versa). Now, however, I am having reservations about the appropriateness of conducting a factor analysis on binary data. I have searched many forums and sought much advice, and I seem to get a lot of contradictory information. My advisor thinks that factor analysis is still the way to go, but I am not so sure. Any advice on this matter would be greatly appreciated.
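
The recoding step described above (raw option codes to correct/incorrect) can be sketched as follows. This is only a minimal illustration: the answer key, the two cases, and the names `ANSWER_KEY` and `score_responses` are made up for the example and do not come from the actual test.

```python
# Hypothetical answer key: the correct option (a=1, b=2, c=3) for each item.
ANSWER_KEY = [1, 3, 2, 2]

def score_responses(responses, key):
    """Recode raw option codes into 0 = incorrect, 1 = correct, item by item."""
    if len(responses) != len(key):
        raise ValueError("response vector and key must have the same length")
    return [1 if r == k else 0 for r, k in zip(responses, key)]

# Two made-up cases, coded a=1, b=2, c=3 as in the post.
raw = [[1, 3, 1, 2],
       [2, 3, 2, 2]]
scored = [score_responses(case, ANSWER_KEY) for case in raw]
print(scored)
```

Scoring against a per-item key is what removes the problem that "a" does not mean the same thing on every item: after recoding, 1 always means "correct".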

Re: Factor Analysis--Senior Thesis Help

Hector Maletta
You may want to see a previous exchange in this same forum, "RE: Factor
Analysis and dichotomous data", on October 31.
I transcribe an answer I gave in that exchange, where some colleague asked
whether factor analysis can be applied to dichotomous items, and whether
categorical factor analysis was better than classical factor analysis:

Hector Maletta wrote on October 30, 2008:
"First of all, for dichotomous data CATPCA and classical FA give the same
results. This is because CATPCA works by assigning optimum numerical values
to each category of categorical variables, but for a dichotomy any pair of
numerical values is equivalent to any other pair, because the variable has
only two possible values and thus only one interval will ever be observed."
[I add now to further clarify this point that Factor Analysis works on
standardized variables, i.e. on Z-scores, and thus the actual absolute
values are irrelevant. November 19, 2008].
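
The point about Z-scores can be checked with a small sketch. Assuming a made-up six-case item (`item_01`) and two arbitrary recodings of it (all names and values here are illustrative), standardizing each version gives the same Z-scores, so a correlation-based analysis receives identical input whichever pair of codes is used:

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Made-up responses to one dichotomous item (0 = incorrect, 1 = correct).
item_01 = [0, 1, 1, 0, 1, 1]
# The same item under two other codings; the higher code still means "correct".
item_12 = [v + 1 for v in item_01]                # coded 1/2
item_ab = [5 if v == 1 else -3 for v in item_01]  # arbitrary pair -3/5

z_01, z_12, z_ab = z_scores(item_01), z_scores(item_12), z_scores(item_ab)
same = all(abs(a - b) < 1e-12 and abs(a - c) < 1e-12
           for a, b, c in zip(z_01, z_12, z_ab))
print(same)
```

If the codes were assigned in the opposite order (the lower number for the correct answer), the Z-scores would only change sign, which flips the sign of that item's correlations and loadings but not their size.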

"Second, you can compute linear correlation coefficients between dichotomous
variables, which can rigorously be treated as (discrete) interval-scale
variables. The phi association coefficient is equivalent to the linear
correlation coefficient when both variables are dichotomous. A matrix of
linear correlation coefficients is enough to compute a factor analysis
solution."
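
As a hedged illustration of the second point, the sketch below computes phi from the 2x2 cell counts and the ordinary Pearson correlation from the raw 0/1 scores, and compares the two. The data and the function names `pearson_r` and `phi` are invented for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Ordinary product-moment correlation between two numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def phi(x, y):
    """Phi coefficient computed from the counts of a 2x2 table of 0/1 data."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom

# Made-up scores on two items for eight cases (0 = incorrect, 1 = correct).
item_a = [1, 1, 1, 0, 0, 1, 0, 1]
item_b = [1, 0, 1, 0, 0, 1, 1, 1]

print(pearson_r(item_a, item_b), phi(item_a, item_b))
```

Both functions return the same value up to floating-point rounding, which is why a matrix of Pearson correlations among 0/1 items is also a matrix of phi coefficients.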
"Third, linear REGRESSION is not entirely appropriate for dichotomous data,
since predicted values can be fractional, and fall either within or without
the interval 0,1, while observed values can only be 0 or 1 [OR WHICHEVER
OTHER ABSOLUTE VALUES ARE USED. NOTE ADDED NOV 19, 2008]. Moreover,
residuals (actual values minus predicted values) would not usually have a
normal distribution around predicted values: if the predicted value is a
fraction between 0 and 1, such as 0.40, all observed values will be at the
extremes or "tails" of the residual distribution, at values 0 and 1, and no
observed value will be in the vicinity of the predicted value, i.e. the
residual distribution will be almost the exact opposite of a normal
distribution; since a normal distribution of residuals is an assumption of
linear regression, certain inferences of linear regression may not hold
when the dependent variable is dichotomous. By the way, this affects the
very common habit of using dummies as dependent variables, e.g. in
econometrics. But
few people give this problem a second thought. Dummies are used everywhere
in regression, objectionable as this might be to purists. If you ever used a
split infinitive, you are daring enough to use dummies in regression."
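
The residual argument can be made concrete with a small sketch. It fits an ordinary least-squares line to a made-up 0/1 outcome `y` and a numeric predictor `x` (both invented for illustration) and lists the fitted values and residuals:

```python
# Made-up data: a numeric predictor x and a dichotomous outcome y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
# Ordinary least-squares slope (b) and intercept (a) for y = a + b*x.
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

fitted = [a + b * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for xi, yi, fi, ri in zip(x, y, fitted, residuals):
    print(f"x={xi} y={yi} fitted={fi:+.3f} residual={ri:+.3f}")
```

With these data the fitted values at the two ends of the range fall outside the interval from 0 to 1, and for any fitted value f the residual can only be -f or 1 - f, so the residuals cannot be spread normally around the prediction.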

Hope this helps.
Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
nswillie
Sent: 20 November 2008 15:53
To: [hidden email]
Subject: Factor Analysis--Senior Thesis Help

--
View this message in context:
http://www.nabble.com/Factor-Analysis--Senior-Thesis-Help-tp20606710p20606710.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Re: Factor Analysis--Senior Thesis Help

Art Kendall
In reply to this post by nswillie
If you have the AMOS module, you would do better to do a confirmatory
factor analysis.  If you do not have AMOS, scales are typically factored
only for the common variance.  Most likely, you will do well doing the
very conventional FA, i.e., with varimax rotation.  If items load
cleanly, then the constructs have divergent validity.
You will most likely need your advisor's help in choosing the number of
items to retain.

If you want to delve into FA a little deeper look up "parallel analysis"
in the archives of this list.  YMMV, but my experience is that typically
I retain no more than the number of factors where the obtained
eigenvalue is 1.0 more than the eigenvalues from the random data.
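
The retention rule can be sketched in a few lines, assuming the eigenvalues of the observed correlation matrix and the mean eigenvalues from random data are already available from a parallel-analysis run. The function name `count_retained`, the eigenvalues, and the reading of "1.0 more than" as a required margin of 1.0 are all assumptions made for illustration:

```python
def count_retained(observed, random_mean, margin=0.0):
    """Count leading factors whose observed eigenvalue exceeds the
    corresponding random-data eigenvalue by more than `margin`.
    Counting stops at the first factor that fails the comparison."""
    retained = 0
    for obs, rnd in zip(observed, random_mean):
        if obs - rnd > margin:
            retained += 1
        else:
            break
    return retained

# Hypothetical eigenvalues, sorted from largest to smallest.
observed_eigs = [6.8, 3.1, 2.5, 1.6, 1.3, 1.1]
random_eigs = [1.50, 1.45, 1.40, 1.36, 1.33, 1.30]

print(count_retained(observed_eigs, random_eigs))              # usual comparison
print(count_retained(observed_eigs, random_eigs, margin=1.0))  # stricter margin
```

With `margin=0.0` this is the usual parallel-analysis comparison; a margin of 1.0 gives the more conservative upper bound described above.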

If you anticipate continuing in psychology or education, save your data, so
that later when you get further into item analysis you can get into
other ways to look at the scales, such as quality of distractors,
item difficulty, Rasch modeling, etc.

Art Kendall
Social Research Consultants


How to save Neural Network models as SQL / SPSS syntax

Spousta Jan
Hi all,

In SPSS 17, if I use a TREE command, I can save the resulting model in the form of an SQL/SPSS program, for example:

TREE dependent BY independents ...
 /RULES NODES=TERMINAL SYNTAX=SQL TYPE=SCORING OUTFILE='C:\model.sql' ...

I need it in order to be able to score fresh data later in a database on a main server. (I build models locally but deploy them in other software.)

TREE works well, and with "normal" regression models I can easily reconstruct the rules from the table of regression coefficients. The problem is with other types of models, for example the Radial Basis Function network: the only thing I can do is

RBF dependent  WITH independents ...
 /OUTFILE MODEL='C:\model.xml'...

But how can I easily create SQL/SPSS rules from the *.xml file? Does anybody have a solution, i.e. a program that creates rules from XML model files? Does SPSS Inc. plan to enhance this functionality in future releases?

Best,

Jan
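
For the regression case that the post describes as easy, rebuilding a scoring rule from a coefficient table might look like the sketch below. It does not address the RBF question. The function name `linear_model_to_sql`, the coefficients, and the table and column names are all hypothetical.

```python
def linear_model_to_sql(intercept, coefs, table, target="score"):
    """Build a SQL scoring statement for a linear model.

    `coefs` maps column names to regression coefficients. Column and table
    names are inserted as-is, so they are assumed to be trusted identifiers.
    """
    terms = [repr(float(intercept))]
    for column, beta in coefs.items():
        terms.append(f"({float(beta)!r} * {column})")
    return f"SELECT {' + '.join(terms)} AS {target} FROM {table};"

# Hypothetical coefficient table for a two-predictor model.
sql = linear_model_to_sql(1.5, {"age": 0.25, "income": -0.001}, "customers")
print(sql)
```

The same pattern does not carry over directly to an RBF network, whose scoring function involves distances to unit centers and nonlinear activations rather than a single weighted sum.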




