SPSSX Discussion

Question about the linearity assumption for Discriminant Analysis

TomSnider

Jan 04, 2012; 11:57pm

Question about the linearity assumption for Discriminant Analysis

I'd appreciate a clarification of the linearity assumption for discriminant analysis. I am told that the model requires a linear relationship among the predictor variables within each group. At the same time, it would seem that we would prefer to reduce redundancy among the predictors -- i.e., have them be unrelated. I can't seem to reconcile these two demands. Could someone help?

Tricia Cross

Jan 04, 2012; 11:59pm

Automatic reply: Question about the linearity assumption for Discriminant Analysis

I will be out of the office January 5th. I will be returning the morning of Jan 6th. I will not have access to email or voicemail. I will respond to your message when I return. If you need immediate assistance, please contact Ross Wohlert at 860-676-3677.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Poes, Matthew Joseph

Jan 05, 2012; 4:57pm

Re: Question about the linearity assumption for Discriminant Analysis

In reply to this post by TomSnider

I believe this comes from a misunderstanding of what these two terms refer to, and of what a DFA actually is. DFA can be thought of as an upside-down ANOVA; it is an extension of the GLM. In an ANOVA, you take a set of linear variables and compare sample means between groups across those variables. The variables need not be linearly related to each other; they simply need to be linear continuous variables. This means normally distributed, with equal error variance, etc. You can't use standard DFA to discriminate groups via categorical variables, even ordinal ones, because that would break this assumption, just as it does for ANOVA.

Variables can also be linearly related within a DFA; this just means they share common variance, which may or may not be associated with their ability to help discriminate groups. To be honest, this is unimportant, except that, as you say, you want variables that are not too highly correlated but that have some meaningful reason to be in the DFA.
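
As a rough illustration of checking how strongly predictors are related within groups (as opposed to across the whole sample), here is a minimal Python/NumPy sketch. The helper name `pooled_within_corr` and the simulated data are assumptions made for this example only; nothing here comes from SPSS output.

```python
import numpy as np

def pooled_within_corr(X, groups):
    """Correlation matrix of the predictors after removing each group's means."""
    X = np.asarray(X, dtype=float)
    centered = X.copy()
    for g in np.unique(groups):
        mask = groups == g
        # Subtract the group's own mean so only within-group variation remains.
        centered[mask] -= X[mask].mean(axis=0)
    return np.corrcoef(centered, rowvar=False)

# Simulated data (hypothetical): two predictors that are independent within
# each group, but whose group means both shift by the same amount.
rng = np.random.default_rng(3)
groups = np.repeat([0, 1], 300)
X = rng.normal(size=(600, 2)) + groups[:, None] * 4.0

r_total = np.corrcoef(X, rowvar=False)[0, 1]      # ignores group membership
r_within = pooled_within_corr(X, groups)[0, 1]    # within-group relation only
print(f"total r = {r_total:.2f}, pooled within-group r = {r_within:.2f}")
```

In this simulated case the total-sample correlation is large only because the group means differ, while the within-group correlation is close to zero, so the two can tell quite different stories about "redundancy".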

The end interpretation, as you probably know, is that you take a set of variables that you believe can discriminate among these groups, and you are then able to quantify that ability in terms of each variable's relative discriminant effect.

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois



Rich Ulrich

Jan 05, 2012; 6:20pm

Re: Question about the linearity assumption for Discriminant Analysis

In reply to this post by TomSnider

What we are creating with discriminant function (or regression, or logistic
regression) is a linear combination of variables to make a predictor equation.
Or several predictor equations, if there are several groups as criteria.
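
To make "a linear combination of variables" concrete, here is a minimal Python/NumPy sketch of a two-group linear discriminant. It is an illustration under stated assumptions, not what SPSS DISCRIMINANT computes internally: the function name `fisher_discriminant` is hypothetical, the data are simulated, both groups are assumed to share one covariance matrix, and the cutoff assumes equal prior probabilities.

```python
import numpy as np

def fisher_discriminant(X0, X1):
    """Return weights w and cutoff c for a two-group linear discriminant."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled within-group covariance (assumes a common covariance matrix).
    S = ((n0 - 1) * np.cov(X0, rowvar=False)
         + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(S, m1 - m0)   # weights of the linear combination
    c = w @ (m0 + m1) / 2             # midpoint cutoff (assumes equal priors)
    return w, c

# Simulated data (hypothetical): two correlated predictors, two groups.
rng = np.random.default_rng(0)
cov = [[1.0, 0.5], [0.5, 1.0]]
X0 = rng.multivariate_normal([0.0, 0.0], cov, size=200)
X1 = rng.multivariate_normal([2.0, 1.0], cov, size=200)

w, c = fisher_discriminant(X0, X1)
scores0, scores1 = X0 @ w, X1 @ w     # each case's discriminant score
accuracy = (np.sum(scores0 < c) + np.sum(scores1 >= c)) / (len(X0) + len(X1))
print("weights:", w, "cutoff:", c, "accuracy:", accuracy)
```

Each case gets a single score, `X @ w`, and group membership is predicted from which side of the cutoff that score falls on.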

For the result to meet the assumptions for valid testing at the end, the
residuals of prediction must be homogeneous. I don't know where you
are reading about a "linearity assumption" for DF, but I would usually
think about it more as an assumption of homogeneity of residuals.
If each variable has similar variance across groups, then their scaling
looks "right" and linear for predicting membership in those groups.
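
One crude way to look at "similar variance across groups" is to compare each predictor's variance group by group. The Python/NumPy sketch below is only an illustration: the helper `variance_ratio` and the simulated data are assumptions for this example, and a largest-to-smallest ratio is a descriptive check, not a formal test.

```python
import numpy as np

def variance_ratio(X, groups):
    """For each predictor, ratio of the largest to the smallest group variance."""
    labels = np.unique(groups)
    variances = np.array([X[groups == g].var(axis=0, ddof=1) for g in labels])
    return variances.max(axis=0) / variances.min(axis=0)

# Simulated data (hypothetical): predictor 0 has the same spread in both
# groups; predictor 1 has three times the standard deviation in group 1.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1], 500)
x_equal = rng.normal(loc=groups * 2.0, scale=1.0)
x_unequal = rng.normal(loc=groups * 2.0, scale=np.where(groups == 0, 1.0, 3.0))
X = np.column_stack([x_equal, x_unequal])

ratios = variance_ratio(X, groups)
print("largest/smallest group variance per predictor:", ratios)
```

A ratio near 1 suggests the scaling of that predictor is comparable across groups; a ratio far from 1 is a hint that a transformation or a different model may be worth considering.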

You are correct in saying that we prefer to reduce redundancy among
predictors. I can say that we should much prefer *linear* relations
between the predictors in contrast to accidental, *nonlinear* relations.

When we intentionally put in non-linear relations (X, X-squared, X-cubed),
we (should) know that we have to be careful in our interpretation, because
it is all the same variable. (Problems are smaller when those contrasts
are designed to be orthogonal or nearly so, by subtracting the mean (say)
before squaring.) But when two predictors, W and X, happen to have a
non-linear relationship, we are entering too *blindly* into the
X, X-squared as predictors,
highly correlated, highly confounded,
artifacts-expected-from-poor-scaling
sort of paradigm. Even if the residuals end up homogeneous, we have a
worse case for interpretation than if we started with X and X-squared.
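
The point about subtracting the mean before squaring can be seen in a small simulation. This is a Python/NumPy sketch with made-up data (the distribution, its mean of 10, and the sample size are assumptions for illustration); it compares the correlation of X with X-squared before and after centering.

```python
import numpy as np

# Simulated predictor (hypothetical): roughly symmetric, far from zero.
rng = np.random.default_rng(2)
x = rng.normal(loc=10.0, scale=2.0, size=1000)

# Raw X and X-squared are almost the same variable.
r_raw = np.corrcoef(x, x**2)[0, 1]

# Centering first makes the linear and quadratic terms nearly orthogonal
# (for a roughly symmetric distribution).
xc = x - x.mean()
r_centered = np.corrcoef(xc, xc**2)[0, 1]

print(f"corr(X, X^2) raw = {r_raw:.3f}, centered = {r_centered:.3f}")
```

With a skewed predictor the centered terms will not be as close to uncorrelated, so this is a way to reduce the confounding, not a guarantee of orthogonality.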

So - though I do see some sense in referring to a linearity assumption
for predictors in DF, I would usually choose to talk about it differently,
or with some larger amount of detail attached.

--
Rich Ulrich


[...]