I'd appreciate a clarification of the linearity assumption for discriminant analysis. I am told that the model requires a linear relationship among the predictor variables within each group. At the same time, it would seem that we would prefer to reduce redundancy among the predictors -- i.e., have them be unrelated. I can't seem to reconcile these two demands. Could someone help?
In reply to this post by TomSnider
I believe this comes from a misunderstanding of what these two terms refer to, and of what a DFA actually is. DFA can be thought of as an upside-down ANOVA; like ANOVA, it is an extension of the GLM. In an ANOVA, you take a set of continuous variables and compare sample means between groups on those variables. The variables need not be linearly related to each other; they simply need to be continuous variables that satisfy the usual assumptions: approximately normally distributed within groups, equal error variance, and so on. You can't use standard DFA to discriminate among groups with categorical predictors, even ordinal ones, because that would break these assumptions, just as it does for ANOVA.
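As a minimal sketch of this "upside-down ANOVA" idea (Python with simulated data standing in for an SPSS DISCRIMINANT run; the variables, group structure, and numbers are made up purely for illustration), the same continuous variables can be looked at in the ANOVA direction, comparing group means, or in the DFA direction, combining the variables to predict group membership:

# Illustrative only: simulated data, Python stand-in for an SPSS DISCRIMINANT run.
import numpy as np
from scipy.stats import f_oneway
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
g = np.repeat([0, 1, 2], 50)                # three groups, 50 cases each
x1 = rng.normal(loc=0.8 * g, scale=1.0)     # continuous predictor; group means differ
x2 = rng.normal(loc=0.3 * g, scale=1.0)     # second continuous predictor
X = np.column_stack([x1, x2])

# "ANOVA direction": compare the group means on x1.
print(f_oneway(x1[g == 0], x1[g == 1], x1[g == 2]))

# "DFA direction": find the linear combinations of x1 and x2 that best separate the groups.
lda = LinearDiscriminantAnalysis().fit(X, g)
print(lda.scalings_)      # coefficients of the discriminant functions (one column per function)
print(lda.score(X, g))    # in-sample proportion classified correctly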
Predictors can also be linearly related to each other within a DFA; that just means they share common variance, which may or may not be associated with their ability to help discriminate the groups. To be honest, this is unimportant other than that, as you say, you want variables that are not too highly correlated with one another but that have some meaningful reason to be in the DFA. The end interpretation, as you probably know, is that you take a set of variables you believe can discriminate among the groups and then quantify that ability in terms of their relative discriminant effect.

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
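To illustrate the redundancy point in the reply above, here is a small hypothetical check in Python (simulated predictors; in SPSS you would simply run CORRELATIONS on the candidate predictors): inspect the predictor correlation matrix before the DFA and reconsider any pair that is nearly collinear.

# Illustrative only: simulated predictors where x2 is nearly redundant with x1.
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + 0.3 * rng.normal(size=200)   # shares most of its variance with x1
x3 = rng.normal(size=200)                     # unrelated to x1 and x2
X = np.column_stack([x1, x2, x3])

# Pairwise correlations among the candidate predictors (rows/columns: x1, x2, x3).
print(np.corrcoef(X, rowvar=False).round(2))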
In reply to this post by TomSnider
What we are creating with a discriminant function (or with regression, or logistic regression) is a linear combination of variables that forms a predictor equation -- or several predictor equations, if there are several groups as criteria. For the result to meet the assumptions for valid testing at the end, the residuals of prediction must be homogeneous.

I don't know where you are reading about a "linearity assumption" for DF, but I would usually think of it as an assumption of homogeneity of residuals. If each variable has similar variance across groups, then its scaling looks "right" and linear for predicting membership in those groups.

You are correct in saying that we prefer to reduce redundancy among predictors. I would add that we should much prefer *linear* relations between the predictors to accidental, *nonlinear* ones. When we intentionally put in nonlinear terms (X, X-squared, X-cubed), we know (or should know) that we have to be careful in interpretation, because they are all the same variable. (The problems are smaller when those contrasts are designed to be orthogonal or nearly so, for example by subtracting the mean before squaring.) But when two predictors, W and X, happen to have a nonlinear relationship, we are entering blindly into the same highly correlated, highly confounded, artifacts-expected-from-poor-scaling situation as using X and X-squared as predictors. Even if the residuals end up homogeneous, we have a worse case for interpretation than if we had knowingly started with X and X-squared.

So, though I do see some sense in referring to a linearity assumption for predictors in DF, I would usually choose to talk about it differently, or with a good deal more detail attached.

-- Rich Ulrich
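A small numeric sketch of the centering remark above (Python, made-up data): for an all-positive X, the raw X and X-squared terms are almost perfectly correlated, while after subtracting the mean the linear and squared terms are nearly orthogonal.

# Illustrative only: centering X before squaring reduces the X / X**2 confounding.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(10, 20, size=500)      # all-positive predictor
xc = x - x.mean()                      # centered copy

print(np.corrcoef(x,  x**2)[0, 1])     # close to 1: raw X and X**2 are nearly collinear
print(np.corrcoef(xc, xc**2)[0, 1])    # near 0: centered X and its square are nearly orthogonal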