Hi,

I should like to post once more my question concerning Fisher's exact test for tables bigger than 2x2. I have learnt that in a 2x2 table, for Fisher's exact test, the p-value is calculated directly from the table and there is no test statistic; in SPSS, too, no test statistic is reported in the 2x2 case. But if I have an r x k table with r or k > 2, the Exact Tests option gives me a test statistic without degrees of freedom. My question is: what is its distribution, and/or what is it called?

best regards
Monika
Dear list,
I would like to test for collinearity between three ordinal variables. The variables have different numbers of values, but are coded in a similar way, i.e. category 1 is the lowest category for all three vars.

I calculated Spearman's rho correlations for these variables. The correlation coefficient never exceeds .53, well below the generally used rule of thumb that it should not exceed .85. (Btw, does anybody have a good reference for this rule?)

Can I now safely assume that my variables are not collinear when I use them simultaneously as independent predictors in a logistic regression analysis?

Thank you for your replies!

Albert-Jan
In reply to this post by Monika Heinzel-Gutenbrunner-2
How about the Fisher-Freeman-Halton test of independence for an unordered RxC table?
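For illustration, the Freeman-Halton extension uses the hypergeometric probability of the table itself as its criterion: the exact p-value is the total probability of all tables with the observed margins that are no more probable than the observed one. Below is a minimal Monte Carlo sketch in Python (the 3x3 table is invented; exact implementations, such as the network algorithm behind SPSS Exact Tests, are far more efficient).

import numpy as np
from scipy.special import gammaln

def log_table_prob(T):
    # Log of the (multivariate hypergeometric) probability of an r x c table
    # given its margins: (prod r_i!)(prod c_j!) / (N! prod n_ij!).
    T = np.asarray(T, dtype=float)
    r, c, N = T.sum(axis=1), T.sum(axis=0), T.sum()
    return (gammaln(r + 1).sum() + gammaln(c + 1).sum()
            - gammaln(N + 1) - gammaln(T + 1).sum())

def freeman_halton_mc(T, n_sim=20000, seed=0):
    # Monte Carlo estimate of the Freeman-Halton p-value: the probability,
    # among tables with the observed margins, of a table no more probable
    # than the observed one.
    T = np.asarray(T, dtype=int)
    rng = np.random.default_rng(seed)
    rows = np.repeat(np.arange(T.shape[0]), T.sum(axis=1))
    cols = np.repeat(np.arange(T.shape[1]), T.sum(axis=0))
    lp_obs = log_table_prob(T)
    hits = 0
    for _ in range(n_sim):
        sim = np.zeros_like(T)
        np.add.at(sim, (rows, rng.permutation(cols)), 1)
        if log_table_prob(sim) <= lp_obs + 1e-9:
            hits += 1
    return (hits + 1) / (n_sim + 1)

print(freeman_halton_mc([[5, 2, 1], [1, 4, 6], [2, 3, 7]]))  # invented 3x3 table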
In reply to this post by Albert-Jan Roskam
Albert-jan,
In my school days I spent a lot of time studying econometrics, and multicollinearity is a common topic. Referring back to one of my old textbooks (Basic Econometrics by Damodar Gujarati), some methods for detecting the presence of multicollinearity are:

1. Regression results with a high R^2 but few significant t-ratios.

2. High pairwise correlations among regressors (your Spearman correlation coefficients). The book puts the threshold at 0.8, but gives no source.

3. Auxiliary regressions -- regress each of your independent variables on the other independent variables and look at the resulting R^2 for each. According to Klein's rule of thumb, if the R^2 of any auxiliary regression is greater than the R^2 of the main regression, you should assume there is multicollinearity.

4. Compute the eigenvalues of the (scaled) cross-product matrix of the regressors. The condition number is k = max eigenvalue / min eigenvalue, and the condition index is CI = SQRT(k). By Gujarati's rule of thumb, k between 100 and 1,000 (CI roughly 10 to 30) indicates moderate to strong multicollinearity, and k above 1,000 (CI above about 30) indicates severe multicollinearity.

5. Tolerance and variance inflation factors (VIF). For predictor j, VIF = 1/(1 - R^2_j), where R^2_j is the R^2 from the auxiliary regression of X_j on the other predictors (with only two predictors this reduces to 1/(1 - r^2)). The VIF shows how much the variance of an estimator is inflated by multicollinearity: the stronger the relationship among the predictors, the greater the impact on the variance (and therefore on the standard errors of the coefficients). A common rule of thumb is that a VIF of 10 or more (an R^2_j of 0.9 or more) signals a problem. Tolerance is defined as 1/VIF (= 1 - R^2_j), so a value of 0 means perfect multicollinearity and 1 means no multicollinearity.

What can you do if there is multicollinearity? Well, it depends on your model and data. Given that you're using ordinal data, some of the recommendations won't apply. Is there some relationship between your variables that you can take advantage of? For example, if you know that X1 and X2 are related in some manner based on theory or previous empirical work, you can modify your model accordingly. You can also drop one of the offending variables, but at the risk of specification error. Also, multicollinearity is a feature of samples, so is it possible to get another sample from the same population? Additional or new data may help if it is possible to obtain it. If you want to delve further, you can also try factor analysis, principal components analysis, or ridge regression.

I hope some of this helps.

Bruno Berszoner
Tufts Health Plan, Quality and Health Informatics
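For illustration, a minimal Python sketch of points 3-5 above, with invented data and variable names: it computes each auxiliary R^2, the corresponding tolerance and VIF, and the condition number/index from the predictors' correlation matrix (one common convention).

import numpy as np

def collinearity_diagnostics(X, names):
    # For each predictor: auxiliary R^2 (regressing it on the others),
    # tolerance = 1 - R^2, and VIF = 1/tolerance.  Also returns the
    # condition number (max/min eigenvalue of the correlation matrix)
    # and the condition index (its square root).
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    per_var = {}
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r2 = 1.0 - (y - Z @ beta).var() / y.var()
        tol = 1.0 - r2
        per_var[names[j]] = {"aux_R2": r2, "tolerance": tol, "VIF": 1.0 / tol}
    eig = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    k = eig.max() / eig.min()
    return per_var, k, np.sqrt(k)

# Invented example: three ordinal-looking predictors, two of them related.
rng = np.random.default_rng(0)
base = rng.integers(1, 5, size=200)
X = np.column_stack([base + rng.integers(0, 3, 200),
                     base + rng.integers(0, 3, 200),
                     rng.integers(1, 6, 200)])
per_var, cond_number, cond_index = collinearity_diagnostics(X, ["x1", "x2", "x3"])
print(per_var)
print(cond_number, cond_index)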
In reply to this post by Albert-Jan Roskam
I am not aware of Spearman's coefficients being used to diagnose collinearity, but since they are correlation coefficients they could be. A correlation matrix in general has coefficients between -1 and 1; the closer to 1 in absolute value, the more correlated the variables are. There are other rules, such as the variance inflation factors and the variance proportions: the former should not exceed a value of 10, while the latter should not exceed .5, and no more than three variables should have such values (.5) along the same row. Since I am new to SPSS, I am not aware whether these measures of collinearity are available in it.

Fermin Ornelas, Ph.D.
In reply to this post by Albert-Jan Roskam
If you have SPSS Categories, you can use CATREG for regression, using ordinal scaling level for the predictors (and numerical level for a continuous dependent variable). CATREG gives the tolerance for the quantified ordinal variables. The quantified variables can be saved and used with logistic regression.
Anita van der Kooij
Data Theory Group
Leiden University
In reply to this post by Albert-Jan Roskam
Stephen Brand
www.statisticsdoc.com

Albert-jan,

A great deal of good advice has been given on this topic, particularly Anita's suggestion to utilize CATREG. Just to add a couple of small items to the pool, I would suggest the following:

(1) Perfect collinearity exists when one independent variable can be predicted by a linear combination of the other independent variables, so in addition to looking at the bivariate correlations between the predictors, examine the multiple regression of each predictor on the other predictors (e.g., to what extent can X1 be predicted by a weighted combination of X2 and X3).

(2) If you have a large sample, you might want to consider splitting it randomly into halves, and conducting the logistic regression analysis in both halves, or cross-validating the regression weights from one half in the other half. This approach will give some indication of how robust the parameter estimates are.

HTH,

Stephen Brand
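As a small sketch of suggestion (2), with invented data and scikit-learn's LogisticRegression standing in for SPSS LOGISTIC REGRESSION: fit the same model in two random halves and compare the coefficients; large swings between halves suggest unstable estimates.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Invented data: three correlated ordinal-looking predictors, binary outcome.
rng = np.random.default_rng(0)
n = 600
base = rng.integers(1, 5, size=n)
X = np.column_stack([base + rng.integers(0, 3, n),
                     base + rng.integers(0, 3, n),
                     rng.integers(1, 6, n)])
logit = -2 + 0.5 * X[:, 0] + 0.3 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Fit the same model in two random halves; large coefficient differences
# between halves suggest unstable (e.g. collinearity-inflated) estimates.
Xa, Xb, ya, yb = train_test_split(X, y, test_size=0.5, random_state=1)
for label, Xh, yh in [("half A", Xa, ya), ("half B", Xb, yb)]:
    m = LogisticRegression(C=1e6, max_iter=1000).fit(Xh, yh)  # ~unpenalized
    print(label, np.round(m.intercept_, 2), np.round(m.coef_[0], 2))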
In reply to this post by Bruno Berszoner
At 08:32 AM 2/1/2007, you wrote:
>Also, multicollinearity is a feature of samples, so is it possible to get
>another sample from the same population? Additional or new data may help
>if it is possible to obtain it.
>Bruno Berszoner

...this one is slightly new to me. I assume that drawing another sample would only prove useful in the event that you could oversample certain characteristics to reduce the degree of multicollinearity, and then weight the final analysis to provide the equivalent of a randomized sampling approach? E.g., if race and income were collinear, you would need to over-sample high-income minority groups to decrease the collinearity between these two variables.

...or is there another way that this might be expected to work?

Jeff
In reply to this post by statisticsdoc
I'd like to register an objection to the idea of "testing for collinearity". One can measure the degree of collinearity in various ways and can look at the effect - joint confidence intervals that show the degree of dependence of the estimates - but there can be no definitive rules about when there is too much short of perfect collinearity. And software will take care of that rule for you in ways varying between helpful and rude. Collinearity is a matter of degree, not a yes or no outcome.
As long as you don't have experimental data designed to be orthogonal, you are going to have collinearity to some degree, and the more there is, the more unstable the estimates will be; but any rule short of perfect collinearity is arbitrary.

One useful reality check, collinearity or not, is this. Consider the accuracy of your variables - say you believe the values are correct to three or four significant figures. Then add to the variables a random perturbation small enough that the perturbed values still round to the actual values at that degree of accuracy. Rerun your estimates and see how much you care about the differences in the results.

My two cents.

Jon Peck
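A rough Python sketch of that reality check, assuming (for the sake of the example) that the values are trusted to about four significant figures; the jitter is kept smaller than half a unit in that place, so the perturbed values still round back to the originals.

import numpy as np

def ols(X, y):
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

# Invented data with two nearly collinear predictors.
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.001, size=n)
y = 1 + x1 + x2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([x1, x2])

# Perturb each value by well under half a unit in its 4th significant figure
# and refit; then compare the two sets of estimates.
jitter = X * 1e-4 * rng.uniform(-0.5, 0.5, size=X.shape)
print(np.round(ols(X, y), 3))
print(np.round(ols(X + jitter, y), 3))  # how much do the estimates move?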
In reply to this post by Albert-Jan Roskam
There is another issue besides collinearity. If the final model is a logistic one whose purpose is prediction, then collinearity may not be a big issue as long as development and validation results appear reliable. In regular OLS regression, with severe collinearity, hypothesis testing is not valid.

Fermin Ornelas, Ph.D.
In reply to this post by Albert-Jan Roskam
Jon,
Good point. I think most of those who have posted on this topic would agree that collinearity is a matter of degree, not an either/or condition. Perhaps a better way to phrase the initial question in this thread is "How do I assess the magnitude of collinearity among my predictors?" This is a particularly interesting topic with respect to logistic regression.

Best,

Stephen Brand
www.statisticsdoc.com
In reply to this post by Peck, Jon
Depending on the software you are using, you might get a "rude" message saying that the matrix of x's is singular or perfectly collinear, that the matrix cannot be inverted, or equivalently that the determinant is zero - and no further information. The most common human-error causes of this are entering the same variable twice, entering the complete set of dummy variables that represent a categorical variable, entering subtotals along with grand totals, items along with total scores, etc.

A very quick and dirty way to locate which variables are involved in the problem is to pretend that all of the x variables are items in a scale and run RELIABILITY. This procedure shows you the SMC - squared multiple correlation - of each variable with the other variables. It also shows you the corrected item-total correlation, the correlation of each item with the sum of the other items. Items that have SMCs (R**2s) of 1.00 are perfectly redundant. The column of SMCs shows the fit of all possible regressions of each variable in the set on all the other variables in the set, and tells you the degree to which each variable is collinear (redundant) with the others.

Which variable(s) to drop from the set will depend on the substantive nature of your analysis.

Art Kendall
Social Research Consultants
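A small Python sketch mirroring those two RELIABILITY columns, with invented data; because the third variable is an exact sum of the other two, every SMC comes out as 1.00, and which one to drop is indeed a substantive choice.

import numpy as np

def item_diagnostics(X, names):
    # For each x: corrected item-total correlation (with the sum of the other
    # x's) and SMC (R^2 from regressing it on the other x's), as RELIABILITY
    # reports for scale items.
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    for j, name in enumerate(names):
        others = np.delete(X, j, axis=1)
        r_it = np.corrcoef(X[:, j], others.sum(axis=1))[0, 1]
        Z = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(Z, X[:, j], rcond=None)
        smc = 1 - (X[:, j] - Z @ beta).var() / X[:, j].var()
        flag = "  <-- redundant" if smc > 1 - 1e-10 else ""
        print(f"{name}: item-total r = {r_it:.2f}, SMC = {smc:.3f}{flag}")

rng = np.random.default_rng(2)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
x3 = x1 + x2  # deliberate structural redundancy
item_diagnostics(np.column_stack([x1, x2, x3]), ["x1", "x2", "x3"])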
In reply to this post by Ornelas, Fermin
All,
I am in need of some help. The basic problem is that I want to do a test of equivalence of means from a paired t-test; I think this falls in the area of bioequivalence. However, Google has not been as helpful as I had hoped. Can someone suggest a basic but useful article, book, or website for my education? I would also like to know how to set this up in SPSS - I understand that there is no procedure for this, but rather which numbers need to be computed and how they should be combined.

Thanks, Gene Maguin
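One standard approach is Schuirmann's two one-sided tests (TOST): choose an equivalence margin delta on substantive grounds, then show that the mean paired difference is significantly greater than -delta and significantly less than +delta (equivalently, that the 1 - 2*alpha confidence interval for the mean difference lies inside (-delta, +delta)). A minimal Python sketch with invented data follows; the same ingredients (mean difference, its standard error, df) are available from the paired-samples t-test output.

import numpy as np
from scipy import stats

def paired_tost(x, y, delta, alpha=0.05):
    # Two one-sided tests for equivalence of paired means within +/- delta.
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    n = len(d)
    m, se = d.mean(), d.std(ddof=1) / np.sqrt(n)
    p_lower = 1 - stats.t.cdf((m + delta) / se, n - 1)  # H0: mean diff <= -delta
    p_upper = stats.t.cdf((m - delta) / se, n - 1)      # H0: mean diff >= +delta
    p_tost = max(p_lower, p_upper)                      # reject both to claim equivalence
    half = stats.t.ppf(1 - alpha, n - 1) * se
    return p_tost, (m - half, m + half)                 # (1 - 2*alpha) CI for mean diff

rng = np.random.default_rng(1)
pre = rng.normal(50, 10, size=40)
post = pre + rng.normal(0.2, 3, size=40)   # invented, nearly equivalent conditions
p, ci = paired_tost(pre, post, delta=2.0)  # delta = 2 is an assumed margin
print(p, ci)  # equivalence at alpha=.05 if p < .05, i.e. the 90% CI lies within (-2, 2)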
In reply to this post by Albert-Jan Roskam
It seems to me that collinearity means dependence; that is, if the data are collinear, there is a dependence. In that sense we should talk about the degree to which the data approach collinearity, rather than the degree of collinearity. Data are rarely collinear unless someone makes a mistake with their data. But sometimes the data approach dependence, or collinearity, and because of the software's inability to manage with the finite number of digits available, this causes a problem. Or perhaps I'm wrong, and it is "collinear" that means dependency and "collinearity" that means the degree to which the data approach being collinear. Again, it is the semantics that get in our way.

Paul R. Swank, Ph.D.
Professor
Director of Research
Children's Learning Institute
University of Texas Health Science Center-Houston
In reply to this post by Albert-Jan Roskam
Collinearity, as other responses have mentioned, is broadly defined as a relationship of one variable with another variable or with a group of variables. Such a relationship is measured by a correlation coefficient whose values run from -1 to 1; the closer to either extreme, the stronger the relationship. As others have already pointed out, more often than not this problem will be present in empirical work. The question is when it becomes a real problem, and that is where the collinearity diagnostics help the researcher determine its severity. A correlation matrix of the predictor variables should give the researcher a good idea; more precisely, the condition index, the variance proportions, and the variance inflation factors will tell the researcher how serious the problem is.

Having said that, in my own experience, if my regression model has say 10 predictors and three of them are collinear (condition index < 30, variance proportions < .5, and VIF < 7), then I can live with this. Remember the alternatives are very limited: one can try to collect additional data, try ridge regression, or try different variables. But if modeling intuition, experience, and sign expectations hold, then one may decide to keep the model as it is. Of course, if development and validation results are not deteriorating either, then one should leave the model alone, especially if prediction is the main task for the model. As one of my professors used to say, collinearity is like bad and good cholesterol.

Fermin Ornelas, Ph.D.
In reply to this post by Swank, Paul R
At 12:06 PM 2/2/2007, Swank, Paul R wrote:
>Data are rarely collinear unless someone makes a mistake with their data.
>But sometimes the data approach dependence or collinearity and, because of
>the software's inability to manage with the finite number of digits
>available, this causes a problem. Or perhaps I'm wrong and it is collinear
>that means dependency.

I'm not with you on this one. Yes, any collinearity, however small, proves at least partial statistical dependence among the variables. (It's shown in elementary books on statistics that the converse is not true.)

*Perfect* collinearity - singularity of the data matrix - does almost always result from mistakes: including variables with a structural linear relationship. A common mistake, which Art Kendall pointed out, is including dummies for all levels of a categorical variable when there's also a constant in the model. If collinearity is so near that the finite precision of computation makes a difference, you can be pretty certain that there's a structural relationship. Modern precision of numbers, and of algorithms, can handle any degree of collinearity likely to occur when there is no structural relationship.

But a degree of collinearity - pairwise, strong correlation between variables; overall, high values of the matrix condition index or similar measures - is often found in real data without making gross mistakes. Common hypotheses behind this are that some variables have a partial causal effect on others, or that unobserved variables have partial causal effects on several of the observed variables.

As others have said, any degree of collinearity reduces the precision with which parameters like regression coefficients can be estimated. That isn't a problem of limited computational precision; it's as real a measure of uncertainty as any other standard error of estimate. How much collinearity it takes to create a problem varies, mainly with how precise the data are otherwise. Correlations above 0.8 have been mentioned in this discussion; I was recently on a study where a correlation of 0.69 essentially prevented estimating the regression coefficient for either variable. (This was a psychological study using questionnaire instruments, with a sample size of about 50. The correlation was between two variables with little *a priori* relationship. The connection is well up among the 'questions for further research.')

If you really want to include a set of variables with high collinearity - and there may well be reasons - transforming the data matrix may help. There are sophisticated techniques like factor analysis, of course. But simpler ones, like replacing two correlated variables by their mean and their difference (when they're similarly scaled), are often illuminating.

-Cheers, and onward,
Richard
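A tiny numeric illustration of that last suggestion, with invented data: the mean and difference of two correlated, similarly scaled variables are nearly uncorrelated, and the regression is the same fit in a different parameterization.

import numpy as np

def ols(X, y):
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

# Invented, similarly scaled and fairly correlated predictors.
rng = np.random.default_rng(3)
n = 50
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
y = 1 + x1 + x2 + rng.normal(size=n)

m, d = (x1 + x2) / 2, x1 - x2
print(np.corrcoef(x1, x2)[0, 1], np.corrcoef(m, d)[0, 1])  # the second is much nearer 0
print(np.round(ols(np.column_stack([x1, x2]), y), 3))
# Same fitted values, reparameterized:
# coefficient on m equals b1 + b2, coefficient on d equals (b1 - b2)/2.
print(np.round(ols(np.column_stack([m, d]), y), 3))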
In reply to this post by Art Kendall
Art-
I would not call this quick and dirty - more like quick and very neat!

Thanks,

Steve
www.statisticsdoc.com
In reply to this post by Art Kendall
Hi dear list,
Thanks a lot for your replies. They have been of great help and I learnt a lot from them! By the way, I was able to find a reference on this topic:

Farrar, D.E., & Glauber, R.R. (1967). Multicollinearity in regression analysis: The problem revisited. The Review of Economics and Statistics, 49, 92-107.

Perhaps this is the original source of the classic r = .85 rule of thumb.

Thanks again!

Albert-Jan
In reply to this post by statisticsdoc
It has worked for me since SPSS included RELIABILITY in the mid-70s.
Art
Bear in mind that the Tolerance statistic IS just 1 - R sq of each regressor on all the others. And Partial Correlations will also be helpful in going beyond that summary structure.
-Jon Peck