Multicolinearity.


Multicolinearity.

Almost Done
Hey, guys! I'm doing research on creative advertising and have to check, for example, whether divergence (rated on a seven-point Likert scale), relevance (rated the same way), and the divergence*relevance interaction have an effect on attention, which the respondents also rated on a seven-point Likert scale. So when I run a regression, this is what I get:
 
                          B         t        sig.
Constant                0.529     0.649     0.518
Divergence              0.666     4.215     0.000
Relevance               0.573     2.275     0.024
Divergence*Relevance   -0.091    -2.012     0.046

This seemed weird to me because divergence*relevance has a negative influence on the dependent variable attention. How can it be?
 So I removed the Divergence*Relevance interaction, and this is what I got:

                          B         t        sig.
Constant                1.892     4.113     0.000
Divergence              0.398     4.622     0.000
Relevance               0.090     1.167     0.245

So the B and significance changed drastically. I've tested for multicollinearity using the VIF. The combination where Divergence was the dependent variable (and Relevance and Divergence*Relevance were the independents) was the one where the VIF was greater than 5. All the other combinations were fine (the VIF was either 1 or slightly greater than 1).
 
So my question is: what does that mean, and how do I proceed? What do I have to do, and how should I explain it? For me it's not important to build a model; I just have to see whether divergence, relevance, and the interaction between the two have an effect on attention and other dependent variables. Could I just remove the divergence*relevance interaction from the model and say that there was multicollinearity?

Also, theoretically only divergence should have an impact on attention, not relevance.
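
For reference, a minimal sketch of the two models described above, written in Python/statsmodels rather than SPSS syntax (the file and column names, attention, divergence, and relevance, are placeholders, not the actual variable names in the data):

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # hypothetical data file with the Likert ratings

# Model 1: main effects plus the divergence*relevance product term.
m_full = smf.ols("attention ~ divergence + relevance + divergence:relevance", data=df).fit()

# Model 2: main effects only.
m_main = smf.ols("attention ~ divergence + relevance", data=df).fit()

print(m_full.summary())
print(m_main.summary())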

Re: Multicolinearity.

Poes, Matthew Joseph
Certain authors like Baron and Kenny would argue this resulted because you did your interaction (moderation) analysis incorrectly.  They would correctly argue that the same information contained in the interaction term is also contained in the two IVs themselves, and as such they are correlated (first explicated in publication by Cohen, as I understand it).  This correlation will cause a multicollinearity problem, and the model coefficients would be inaccurate.  They would go on to say that by mean centering the IVs the correlation is reduced to the product term of the IVs (the interaction term), and as such you have reduced multicollinearity.  More recent research has shown this not to be true, so with normal OLS regression you're unfortunately stuck in a situation where you can't do what you're trying to do and get valid results (which isn't to say that hundreds if not thousands of people don't still do this).

One solution is not to rely on OLS regression methods, and instead turn to a varying parameter model.

Another point to consider is that most people misinterpret the effects of the IVs, in the presence of the moderator, as main effects, which has been shown to be incorrect.  As Hayes and Matthes discuss, these are actually conditional effects.  Your model with no interaction term has your main effects, but your model with the interaction term has the conditional effects of each IV, and the interaction is the difference in the conditional effect for a one-point change in the moderator.  The first thing to consider is that you don't have two focal predictors and an interaction between the two; that framing doesn't fit with theory, and it complicates interpretation.  You have one focal predictor and a second, moderator variable.  In your example, you need to choose (based on theory), so let's say it's Divergence, and then treat relevance as the proposed moderator variable.  To interpret the interaction model correctly, .666 is the change in Y for a one-point increase in Divergence when relevance is at 0.  If you think about this, since you are showing that there is an interaction, the value of M in this case (relevance) is meaningful, and having it at 0 doesn't make the interpretation of divergence all that useful on its own (unless you interpret it in light of the interaction effect, which is best done with plotting).  One advantage of centering the IVs is that the coefficient value for divergence is now its value when relevance is at the sample mean level.  For an average amount of relevance, divergence changes Y by ### amount.
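
To make that last point concrete, here is a rough sketch of centering before forming the product term (Python/statsmodels rather than SPSS; the file and column names are assumptions):

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # hypothetical data file

# Center the focal predictor and the proposed moderator.
df["div_c"] = df["divergence"] - df["divergence"].mean()
df["rel_c"] = df["relevance"] - df["relevance"].mean()

# "div_c * rel_c" expands to div_c + rel_c + div_c:rel_c.
# The coefficient on div_c is then the conditional effect of divergence
# when relevance is at its sample mean (rel_c = 0), not a main effect.
m = smf.ols("attention ~ div_c * rel_c", data=df).fit()
print(m.params)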

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Almost Done
Sent: Tuesday, August 07, 2012 2:59 PM
To: [hidden email]
Subject: Multicolinearity.

[snip, original]

Re: Multicolinearity.

statisticsdoc
Matthew,

Good point about mean centering. Stats students are exposed to so much
"received wisdom" about the supposed benefits of mean centering that it
might be helpful to mention a couple of papers that will set the inquiring
student in the right direction:

Echambadi & Hess (2007).  Mean-Centering Does Not Alleviate Collinearity Problems in Moderated Multiple Regression Models.  Marketing Science, 26(3), 438-445.

Shieh (2011). Clarifying the role of mean centring in multicollinearity of
interaction effects.  British Journal of Mathematical and Statistical
Psychology, 64(3), 462-477.

Dalal & Zickar (2011).  Some Common Myths About Centering Predictor
Variables in Moderated Multiple Regression and Polynomial Regression.
Organizational Research Methods, 15(3), 339-362.

These papers note that while mean centering may be desirable for other
reasons (interpretability of coefficients), this procedure does not reduce
multicollinearity or otherwise improve the accuracy or sensitivity of the
parameter estimates.

Best,

Stephen Brand

www.StatisticsDoc.com


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Poes, Matthew Joseph
Sent: Tuesday, August 07, 2012 5:08 PM
To: [hidden email]
Subject: Re: Multicolinearity.

[snip, previous]

Re: Multicolinearity.

Rich Ulrich
In reply to this post by Poes, Matthew Joseph
See inserted comments ...

> Date: Tue, 7 Aug 2012 21:08:04 +0000
> From: [hidden email]
> Subject: Re: Multicolinearity.
> To: [hidden email]
>
> Certain authors like Baron and Kenny would argue this resulted because you did your interaction (moderation) analysis incorrectly.

I would say that the word "incorrectly" leaves the impression
that there is a single, simple "correct" answer.  But there is no
doubt that multiplying two positive-valued variables induces an
artefactual correlation that is unwanted because it (a) leaves you
with coefficients that have no clear meaning, including (b) possible
collinearity and suppressor-variable relations.


>      They would correctly argue that the same information contained in the interaction term is also contained in the two IV's themselves, and as such are correlated (first explicated in publication by Cohen as I understand it). This correlation will cause a multi-colinearity problem, and the model coefficients would be inaccurate. They would go on to say that by mean centering the IV's the correlation is reduced to the product term of the IV's (the interaction term) and as such you have reduced multicolinearity.

 - The mean-centering is relevant only to computing the interaction.
 - If your computer regression procedure is computing the interaction-term,
then you might need to center the original variables -- *if*  you are wanting
to say anything about the main effects.
 - The situation gets more complicated for 3-way interactions.

>      More recent research has shown this not to be true,

I don't know what the "this" is referring to.  But "more recent
research" is not going to refute the simple arithmetic that shows
the reduction of multicollinearity from centering.
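
For what it's worth, that arithmetic is easy to demonstrate with a couple of lines of simulation (made-up 7-point-scale values, not anyone's real data):

import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(1, 8, size=500).astype(float)  # values 1..7
z = rng.integers(1, 8, size=500).astype(float)

print(np.corrcoef(x, x * z)[0, 1])      # raw scores: x and x*z are strongly correlated
xc, zc = x - x.mean(), z - z.mean()
print(np.corrcoef(xc, xc * zc)[0, 1])   # centered: the correlation is close to zero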


>       and so with normal OLS regression, your unfortunately stuck in a situation where you can't do what your trying to do and get valid results (which isn't to say that 100's if not 1000's of people don't still do this).

Hmm... It seems to me that you are casting aspersions on the usual
procedures, after a sloppy and misleading review.  There are potentially
deep issues in looking at non-orthogonal designs, both for main effects
and for interactions.  ... That is the context where I would agree that
there is some discussion about what sort are most valid, and when. 

But most applications, and most results, do not require such deep thought.

[snip, original]

About the original -- centering the variables as multiplied for
the interaction will give the *usual*, most desired values
for coefficients.  Does the Original Poster find these to be
good enough to answer his question?

Plotting the predictions will answer the more concrete question
of what the interaction "means".

--
Rich Ulrich

Re: Multicolinearity.

Garry Gelade

Gentlemen,

 

I’m with Rich on this one.  Centering the IVs DOES reduce the multicollinearity between the IVs and their interaction. 

 

It was this multicollinearity which produced a VIF of 5 when an interaction term was included in AlmostDone's model, but a VIF around 1 (indicating no collinearity) when the interaction was excluded.  He could make the high VIF go away by centering both IVs before calculating the interaction term.

 

This type of collinearity is what Dalal and Zickar call “non-essential” multi-collinearity, and they clearly say that centering does reduce it, which is what Rich says and what tradition tells us.  It does not of course reduce any collinearity between the first order IVs themselves (“essential” collinearity).

 

Centering is generally a good idea because it removes non-essential multicollinearity, and then if you get a high VIF or other indications of high multicollinearity when you run your regression, you will know it is due to essential multicollinearity, and not non-essential multicollinearity.
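
A sketch of that check (Python/statsmodels here; the file and variable names are placeholders):

import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vifs(frame):
    # VIF for each column of a predictor set (an intercept column is added first).
    X = np.column_stack([np.ones(len(frame)), frame.to_numpy()])
    return {col: variance_inflation_factor(X, i + 1) for i, col in enumerate(frame.columns)}

df = pd.read_csv("survey.csv")  # hypothetical data file

raw = pd.DataFrame({"div": df["divergence"],
                    "rel": df["relevance"],
                    "div_x_rel": df["divergence"] * df["relevance"]})

div_c = df["divergence"] - df["divergence"].mean()
rel_c = df["relevance"] - df["relevance"].mean()
cen = pd.DataFrame({"div": div_c, "rel": rel_c, "div_x_rel": div_c * rel_c})

print(vifs(raw))   # raw product term: inflated VIFs for the IVs
print(vifs(cen))   # centered: non-essential collinearity removed, VIFs near 1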

 

Garry Gelade

Business Analytic Ltd

 

Dalal & Zickar (2011).  Some Common Myths About Centering Predictor Variables in Moderated Multiple Regression and Polynomial Regression. Organizational Research Methods, 15(3), 339-362

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Rich Ulrich
Sent: 08 August 2012 07:10
To: [hidden email]
Subject: Re: Multicolinearity.

 

[snip, previous]


Re: Multicolinearity.

Almost Done
Hey, all! So I centered the IVs and ran another regression with the centered IVs and their interaction. The multicollinearity went away (the VIF is only slightly higher than one: 1.084). But the beta for the interaction term is still negative, which seems very weird. The relevance IV is not significant, though (which, in theory, it shouldn't be).

Re: Multicolinearity.

Poes, Matthew Joseph
In reply to this post by Rich Ulrich

Or maybe it's you who has been hasty here.  The point I made was that mean centering does not change the effect of the multicollinearity on the model in any interpretable way; in fact, interpreted correctly, the coefficients reflect the exact same thing, with the exact same problems as before.  Here is a small sampling of articles I either have or found.  This concept is well established, and I know of no legitimate researchers in the social sciences publishing articles showing anything different from what I and the authors below have stated.  You can't argue with the facts, and if you run a model both ways (centered vs. raw) you will see there is no meaningful difference in the coefficients, with the exception of the obvious change due to the centering making the coefficients relative to the mean (for b1 and b2).  The titles of these articles alone should tell you something.  I'm sure you will continue to disagree and try either to ignore the vast published literature to the contrary of your point and show me how the correlation is reduced (my first stats professor did the same thing), or to take apart their arguments.  All I can say is that I trust the views of multiple authors who came to the same conclusions, in many cases independently of one another, and who have all recommended against mean centering as a fix for multicollinearity.  On top of that, I work with people at the Institute of Education Sciences, Westat, Abt Associates, my own university's stats department, and numerous researchers who act as journal and grant reviewers, and when this comes up their view has consistently been: don't bother mean centering.

 

To quote Kromrey and Foster-Johnson: “…The equations obtained with centered and raw data are equivalent, that the results of the hypothesis testing with either type of data are exactly the same, and that neither approach provides a viable vehicle for the interpretation of main effects in MMR.”

Kromrey, J.D., Foster-Johnson, L. (1998).  Mean Centering in Moderated Multiple Regression: Much Ado About Nothing. Educational and Psychological Measurement, 58(1), 42-67.

 

Cohen states in his article (1978) “One might just as well not bother” when referring to centering.

Cohen, J. (1978). Partialed products are interactions; partialed powers are curve components. Psychological Bulletin, 85, 858-866.

 

Dunlap and Kemery have articles together in 1987 and 1988 addressing the issue of multicollinearity; they too find that centering is not appropriate (as a fix for this problem).

 

Echambadi, R. (2007).  Mean centering does not alleviate collinearity problems in moderated multiple regression models.  Marketing Science.

 

Edwards, J.R. (2009). Seven deadly myths of testing moderation in organizational research.  In Statistical and Methodological Myths and Urban Legends: Doctrine, Verity and Fable in the Organizational and Social Sciences.

 

Shieh, G. (2011).  Clarifying the role of mean centering in multicollinearity of interaction effects.  British Journal of Mathematical and Statistical Psychology, 64(3), 462-477.

 

 

 

Matthew J Poes

Research Data Specialist

Center for Prevention Research and Development

University of Illinois

510 Devonshire Dr.

Champaign, IL 61820

Phone: 217-265-4576

email: [hidden email]

 

 

From: Rich Ulrich [mailto:[hidden email]]
Sent: Wednesday, August 08, 2012 1:10 AM
To: Poes, Matthew Joseph; SPSS list
Subject: RE: Multicolinearity.

 

[snip, previous]


Re: Multicolinearity.

Poes, Matthew Joseph
In reply to this post by statisticsdoc
Thanks Stephen,
I also cited a few articles in a response to Rich's email, as he apparently doesn't buy the evidence.

I should note that I made a comment that you can't get "valid" results using OLS regression because the multicollinearity exists and can't be eliminated.  This is overstating things; the real argument is that the coefficients themselves should be unbiased even in light of the multicollinearity and give you valid coefficients, but the model may have some stability issues.  I've heard a lot of arguments on this, obviously read a lot of papers on the topic, and over the last 5 years I have been involved in numerous roundtable discussions trying to develop a best-practice recommendation for the testing of interactions (moderation).  These best practices are intended to go into white papers that would accompany certain federal grants, basically stating that results will only be considered valid if they follow certain procedures or meet certain criteria.  Moderation is one that has simply not been easy to nail down.  There are some very knowledgeable folks who argue very adamantly that OLS is completely inappropriate, and who have suggested models utilizing SEM, varying parameter models, and even some suggestions of developing models using instrumental variables in an SEM structure (this was an off-the-cuff comment; I believe it to actually be incorrect, FYI).

So while I'm sure all this is confusing to the young stats students out there, let me just say: moderation effects (interactions) are a commonly tested phenomenon in the social sciences.  It's very likely that you will be taught two things that are incorrect in your stats courses, and this may even happen in graduate-level courses (it did for me).  The first is that mean centering solves the problem created by multicollinearity; it does not.  The second is that it makes for easy interpretation of main effects, and while centering does make the coefficients easier to interpret, the model that contains the interaction coefficient does not have main effects; it has conditional effects.  You will likely be taught to use a 2-step regression: the first step contains the main effects, and the second step adds the interaction term, and in this step the two "main effects" turn into conditional effects, as they are now conditional on the value of the interaction effect.  The next thing I think you need to know is that most everyone has probably interpreted an interaction model incorrectly at least once in their life (myself included).  Try to remember what I've said here, review the literature (plenty of recent literature exists on interpreting interactions correctly), and carefully consider the interaction you are trying to interpret and what it likely means.  The final thing I'll say is that plotting is your friend.  It takes a while to wrap your head around the interpretation of conditional effects like interactions, and as such, I find that plots make it much easier.  In fact, I consider any paper that contains an interaction and hasn't plotted it to be unacceptable, unless it's in a statistics journal, as I don't consider the average audience of other journals capable of easily interpreting the effects.

One last story, in case it helps.  In my last PhD stats course I had a professor who lightly covered this topic.  His knowledge of the core topic at hand, MLM/HLM, was great, but his knowledge of interactions was dated.  While I had accepted what I was taught in earlier years, by this point I had read enough articles and run enough models to know it wasn't true.  We were learning about lower-level moderation in multilevel models.  I argued with the professor that while the lesson made sense as a whole, his insistence that he would mark our homework and test questions wrong if we didn't mean center for the interaction was in error.  In fact, I said that mean centering for MLM was, by his own admission, solely for interpretation purposes, and that if someone didn't do it but correctly interpreted the effects, they should not be marked down, as this would simply reflect a deeper understanding than most.  I presented him with all the evidence, and his final statement to me was that his course was approved by the department, this was the department's accepted view, and while he agreed the literature appears to suggest their view is wrong, he would still hold us all to it.  Obviously I went along and centered as he requested in both the single-level and multilevel models.  However, the following year, when the course was offered again, the department had met and decided that recent research (a relative term, since some of this dates back to the 80s) had shown the hard stance they took was wrong.  The course was changed, our department's graduate stats courses were changed, and four separate dissertations were given a revise-and-resubmit due to explicit claims of fixing multicollinearity.  I think this story is important because it's telling as to how ingrained these ideas are.

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]


-----Original Message-----
From: StatisticsDoc [mailto:[hidden email]]
Sent: Tuesday, August 07, 2012 7:39 PM
To: Poes, Matthew Joseph; [hidden email]
Subject: RE: Multicolinearity.

[snip, previous]

Re: Multicolinearity.

Swank, Paul R
In reply to this post by Almost Done
It's not weird, as I suggested yesterday. It merely means that the effect of each variable on the outcome is diminished in the presence of high values on the other variable. It's why we study interactions in the first place, to see if the relation of one variable to the outcome varies as a function of another predictor. The answer simply is yes in this case.
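
For example, using the uncentered coefficients in the original post, the implied slope of Divergence on attention is 0.666 - 0.091*Relevance: roughly 0.58 when Relevance is 1, but only about 0.03 when Relevance is 7. The negative product term simply says the divergence effect shrinks as relevance increases.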

Paul R. Swank, Ph.D.
Professor, Department of Pediatrics
Medical School
Adjunct Professor, Health Promotions and Behavioral Sciences
School of Public Health
University of Texas Health Science Center at Houston


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Almost Done
Sent: Wednesday, August 08, 2012 7:57 AM
To: [hidden email]
Subject: Re: Multicolinearity.

[snip, previous]

Re: Multicolinearity.

Poes, Matthew Joseph
In reply to this post by Almost Done
You can read my earlier point about the actual effect of centering on the model: it lowers the VIF, yes, but it doesn't actually solve the problems of the multicollinearity.  The reason is unimportant for you at this stage.  Centering is certainly still useful for interpretation.

At this point your coefficients reflect distance from the average.  You should run a 2-step model, and when you do, the first step will give you the coefficients for the main effects.  These are your main effects to report.  They now reflect the impact on the outcome of a one-point change from the average level of x and z: B1 is the amount of change in the outcome for a one-point increase over the average of x, and likewise for B2.  As for the second step, when you introduce the interaction term, it reflects the difference in the slope of the focal variable for different values of the moderator.  In addition to this new piece of information, you really need to test where that difference holds, known as the region of significance.  For two continuous variables, a continuous focal variable and a continuous moderator, the simple slopes test is commonly used.  This gives you regions of significance and will make the interpretation much clearer.

As for what the negative coefficient means, without testing the regions of significance it's difficult to know; it may be clearer than you realize.  Since everything is now based on average levels, the negative coefficient suggests that as you increase z, the effect of x on y is reduced by the value of the xz coefficient.  Some would argue that, in the 2-step regression, if the coefficient for z becomes non-significant, that may reflect that its impact on y operates only through its interactive effect with x.  While possibly true, it could also be an artifact of the problems we have already been discussing, which unfortunately you can't fix.

I strongly suspect that once you plot this out (at the mean and +/- 1 SD of the moderator) you will have an answer to your question that makes more sense.
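
A rough sketch of that plot (Python/matplotlib; the centered variables and column names are the same assumptions as in the earlier sketches) could look like:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # hypothetical data file
df["div_c"] = df["divergence"] - df["divergence"].mean()
df["rel_c"] = df["relevance"] - df["relevance"].mean()
m = smf.ols("attention ~ div_c * rel_c", data=df).fit()

xs = np.linspace(df["div_c"].min(), df["div_c"].max(), 50)
sd = df["rel_c"].std()
for level, label in [(-sd, "relevance at mean - 1 SD"),
                     (0.0, "relevance at mean"),
                     (sd, "relevance at mean + 1 SD")]:
    pred = m.predict(pd.DataFrame({"div_c": xs, "rel_c": level}))
    plt.plot(xs, pred, label=label)

plt.xlabel("divergence (centered)")
plt.ylabel("predicted attention")
plt.legend()
plt.show()

If the three lines fan out (steep at low relevance, flat at high relevance, or the reverse), the negative product term becomes much easier to describe in words.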

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Almost Done
Sent: Wednesday, August 08, 2012 7:57 AM
To: [hidden email]
Subject: Re: Multicolinearity.

[snip, previous]

Re: Multicolinearity.

Almost Done
In reply to this post by Swank, Paul R
No, it's just that based on the research I have done and other similar surveys, there should be a positive interaction effect between divergence and relevance. The theory says that those two effects, when put together, should have a more than additive effect on other variables like attention.

Re: Multicolinearity.

Swank, Paul R
Now you have some evidence that refutes that theory.

Paul R. Swank, Ph.D.
Professor, Department of Pediatrics
Medical School
Adjunct Professor, Health Promotions and Behavioral Sciences
School of Public Health
University of Texas Health Science Center at Houston

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Almost Done
Sent: Wednesday, August 08, 2012 10:33 AM
To: [hidden email]
Subject: Re: Multicolinearity.

[snip, previous]

Re: Multicolinearity.

Rich Ulrich
In reply to this post by Poes, Matthew Joseph
I think that there are two issues here.  One is the abstract, statistical purism,
and the other is communication among people who are not mathematical
statisticians.  But in any case, I think you are mis-reading or mis-using the
statistical advice that you cite.  As you say,
     To quote Kromrey and Foster-Johnson: “…The equations obtained with centered and raw data
    are equivalent, that the results of the hypothesis testing with either type of data are exactly
    the same, and that *neither* [emphasis added] approach provides a viable vehicle for the
    interpretation of main effects in MMR.”


They are "equivalent" when you don't look at separate coefficients
and separate tests.  However, the equation without "non-essential
multi-collinearity" (thanks, Garry) is both "more interpretable"  and more
robust on replication.  The OP sees some advantage in his own data.

For purist interpretation of variance components, every non-experimental
design is problematic.  I like Searle's approach of hierarchical reduction
of SS, but not everyone does.  Even Searle's leaves questions if there
is no obvious ordering for terms.  K&F-J apparently don't like centering,
but they don't like non-centering, either.  But "non-essential multi-collinearity"
is patently a "problem" that centering does fix.

I suspect, further, that we may be focusing on slightly different paradigms.

If you are considering the interaction as the important test, then I
possibly don't like your whole model.  If you are considering the
coefficients for the main effects to be largely irrelevant -- which is what
I conclude from your comments on "interpretation" -- you are not
dealing with the investigators or publications that I have dealt with.

I see the main-effects model as primary and the interaction terms as
secondary.  I look at the tests on main effects *before* entering the
interactions, so I, personally, don't ever try to interpret the coefficients
that are confounded by the induced, non-essential collinearity.  I do the
subtraction-of-means by firmly-established habit, so that I will not be
distracted by irrelevant diagnostics.  If and when the interaction is
significant, *that* is not an encouraging result; it warns me that my
scaling may be wrong, or my construction of the model may be wrong
(or I am testing it in an inappropriate sample).  So I look at the plots
to discover what is going on, especially with an eye on fixing the model.
Psychologists used to say, "Don't interpret the main effects if there is
an interaction."  That is useful advice when the interaction is big, but
it overstates the problem.  Look at the plots, and describe what is there.
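
A sketch of that order of operations (Python/statsmodels; the names are placeholders, not the OP's actual variables):

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("survey.csv")  # hypothetical data file
df["div_c"] = df["divergence"] - df["divergence"].mean()
df["rel_c"] = df["relevance"] - df["relevance"].mean()

m_main = smf.ols("attention ~ div_c + rel_c", data=df).fit()  # main effects first
m_int = smf.ols("attention ~ div_c * rel_c", data=df).fit()   # then add the interaction
print(anova_lm(m_main, m_int))                                # F test for the added product term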

I think you are misusing your experts and authorities.  I have answered
thousands of questions on the Usenet groups sci.stat.* (plus, the SPSS
group) since 1995.  I have argued aspects of regression a number of
times -- Early on, I chased down and rebutted mis-used citations that
were intended to support "stepwise regression"; the textbooks by real
statisticians always were fine on that issue (I was happy to see), and the
citations offered had been partial or ripped out of context.  There were
many other Posters who confirmed my own point of view.  I do not
remember anyone, in all that time, who pushed an argument for never
bothering to center when computing interactions.




Date: Wed, 8 Aug 2012 13:45:42 +0000
From: [hidden email]
Subject: Re: Multicolinearity.
To: [hidden email]

[snip, previous]


Intrarater reliability with >/= 2 raters and binary outcomes

Jill Stoltzfus
In reply to this post by Garry Gelade
Hello everyone. Can anyone advise me on whether SPSS allows you to calculate either ICCs or concordance correlation coefficients for multiple raters with binary data (y/n), analogous to using Fleiss' kappa for interrater reliability? Specifically, I want to assess how well 12 different orthopedic surgeons classify 5 different traction views of the hand for each patient (total number of patients = 17).

Thanks very much for your advice.

Jill

Re: Intrarater reliability with >/= 2 raters and binary outcomes

David Marso
Administrator
Jill,
  Please start a NEW topic rather than replying to an old topic and changing the subject heading.
When you do the latter, the threading gets FUBAR in Nabble (the best way, IMNSHO, to follow this list) and people who couldn't care less about your new, completely unrelated question receive unnecessary email.  Furthermore, your new query becomes buried in the bowels of an old, closed issue, and many people who might otherwise respond to your question will never see it!
--
---
Jill Stoltzfus wrote
[snip, original]
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Intrarater reliability with >/= 2 raters and binary outcomes

bdates
In reply to this post by Jill Stoltzfus

Jill,

 

I think you probably will need to use Fleiss.  Your data are nominal, and the ICC has been extended to ordinal data but not to nominal.  Concordance stats are typically for rank or interval data.  Fleiss’ kappa will provide you with what you need for your nominal data.  You should probably assess each view separately (n=17), and then the full 85 views.  That way you can identify any views that may be posing difficulty in agreement.
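
If it turns out you need to compute it outside SPSS, here is a rough sketch with Python/statsmodels (the file name, 0/1 coding, and row ordering below are assumptions about your layout, not facts about your data):

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# One row per rated view (17 patients x 5 views = 85 rows), one column per
# surgeon (12 columns), coded 0/1 for the binary classification.
ratings = np.loadtxt("ratings.csv", delimiter=",", dtype=int)  # hypothetical file

counts, _ = aggregate_raters(ratings)   # per-row counts for each rating category
print(fleiss_kappa(counts))             # overall kappa across all 85 views

# Kappa for each of the 5 views separately (17 rows each), assuming rows are
# ordered view 1..5 within each patient.
for v in range(5):
    c, _ = aggregate_raters(ratings[v::5])
    print("view", v + 1, fleiss_kappa(c))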

 

Brian

 


From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jill Stoltzfus
Sent: Monday, December 10, 2012 9:06 AM
To: [hidden email]
Subject: Intrarater reliability with >/= 2 raters and binary outcomes

 

[snip, original]