Hey all,
a colleague of mine just asked me a question about regression analysis. He's reviewing a paper in which two models are tested. The first model contains only one predictor; in the second model, a second predictor is added. The beta of the first predictor is higher in the second model. He was wondering whether this is possible. I don't have sufficient conceptual understanding of this to figure it out. My intuition tells me that it isn't impossible, but I can't think of any way in which it would make sense. Sadly, we don't have the dataset, just the weights. Has anybody here ever encountered something like this before?

This is no life-threatening issue, but interesting nonetheless :-)

Thanks in advance and kind regards,

Gjalt-Jorn

________________________________________
Gjalt-Jorn Ygram Peters, PhD Student
Department of Experimental Psychology, Faculty of Psychology
University of Maastricht
Hi Gjalt-Jorn,
My first suspicion is that there are some missing values involved. Look at this example:

data list list / x1 x2 y.
begin data.
1 1 1
2 2 2
4 1 3
3 . 1
2 . 2
1 . 3
end data.

REGRESSION /DEPENDENT y /METHOD=ENTER x1 .
REGRESSION /DEPENDENT y /METHOD=ENTER x1 x2 .

Missing values in x2 "shadow" the cases where x1 is not well correlated with y. So Beta1 is much higher in the second regression, but there are fewer cases there.

Greetings,
Jan
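A quick way to check whether the two regressions were run on different numbers of cases is to look at the valid N per variable. This is a minimal sketch, assuming the variables are named x1, x2 and y as in the example above; REGRESSION deletes cases listwise by default, so a lower valid N for x2 means the two-predictor model is fitted on fewer cases:

* Valid N per variable; a lower N for x2 signals listwise deletion in the two-predictor model.
DESCRIPTIVES VARIABLES=x1 x2 y .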
Dear Jan (& list),
>> The first model contains only one predictor. In the second model, a
>> second predictor is added. The beta of the first predictor is higher
>> in the second model.

> My first suspicion is that there are some missing values involved. Look
> at this example:
> [..]
> Missing values in x2 "shadow" the cases where x1 is not well correlated with y.
> So Beta1 is much higher in the second regression, but there are fewer cases there.

This sounds very plausible. Are you implying, by offering this solution, that it is not possible for there to be a dataset in which the association between X1 and Y is stronger when you control for X2 than when you do not control for X2?

In any case, thank you very much for your answer! My colleague was quite relieved :-) (Still, I would find it interesting to know whether this phenomenon is possible without missing cases :-))

Kind regards,

GjY

________________________________________________________________________
Gjalt-Jorn Ygram Peters, PhD Student
Department of Experimental Psychology, Faculty of Psychology
University of Maastricht, The Netherlands
Web: http://interventionmapping.nl | http://www.gjyp.nl
In reply to this post by Peters Gj (PSYCHOLOGY)
Hi again, Gjalt-Jorn,
The phenomenon is possible even without missing values, but IMHO not very common in practice. Look at this example:

data list list / x1 x2 y.
begin data.
1 0 1
2 0 2
4 0 3
3 2 1
2 0 2
1 -2 3
end data.

REGRESSION /DEPENDENT y /METHOD=ENTER x1 .
REGRESSION /DEPENDENT y /METHOD=ENTER x1 x2 .

As you see, there are some non-linearities involved...

Greetings,
Jan
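What drives the effect in data like these can be seen from the zero-order correlations: x2 is positively related to x1 but negatively related to y, so removing the x2-related variance from x1 leaves a "cleaner" predictor of y (the classic suppressor pattern). A minimal check, assuming the variable names from the example above:

* Zero-order correlations among the predictors and the criterion.
CORRELATIONS /VARIABLES=x1 x2 y .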
In reply to this post by Peters Gj (PSYCHOLOGY)
I think that the following data set shows that this can happen even
without any missing data.

data list list / x1 x2 y.
begin data.
3,7,10
7,4,11
4,8,12
8,5,13
5,9,14
9,6,15
6,10,16
10,7,17
7,11,18
11,8,19
8,12,20
end data.

REGRESSION /DEPENDENT y /METHOD=ENTER x1 .
REGRESSION /DEPENDENT y /METHOD=ENTER x1 x2 .

David Hitchin
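In this dataset y is in fact exactly x1 + x2, while x1 and x2 are slightly negatively correlated, so once x2 is held constant x1 predicts y perfectly; that is why the x1 coefficient jumps when x2 enters the model. The same thing can be seen without running the regressions by comparing the zero-order correlation with the partial correlation. A small sketch, using the variable names from David's example:

* Zero-order correlation of x1 with y.
CORRELATIONS /VARIABLES=x1 y .

* Partial correlation of x1 with y, controlling for x2 (equals 1.0 in these data).
PARTIAL CORR /VARIABLES=y x1 BY x2 .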
Hey Jan, David & list,
[summary]
>GjY>> Are you implying by offering this solution that it is not possible
>GjY>> that there is a dataset in which the association between X1 and Y
>GjY>> is stronger when you control for X2 than when you do not control
>GjY>> for X2?

>David>> I think that the following data set shows that this can happen
>David>> even without any missing data.
>David>> [...]

>Jan>> The phenomenon is possible even without missing values, but IMHO
>Jan>> not very common in practice. Look at this example:
>Jan>> [...]
>Jan>> As you see, there are some non-linearities involved...
[/summary]

Thank you both, this is very clarifying! And somewhat disturbing. It does mean that my colleague should ask the authors either whether the sample size differs between the two analyses, or whether it is one of the rare situations like those you showed . . .

Kind regards,

GjY
In reply to this post by Peters Gj (PSYCHOLOGY)
If the difference between the betas is really huge, then it is worth investigating, I think. Otherwise, it may just be chance.

Jan
In reply to this post by Peters Gj (PSYCHOLOGY)
Hello,
It is possible to have a greater raw (or standardized) regression coefficient in a three-parameter model (i.e., Y = b0 + b1X1 + b2X2) than in a two-parameter model (i.e., Y = b0 + b1X1). The reason is that the proportion of variance accounted for by X2 may hide the smaller one accounted for by X1. If this is the case, X2 is called a suppressor variable. In a multiple regression analysis, a coefficient represents the relation between the dependent variable and a predictor HOLDING ALL OTHER PREDICTORS CONSTANT. That is, when you regress Y on X1 and X2, the X1 coefficient stands for the relationship between Y (with X2 partialled out) and X1 (with X2 partialled out).

Let's take a real example. The following dataset represents unemployment in the United States (UN) between 1950 and 1959 (YEAR), the Federal Reserve Board index of industrial production (IP), and a code representing the different years (YR). These data were presented by Velleman and Welsch (1981, p. 235) and used as an example in Judd and McClelland (1989, p. 189).

One could hypothesize a negative relationship between unemployment and industrial productivity: when industrial productivity increases, unemployment may logically decrease. When regressing UN on IP, however, there is no relationship between these two variables (not significant) and, surprisingly, the raw (or standardized) coefficient is positive (B = .021, p = .379) (STEP 1).

When regressing UN on YR, we observe a positive relationship between unemployment and YR (B = .208, p = .04). That is, as YR increases, UN increases. Furthermore, YR accounts for a substantial amount of variance in UN (R² = .428) (STEP 2). As there is a positive relationship between UN and YR, there may also be a positive relationship between YR and IP (i.e., industrial production increased linearly between 1950 and 1959) (STEP 2'). As we can see, there is indeed a positive relationship between YR and IP (B = 4.364, p < .001). (This may reveal a problem of collinearity, which increases the difficulty of interpreting the coefficients.)

However, taking both YR and IP into account in the STEP 3 model leads to a totally different conclusion than the STEP 1 model: there is a strong negative relationship between UN and IP that was masked by the yearly changes in UN. Take a look at the raw regression coefficients in STEP 3. Not only did the relation between UN and IP become significant, the (absolute) value of the coefficient also increased.

To demonstrate that a coefficient in multiple regression stands for the relationship between the dependent variable and one independent variable, partialling out all other independent variables, I added a step to my demonstration (STEP 4). In this step, I used regression analyses to control for the relationships between UN and YR, IP and YR, and UN and IP, and saved the residuals (i.e., the part of the dependent variable that is not accounted for by the independent variable). UN_YR represents UN with YR partialled out, IP_YR represents IP with YR partialled out, UN_IP represents UN with IP partialled out, and YR_IP represents YR with IP partialled out. I then performed two regression analyses: I first regressed UN_YR on IP_YR, and then regressed UN_IP on YR_IP. Take a look at the coefficients. When UN_YR is regressed on IP_YR, the raw coefficient for IP_YR is exactly the coefficient for IP in the STEP 3 analysis. Equivalently, when UN_IP is regressed on YR_IP, the raw coefficient is exactly the coefficient for YR in the STEP 3 analysis.
data list list / year un ip yr.
begin data.
1950 3.1 113 1
1951 1.9 123 2
1952 1.7 127 3
1953 1.6 138 4
1954 3.2 130 5
1955 2.7 146 6
1956 2.6 151 7
1957 2.9 152 8
1958 4.7 141 9
1959 3.8 159 10
end data.

*STEP 1*.
REGRESSION /DEPENDENT un /METHOD=ENTER ip .

*STEP 2*.
REGRESSION /DEPENDENT un /METHOD=ENTER yr .

*STEP 2'*.
REGRESSION /DEPENDENT ip /METHOD=ENTER yr .

*STEP 3*.
REGRESSION /DEPENDENT un /METHOD=ENTER ip yr .

*STEP 4*.
REGRESSION /DEPENDENT un /METHOD=ENTER yr /SAVE RESID .
REGRESSION /DEPENDENT ip /METHOD=ENTER yr /SAVE RESID .
REGRESSION /DEPENDENT yr /METHOD=ENTER ip /SAVE RESID .
REGRESSION /DEPENDENT un /METHOD=ENTER ip /SAVE RESID .
COMPUTE un_yr = RES_1 .
COMPUTE ip_yr = RES_2 .
COMPUTE yr_ip = RES_3 .
COMPUTE un_ip = RES_4 .
EXE.
REGRESSION /DEPENDENT un_yr /METHOD=ENTER ip_yr .
REGRESSION /DEPENDENT un_ip /METHOD=ENTER yr_ip .
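A purely arithmetic way to see the same point (a sketch of my own, not part of Fabrice's syntax above): with standardized variables, the coefficient of X1 in the two-predictor model is beta1 = (r_y1 - r_y2*r_12) / (1 - r_12^2), where r_y1, r_y2 and r_12 are the zero-order correlations; in the simple regression the standardized coefficient is just r_y1. Whenever the term r_y2*r_12 works against r_y1 (for instance when r_y2 and r_12 have opposite signs), the numerator exceeds r_y1 and the denominator is below 1, so the coefficient grows once the second predictor is entered. A minimal illustration; the correlation values .50, .10 and -.40 are made up for the sketch:

* Hypothetical zero-order correlations: r(Y,X1)=.50, r(Y,X2)=.10, r(X1,X2)=-.40 .
data list list / ry1 ry2 r12.
begin data.
0.50 0.10 -0.40
end data.

* Standardized coefficient of X1 when X2 is also in the model.
COMPUTE beta1 = (ry1 - ry2*r12) / (1 - r12**2).
EXECUTE.
LIST.

Here beta1 works out to about .64, larger than the zero-order correlation of .50 that the one-predictor model would report, which is the same pattern as in the examples earlier in the thread.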
Dear Fabrice,
>>Fabrice> It is possible to have a greater raw (or standardized)
>>Fabrice> regression coefficient in a three-parameter model (i.e.,
>>Fabrice> Y = b0 + b1X1 + b2X2) than in a two-parameter model
>>Fabrice> (i.e., Y = b0 + b1X1). The reason is that the proportion
>>Fabrice> of variance accounted for by X2 may hide the smaller one
>>Fabrice> accounted for by X1. If this is the case, X2 is called a
>>Fabrice> suppressor variable.
>>Fabrice> [..]

Thank you for the explanation, again very clear! The colleague who initially asked about this topic and I are both very grateful to you and to everyone who helped.

Have a good weekend, kind regards,

GjY

___________________________________________________________________
Gjalt-Jorn Ygram Peters, PhD Student
Department of Experimental Psychology
Faculty of Psychology, University of Maastricht, The Netherlands
___________________________________________________________________