I wanted to probe a significant moderation effect and I hit a wall. I’ve never worked with this type of problem before: I have 4 dummy variables and three measures of a construct. I centered my measures (not the Y). I looked at my results, I looked at SPSS and my brain froze.
I’ve laid out my notes below for the Ethnicity X Scale (X3) effect. Any help or suggestions will be greatly appreciated.

Stephen Salbod, Pace University, NYC

MODEL: Microaggression (Y) = Ethnicity + Ethnic Identity + Ethnicity*Ethnic Identity

Ethnicity is indicated by membership in one of five ethnic groups. Four dummy variables (D1-D4) were used to represent it in the analysis. Ethnic Identity is measured on three subscales: explo (X1), resol (X2), and affir (X3). These measures were mean centered.

MODEL:
Y’ = b0 + b1(D1) + b2(D2) + b3(D3) + b4(D4) + b5(X1) + b6(X2) + b7(X3)
   + b8(D1X1) + b9(D2X1) + b10(D3X1) + b11(D4X1)
   + b12(D1X2) + b13(D2X2) + b14(D3X2) + b15(D4X2)
   + b16(D1X3) + b17(D2X3) + b18(D3X3) + b19(D4X3)

The analysis revealed the interactions b10(D3X1), b18(D3X3), and b19(D4X3) to be statistically significant.

FINAL EQUATION:
Y’ = .97 - .13*(D1) + .02(D2) - .15(D3) - .31**(D4) + .00(X1) + .00(X2) + .03(X3)
   - .02(D1X1) + .00(D2X1) - .05*(D3X1) - .01(D4X1)
   + .01(D1X2) + .01(D2X2) + .01(D3X2) + .01(D4X2)
   - .04(D1X3) - .03(D2X3) + .22**(D3X3) + .21**(D4X3)

* p < .05. ** p < .01.

SIMPLE SLOPES for Ethnicity X Scale (X3):

Dummy codes (D1 D2 D3 D4)   Simple intercept + slope*(X3)
0 0 0 0                     b0 + b7(X3)
0 0 0 1                     (b0 + b4) + (b7 + b19)(X3)
0 0 1 0                     (b0 + b3) + (b7 + b18)(X3)
0 1 0 0                     (b0 + b2) + (b7 + b17)(X3)
1 0 0 0                     (b0 + b1) + (b7 + b16)(X3)

To test the simple slopes (simple b != 0) I plan to use this code:

GLM rems BY ethnic WITH affir explor resol
  /DESIGN=ethnic affir*ethnic explor*ethnic resol*ethnic
  /PRINT=PARAMETER.

This is just an extension of a test for interactions in ANCOVA. Am I missing something? This solution looks too easy.
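As a quick numeric check on the table, here is a Python sketch that plugs the FINAL EQUATION coefficients into the simple-slope expressions. The dummy-to-group mapping is not shown above, so rows are labeled by dummy pattern only:

```python
# Sketch: compute each group's simple intercept and simple slope for X3
# from the reported coefficients. Group labels are dummy patterns only;
# the dummy-to-ethnicity mapping is not given in the post.
b0, b1, b2, b3, b4 = 0.97, -0.13, 0.02, -0.15, -0.31   # intercept + dummies
b7 = 0.03                                              # X3 slope, reference group
b16, b17, b18, b19 = -0.04, -0.03, 0.22, 0.21          # D1X3..D4X3 interactions

rows = [
    ("0 0 0 0 (reference)", 0.0, 0.0),
    ("1 0 0 0", b1, b16),
    ("0 1 0 0", b2, b17),
    ("0 0 1 0", b3, b18),
    ("0 0 0 1", b4, b19),
]
for dummies, d_int, d_slope in rows:
    print(f"{dummies}: Y' = {b0 + d_int:.2f} + {b7 + d_slope:.2f} * X3")
```

The D3 and D4 groups come out with simple slopes of about .25 and .24, which is why those two interactions are the ones worth probing.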
I think I am understanding your question, and I think the following references (posted by Cam McIntosh on SEMNET) will be helpful. To begin, it might be helpful to plot predicted values by group for specific interaction terms. The result is two lines that may or may not intersect within the possible range of the EI variable. The question is whether the predicted values differ for a given value of the EI variable. I haven't needed to do this for a while, but I think the term used is 'regions of significance'. (There will be two boundaries, because there are two EI values beyond which, in absolute terms, all greater values differ significantly.) Once you have visualized the relationships, you should be able to use the EMMEANS subcommand to search for the boundary points of the significance regions.
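The boundary search can also be done analytically; this is the Johnson-Neyman technique covered in the Hayes & Matthes and Bauer & Curran references below. A hedged sketch with invented numbers (in practice the coefficients, variances, and covariance come from the parameter estimates and coefficient covariance matrix of the actual fit):

```python
import math

# Johnson-Neyman sketch: the group difference at moderator value x is
# d(x) = d0 + d1*x; it is significant wherever |d(x)| / SE(d(x)) >= t_crit.
# Setting the ratio equal to t_crit and squaring gives a quadratic in x.
# ALL numbers below are invented for illustration.
d0, d1 = -0.31, 0.21            # dummy coefficient and interaction coefficient
var_d0, var_d1 = 0.010, 0.004   # sampling variances (hypothetical)
cov_01 = -0.001                 # their covariance (hypothetical)
t_crit = 1.96                   # large-sample critical value, alpha = .05

t2 = t_crit ** 2
a = d1 ** 2 - t2 * var_d1
b = 2 * (d0 * d1 - t2 * cov_01)
c = d0 ** 2 - t2 * var_d0

disc = b ** 2 - 4 * a * c
if a > 0 and disc >= 0:
    lo = (-b - math.sqrt(disc)) / (2 * a)
    hi = (-b + math.sqrt(disc)) / (2 * a)
    print(f"groups differ significantly for x < {min(lo, hi):.3f} "
          f"or x > {max(lo, hi):.3f}")
else:
    print("significance status does not change over the range of x")
```

With these invented numbers the boundary points come out near x = 0.54 and x = 3.72; between them the group difference is not distinguishable from zero.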
However, your actual situation is more complicated because two EI variables are involved in significant interactions. I've never worked on a situation like yours, and I don't know that I have ever seen a message about such a situation. Perhaps others on the list have and can offer guidance and references.

Gene Maguin

Hayes, A. F., & Matthes, J. (2009). Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations. Behavior Research Methods, 41, 924-936. http://www.personal.psu.edu/jxb14/M554/articles/Hayes&Matthes2009.pdf

Holmbeck, G. N. (2002). Post-hoc probing of significant moderational and mediational effects in studies of pediatric populations. Journal of Pediatric Psychology, 27(1), 87-96. http://jpepsy.oxfordjournals.org/content/27/1/87.full.pdf

Bauer, D. J., & Curran, P. J. (2005). Probing interactions in fixed and multilevel regression: Inferential and graphical techniques. Multivariate Behavioral Research, 40(3), 373-400. http://www.unc.edu/~dbauer/manuscripts/bauer-curran-MBR-2005.pdf

Preacher, K. J., Curran, P. J., & Bauer, D. J. (2006). Computational tools for probing interaction effects in multiple linear regression, multilevel modeling, and latent curve analysis. Journal of Educational and Behavioral Statistics, 31(4), 437-448. http://www.unc.edu/~dbauer/manuscripts/preacher-curran-bauer-2006.pdf
http://www.people.ku.edu/~preacher/interact/index.html

Francoeur, R. B. (2011). Interpreting interactions of ordinal or continuous variables in moderated regression using the zero slope comparison: Tutorial, new extensions, and cancer symptom applications. International Journal of Society Systems Science, 3(1/2), 137-158.
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Probing-category-X-continuous-interaction-tp5719978.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
In reply to this post by Salbod
Stephen,

You want to test if the group-specific slopes of each x are significantly different from zero, correct? If yes, you need to remove the grand intercept from the model you posted, and the parameter estimates table will provide you with the 5 group-specific intercepts and, more importantly, the 5x3 = 15 slope tests you desire. The MIXED code would look like this:
MIXED y by group with x1 x2 x3
  /FIXED=group group*x1 group*x2 group*x3 | noint SSTYPE(3)
  /PRINT SOLUTION
  /METHOD=REML.

This is merely a different parameterization of the same model:
MIXED y by group with x1 x2
  /FIXED=group x1 x2 group*x1 group*x2 | SSTYPE(3)
  /PRINT SOLUTION
  /METHOD=REML.

If you wanted to employ the UNIANOVA procedure, then the following code would work:
UNIANOVA y BY group WITH x1 x2 x3
  /METHOD=SSTYPE(3)
  /INTERCEPT=EXCLUDE
  /PRINT=PARAMETER
  /CRITERIA=ALPHA(.05)
  /DESIGN=group group*x1 group*x2 group*x3.

If you wanted to employ the GLM procedure (which is what you posted), then the following code would work:
GLM y BY group WITH x1 x2 x3
  /INTERCEPT=EXCLUDE
  /DESIGN=group group*x1 group*x2 group*x3
  /PRINT=PARAMETER.

HTH,
Ryan
On Sun, May 5, 2013 at 2:53 PM, Salbod <[hidden email]> wrote: [...]
My fingers slipped and sent off the message before I was able to clean up what I posted. Stephen, please read this post instead...

Stephen,

You want to test if the group-specific slopes of each x are significantly different from zero, correct? If yes, you need to remove the grand intercept from the model you posted, and the parameter estimates table will provide you with the 5 group-specific intercepts and, more importantly, the 5x3 = 15 slope tests you desire. The MIXED code would look like this:
MIXED y by group with x1 x2 x3
  /FIXED=group group*x1 group*x2 group*x3 | noint SSTYPE(3)
  /PRINT SOLUTION
  /METHOD=REML.

This is merely a different parameterization of the same model:
MIXED y by group with x1 x2 x3
  /FIXED=group x1 x2 x3 group*x1 group*x2 group*x3 | SSTYPE(3)
  /PRINT SOLUTION
  /METHOD=REML.

If you wanted to employ the UNIANOVA procedure to test the group-specific slopes of each x, then the following code would work:
UNIANOVA y BY group WITH x1 x2 x3
  /METHOD=SSTYPE(3)
  /INTERCEPT=EXCLUDE
  /PRINT=PARAMETER
  /CRITERIA=ALPHA(.05)
  /DESIGN=group group*x1 group*x2 group*x3.

If you wanted to employ the GLM procedure (which is what you posted) to test the group-specific slopes of each x, then the following code would work:
GLM y BY group WITH x1 x2 x3
  /INTERCEPT=EXCLUDE
  /DESIGN=group group*x1 group*x2 group*x3
  /PRINT=PARAMETER.

HTH,
Ryan
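One way to convince yourself that the two parameterizations carry the same information: with a full set of group dummies and group*x terms, the model is saturated in group, so the joint fit reproduces what separate per-group simple regressions would give, and the reference-coded coefficients are just differences between those per-group estimates. A toy Python check with invented data for two groups:

```python
def ols_line(xs, ys):
    """Slope and intercept of a simple least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Toy data (invented numbers), two groups measured at the same x values.
g1_x, g1_y = [0, 1, 2, 3], [1.0, 1.5, 2.1, 2.4]   # reference group
g2_x, g2_y = [0, 1, 2, 3], [2.0, 3.1, 3.9, 5.2]   # dummy-coded group

# The NOINT (cell-means) parameterization reports these directly:
s1, i1 = ols_line(g1_x, g1_y)
s2, i2 = ols_line(g2_x, g2_y)

# The reference-coded parameterization reports group 1's values plus contrasts:
dummy_coef = i2 - i1          # coefficient on the group dummy
interaction_coef = s2 - s1    # coefficient on the group*x product term

print(f"group slopes: {s1:.2f} and {s2:.2f}; interaction contrast: {interaction_coef:.2f}")
```

Either way the same five intercepts and fifteen slopes are recoverable; the NOINT version simply hands you the group-specific tests without any adding-up.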
Stephen,

It is probably worth pointing out that what you are asking for are the simple effects of x1, x2, and x3 [within each level of group]. These simple effects are in terms of slopes. Often, it is also of interest to test the simple effects of group [across a range of values of each x (e.g., x1_min, x1_mean, x1_max), usually making sure not to extrapolate]. By simple effects of group, I am referring to mean differences between groups at specific values of each x.

Ryan
In reply to this post by Ryan
Ryan, Thank you so much; you have saved me from the dreaded recode-the-reference-group approach to testing simple slopes.
I’m sorry about the delay in responding to your reply. I had peeled another layer off this onion and discovered that the outcome variable was a proportion based on a checklist (45 items). Participants checked yes (= 1) to an item if they experienced it and no (= 0) if they didn’t. I could kick myself; I should have explored all the variables initially. I only discovered this when I was creating a reduced dataset (2 ethnicities and 2 Ethnic Identity measures) to play with your code. I know proportions present a problem for OLS regression. Is it possible that I could avoid switching to one of those other models (e.g., beta regression) if I used the sum as opposed to the mean?

--Steve
Steve,

Since the outcome is a proportion, fit a binomial logistic regression model using GENLIN, where the number endorsed is the numerator and the denominator is 45. Aside from that, you should set up the model as I showed you previously.

Why not just treat the numerator as the outcome and employ an OLS regression? For one, the outcome is on the closed interval [0, 45]. With covariates, you could end up with predicted values outside of that interval; that is, you could obtain y_hats < 0 and/or > 45, which are not interpretable, right?

There are all sorts of questions I would normally ask that could lead me to make entirely different suggestions (e.g., a different way to score the outcome, a different model), but no time.

Ryan
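The bounded-range point needs no data at all: the logit link maps any linear predictor into (0, 1), so fitted counts of 45*p cannot leave [0, 45], whereas a linear predictor used directly can. A Python illustration (nothing here is SPSS-specific; the OLS coefficients at the end are invented):

```python
import math

def inv_logit(eta):
    """Inverse logit: map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# However extreme the linear predictor gets, the fitted proportion stays
# strictly inside (0, 1), so the fitted count stays inside (0, 45).
for eta in (-10.0, -1.0, 0.0, 2.5, 10.0):
    p = inv_logit(eta)
    fitted = 45 * p
    assert 0.0 < p < 1.0 and 0.0 < fitted < 45.0
    print(f"eta = {eta:5.1f} -> p = {p:.4f} -> fitted count = {fitted:.2f}")

# By contrast, a linear model's fitted value b0 + b1*x is unbounded:
b0, b1 = 10.0, 4.0            # invented OLS coefficients
assert b0 + b1 * 10 > 45      # an x of 10 predicts an impossible count
```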
Hi Ryan,

Thank you for taking the time to provide a thoughtful explanation. I hear what you're saying: deal with the proportions. However, I’m not clear about the second paragraph, especially the last two sentences. I’ll probably come around to your thinking, but right now, something doesn’t seem right.

Warm regards,
Steve
Steve,

Run the syntax below my name, and take a close look at the predicted values from the OLS regression model. Curious: how would you interpret predicted values from the model that are less than 0 or greater than 45? Do you think OLS regression is appropriate?
Next, take a look at the values [on the original scale] predicted from the binomial logistic regression model. Notice how they all fall within the range of possible values. Here's additional information I posted a while back regarding logistic regression:
Clear now?

Ryan

P.S. The purpose of this demonstration, along with the link posted above, is not to imply that the binomial logistic regression model is necessarily optimal for your data, but it is expected to help you achieve clarity as to why OLS regression is often problematic in the presence of continuous predictors.
--

*Generate Data.
set seed 98765432.
new file.
input program.
loop ID = 1 to 1000.
compute x1 = rv.normal(0,1).
compute x2 = rv.normal(0,1).
compute #b0 = 1.5.
compute #b1 = 4.2.
compute #b2 = -0.5.
compute #eta = #b0 + #b1*x1 + #b2*x2.
compute #prob = exp(#eta) / (1 + exp(#eta)).
compute y = rv.binomial(45,#prob).
end case.
end loop.
end file.
end input program.
exe.

*Linear Regression Model.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT y
  /METHOD=ENTER x1 x2
  /SAVE PRED.

SORT CASES BY PRE_1(A).

* Binomial Logistic Regression Model.
GENLIN y OF 45 WITH x1 x2
  /MODEL x1 x2 INTERCEPT=YES DISTRIBUTION=BINOMIAL LINK=LOGIT
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION
  /SAVE MEANPRED.
On the one hand, a logistic link could be the most precise analysis available.
Once you get used to it, you may prefer the information it gives you. On the other hand, it could be overkill in terms of complication, and the hazard of predicting outside of the range, for a criterion that is a summative scale, is almost zero.

- Do you have R^2 over 0.75?
- Do you have a lot of observed scores that are near 0 and 45?

If both those conditions are true, you do want to use the logistic model. If your scores are near the middle of the range, you will probably be well served by using the actual score (or proportion) for the outcome in OLS. For mid-range scores, I think no one would criticize that choice.

If your scores are near one end, then (treating the populated end as zero) they are apt to be distributed rather as Poisson. For Poisson, an appropriate transformation for the criterion is to take the square root of the counts (or use a link function for that). Directly taking the square root leaves you with a simple OLS regression; the simplicity is the main thing to recommend that choice. It does get awkward again if you want to look at things like predicted values.

-- Rich Ulrich
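The square-root suggestion rests on a standard variance-stabilizing argument: a Poisson count has variance equal to its mean, but the square root of the count has variance close to 1/4 regardless of the mean. A simulation sketch in pure Python (the Poisson sampler is Knuth's multiplication algorithm; the seed is arbitrary):

```python
import math
import random

rng = random.Random(20130513)  # arbitrary seed for reproducibility

def poisson(lam):
    """Draw one Poisson(lam) variate (Knuth's multiplication algorithm)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def var(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# The raw variance grows with the mean; after a square-root transform the
# variance is roughly constant (about 1/4), which is what makes OLS on
# sqrt(counts) defensible for low-end counts.
for lam in (4, 9, 16):
    counts = [poisson(lam) for _ in range(5000)]
    roots = [math.sqrt(c) for c in counts]
    print(f"mean {lam:2d}: var(count) = {var(counts):5.2f}, var(sqrt) = {var(roots):.3f}")
```

This also shows why the transform misbehaves for proportions above one half: the counts there are no longer Poisson-like, so the de-weighting Rich describes is no longer matched to the actual variance.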
Hi Rich,

Thank you for pulling off the data. I’ve been following up on Ryan’s suggestion: use GENLIN to do binomial logistic regression. I’m new to GENLIN, so the procedure is all new (and confusing) to me. (Do you know any good resources?) The data I’m working with is an SPSS file from Qualtrics (a survey platform similar to Survey Monkey). It has not been finalized, but I do have enough of the data to work on. If anything, the dataset will be reduced.

The outcome variable, called remCount, has a mean = 12.3 (out of 45), skewness (SE) = 1.85 (.20), and kurtosis (SE) = 4.75 (.40). The proportions ranged from .11 to .78. I thought about the Poisson, but, at least to me, the data doesn’t look it. A question I have is how concerned I need to be about overdispersion if I go with the binomial logistic regression.

Regards,
Steve
You say your question is: how concerned do you need to be about overdispersion if you go with binomial LR? That's something you worry about when you see it, or when you know that you have an excess (say) of zeros. For these data, with no zeroes, it doesn't promise to be a problem.

I will mention a couple of guidelines for considering LR versus the alternatives I mentioned in my previous note. Proportions have pretty good homogeneity of error between 0.20 and 0.80; you have a bunch of data below 0.20, so the simplest model, the actual outcome score, has a little strike against it. That is further borne out by the coefficient of kurtosis -- I like to see it less than 0.4 or so.

The Poisson assumption, which would justify taking a square root, is nicest when all your proportions are less than, say, half. But your minimum is 0.11, your mean is 0.27, and your maximum is 0.78. Taking the square root will effectively "de-weight" the importance of all the higher scores. While the variance is increasing -- as you move toward 0.5 -- that is approximately proper. For scores above 0.50, it becomes increasingly improper. For your min and max (0.11, 0.78) and mean of 0.27, most of your scores must be less than 0.5 to keep the mean that low. I think I would look closer at the cases that are higher than 0.5 and decide whether it is safe to under-weight them.

-- Rich Ulrich
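The 0.20-0.80 guideline can be made concrete: the sampling variance of a proportion is p*(1-p)/n, which stays within about two-thirds of its maximum everywhere in [0.2, 0.8] but falls off quickly outside that band. A quick table for n = 45, the checklist length in this thread:

```python
# The sampling variance of a proportion of n items is p*(1-p)/n: nearly
# flat across the middle of the range, shrinking fast near the ends.
# This is the "homogeneity of error" being invoked above.
n = 45
for p in (0.05, 0.11, 0.20, 0.35, 0.50, 0.65, 0.80, 0.95):
    v = p * (1 - p) / n
    share_of_max = v / (0.25 / n)   # relative to the p = 0.5 maximum
    print(f"p = {p:.2f}: var = {v:.5f} ({share_of_max:5.1%} of maximum)")
```

At p = 0.11, the observed minimum, the variance is already down to well under half the mid-range value, which is the "little strike against" the raw-score model.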
In reply to this post by Rich Ulrich
Hi Rich,

Using GENLIN I did a binomial with a logit link (as Ryan Black suggested), and I got some interesting results (i.e., interactions), but I also had a deviance value = 6.56. As I understand it, anything above 2.0 is a problem. I don’t know what to make of it other than that it’s reflecting a problem with the mean/variance relationship.

I tried Poisson with a log link, and again I got some interesting results, but I also had a deviance value = 4.59. A negative binomial yielded a deviance value = .54, but the SEs increased substantially and the earlier findings disappeared.

I tried two transformations of the proportions: square root and arcsine(square root). The arcsine gave me a mean = .46, skew = .89, and kurtosis = 1.15. The square root was more extreme. Using the code Ryan Black provided, I analyzed the arcsine variable. None of the results were interesting.

I’m wondering: are the earlier findings an artifact of the high deviance value, or is it possible to adjust the binomial or Poisson for the high deviance value?

--Steve
Steve,

I just came across this post and the previous post. A few brief remarks:

By specifying SCALE=DEVIANCE or SCALE=PEARSON in the binomial logistic regression employed via GENLIN, you are accounting for overdispersion; the standard errors should be adjusted accordingly. Having said that, overdispersion can be indicative of an incorrectly specified model, due to issues such as excess zeros and/or correlated observations.
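To make the scale adjustment concrete, here is a hedged sketch of the arithmetic a quasi-likelihood scale correction performs: the scale estimate is deviance divided by its degrees of freedom, and each model-based standard error is multiplied by the square root of that estimate. The slope and SE below are invented; only the 6.56 comes from the thread:

```python
import math

# Hypothetical numbers: a deviance/df of 6.56 (as reported earlier in the
# thread) and an invented slope with an invented model-based SE.
deviance_over_df = 6.56        # the "Value/df" figure from the fit
b = 0.22                       # hypothetical slope estimate
naive_se = 0.08                # hypothetical model-based SE

# The scale correction inflates every SE by sqrt(deviance/df):
adjusted_se = naive_se * math.sqrt(deviance_over_df)

print(f"naive Wald z:    {b / naive_se:.2f}")
print(f"adjusted Wald z: {b / adjusted_se:.2f}")
# With a scale of 6.56 the SEs grow by a factor of about 2.56, so
# borderline effects can easily lose significance once the excess
# variance is acknowledged.
```

That pattern is consistent with the earlier observation that the "interesting" effects vanished once a model with larger standard errors was fit.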
If the event (where event = "success" = endorsement) is rare, with a relatively large n (where n = "number of trials" = the number of items on your checklist), then Poisson regression with an offset (where offset = log of n) will yield results similar to the binomial regression. Since your event is *not* rare with a relatively large n, the models will likely produce dissimilar results, including fitted values. Speaking of which, by specifying a Poisson regression you could end up with fitted values that surpass 45; recall that the Poisson distribution ranges from 0 to (+) infinity. The same goes for a negative binomial distribution. From what I have read thus far, I would be disinclined to recommend a Poisson or negative binomial regression.
I still have many questions that cannot be fully answered on SPSS-L that, as I said previously, could lead me to recommend a very different analytic strategy, potentially involving a psychometric evaluation of the responses to the items on this checklist. But I prefer not to elaborate further on this point.
If you decide to employ a binomial logistic regression (or whatever regression you decided to use, for that matter), be certain that the assumptions are tenable and that you are able to interpret the partial slopes.
Ryan