Generalized Linear Model - Moderation Analysis

analyze28
Hi all,

I'm really hoping that someone can help me with this as I'm reaching the end
of my tether!  I am conducting a moderation analysis on cross-sectional
data.  I have been advised to run a series of GZLM in order to achieve this.
This is as follows:

Model 1: All covariates
Model 2: Model 1 plus main effects
Model 3: Model 2 plus resilience
Model 4: Model 3 plus interaction between risk factors and resilience.

Now to begin with - I can run the GZLM up to model 4 (I think; well, I have
done so but have yet to interpret the results.  Any tips??).  The issue I am
having is with creating the interaction.  I know how to create an
interaction between, say, tea*biscuits.  However, I have several risk
factors, so it is about creating an interaction term between -
tea/coffee/water/soft drinks/green tea/black tea*biscuits.  Does anyone know
how to do this?  I've searched but cannot find anything to help direct me
in regards to this.

Further, my data consists of factors and covariates, so I don't know whether
this complicates things further.  I have saved the XBPredicted and
MeanPredicted within the model as well, I'm not sure if that will help.

I have conducted MI prior to running the analysis on the imputed data
(though my supervisor thinks I need to redo the missing data analysis now
with EM and then run the GZLM - advice?).

Any suggestions will be gratefully received.

Desperately yours...

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Generalized-Linear-Model-Moderation-Analysis-tp5714027.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Generalized Linear Model - Moderation Analysis

Poes, Matthew Joseph
See my comments below:

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of analyze28
Sent: Thursday, July 05, 2012 2:35 AM
To: [hidden email]
Subject: Generalized Linear Model - Moderation Analysis

Hi all,

I'm really hoping that someone can help me with this as I'm reaching the end of my tether!  I am conducting a moderation analysis on cross-sectional data.  I have been advised to run a series of GZLM in order to achieve this.
This is as follows:

Model 1: All covariates
Model 2: Model 1 plus main effects
*Main effects of what? Are you talking about adding in the factors now?  Keep in mind that the factors are simply the categorical variables in the model, which the program dummy codes when it runs.  Covariates can be true covariates or simply continuous predictors.  A main effect is simply the effect of a covariate or a categorical factor when everything else in the model is equal to zero.  You need to be certain that, in this scenario, zero reflects a relevant base case.  In other words, to get meaningful main effects, you need to center all of your covariates and use appropriate coding for your categorical factors.
Model 3: Model 2 plus resilience
*What type of variable is resilience?  If it is a continuous variable, it goes into the covariate section.  To argue that it explains additional variance in a meaningful way, you will want to show that the fit statistics went down when it was added, not just that it is a significant predictor in its own right.
Model 4: Model 3 plus interaction between risk factors and resilience.
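
The point about centering and coding in the comments above can be sketched in a few lines of Python. This is only an illustration under assumed data: the variable names (`age`, `drink`) and the choice of "water" as the reference level are hypothetical and do not come from the original poster's dataset.

```python
from statistics import mean

# Hypothetical data: one continuous covariate and one three-level factor.
age = [23.0, 31.0, 45.0, 52.0, 39.0]
drink = ["tea", "coffee", "water", "tea", "coffee"]

# Mean-centre the covariate so that 0 means "a person of average age"
# rather than an impossible age of zero.
age_mean = mean(age)
age_c = [a - age_mean for a in age]

# Dummy-code the factor with "water" as the (assumed) reference level:
# one 0/1 column per non-reference level, all zeros for the reference.
reference = "water"
levels = sorted(set(drink) - {reference})
dummies = {lvl: [1 if d == lvl else 0 for d in drink] for lvl in levels}

print("centred age:", age_c)
for lvl in levels:
    print(lvl, dummies[lvl])
```

With this coding, "everything else equal to zero" corresponds to a water drinker of average age, which is a base case that actually exists in the data.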

Now to begin with - I can run the GZLM up to model 4 (I think; well, I have done so but have yet to interpret the results.  Any tips??).  The issue I am having is with creating the interaction.  I know how to create an interaction between, say, tea*biscuits.  However, I have several risk factors, so it is about creating an interaction term between - tea/coffee/water/soft drinks/green tea/black tea*biscuits.  Does anyone know how to do this?  I've searched but cannot find anything to help direct me in regards to this.

*As for the interactions, they work the same as in any other model.  As in other models, adding lots of terms to your interactions quickly becomes complicated and difficult to interpret.  Anything more than a 3-way interaction is going to be very hard to interpret, and I can't imagine it giving you useful information either.  My recommendation is to first consider what research questions you are trying to answer and what theory underlies those questions, and then consider how the data can answer them.

*In terms of just creating an interaction with the term "biscuits", you simply create tea*biscuits, coffee*biscuits, water*biscuits, soft drinks*biscuits, green tea*biscuits, and black tea*biscuits, and include all of those in the model.  Remember again, though, that each coefficient is interpreted as the effect when everything else in the model is zero.  In the example above, if you have tea, biscuits, and tea*biscuits in the model, then your interpretation is the effect for tea but no biscuits, the effect for biscuits but no tea, and the change in the effect of tea when someone also has a biscuit.  Here is where things get confusing: when all of those variables are in the model, it becomes the effect for tea but no biscuits, coffee, water, soft drinks, etc.  Remember that the intercept is the value for your referent group, so all the effects are interpreted as differences from the referent group, not as absolute levels.  With interactions involving a continuous variable you have two things to consider: differences in intercept and differences in slope.
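
As a rough sketch of the "one product term per risk factor" idea described above, the following Python builds each risk-by-moderator column by hand. The data, the three risk factors shown, and the helper names (`centre`, `interaction_terms`, `tea_effect`) are all hypothetical; statistical packages can usually build these terms for you when an interaction is specified in the model.

```python
from statistics import mean

def centre(values):
    """Subtract the mean so that zero represents the average case."""
    m = mean(values)
    return [v - m for v in values]

# Hypothetical 0/1 risk indicators and a continuous moderator.
risk = {
    "tea":    [1, 0, 1, 0, 1, 0],
    "coffee": [0, 1, 1, 0, 0, 1],
    "water":  [1, 1, 0, 0, 1, 0],
}
biscuits_c = centre([2.0, 5.0, 3.0, 4.0, 1.0, 3.0])

def interaction_terms(risk_factors, moderator):
    """Return one product column per risk factor: risk * moderator."""
    return {
        f"{name}_x_biscuits": [r * m for r, m in zip(column, moderator)]
        for name, column in risk_factors.items()
    }

products = interaction_terms(risk, biscuits_c)

def tea_effect(b_tea, b_tea_x_biscuits, biscuits_value):
    """Effect of tea on the linear predictor at a given moderator value."""
    return b_tea + b_tea_x_biscuits * biscuits_value

for name, column in products.items():
    print(name, column)
```

The `tea_effect` helper makes the interpretation explicit: the coefficient for tea alone is the effect of tea where the (centred) moderator equals zero, and the product-term coefficient is how much that effect changes per unit of the moderator.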

Further, my data consists of factors and covariates, so I don't know whether this complicates things further.  I have saved the XBPredicted and MeanPredicted within the model as well, I'm not sure if that will help.
*Not really, as long as you understand fundamentally how to interpret the model.  Your saved values won't really be useful for this.

I have conducted MI prior to running the analysis on the imputed data (though my supervisor thinks I need to redo the missing data analysis now with EM and then run the GZLM - advice?).
*I'm not sure what you mean here by EM.  As for the MI, just be certain that the MI was appropriate to begin with.  I see a lot of people use it to handle missing data even though the nature of the missing data suggests that MI increased the bias of the estimates relative to listwise deletion.  You don't want to do that.  Also, if you have predictable missing data following a specific pattern (NMAR), I would suggest adding dummy variables flagging the cases that are missing on that NMAR variable.  You would do this for each of the variables that show NMAR properties, and include all of these dummy variables as covariates.


Re: Generalized Linear Model - Moderation Analysis

analyze28
In reply to this post by analyze28
Hi Matthew,

Thanks for your quick reply. I appreciate it, and it has helped to put things into context in my head.  With regards to the covariates and main effects in the models, I never thought to question it.  My supervisor informed me that this was what I needed to do and sent me the models as you have already seen below.  Having cleared my head, the "covariates" will be the factors and covariates in Model 1.

As for the main effects, I have already centred my variables and sorted my ordinal rating for the categorical variables so they are in the correct "direction."  Is it right to presume that I just select "main effects" in the drop down box and move them across to get this information?

The variable resilience is, at this moment in time, being deliberated.  I am intending to run it as a categorical and as a continuous variable in two separate analyses to see if this makes a difference in interpretation etc.

I understand the issue with the interactions and am prepared to handle the haystack in deciphering it all :-)

As to the output of the GZLM, are there any recommended sources for interpreting the output?  I've searched around but not found anything that gives clear guidance.  Everything tells me how to run it, and although I have a good understanding of statistics, I've not run a GZLM before, so I am not 100% sure how to interpret it, i.e. which aspects of the output I need to look at.

Re: Generalized Linear Model - Moderation Analysis

Ryan
Let's see if I can help a bit here...
 
In the context of generalized linear models, the dependent variable has a distribution (normal, Poisson, binomial, etc.) conditional upon the independent variable(s). Furthermore, there is a link function that specifies how the expected value of "Y" (a.k.a. "E(Y)") and the linear combination of independent variables, "eta", are related, where

eta = b0 + b1*X1 + b2*X2 + ... + bk*Xk
 
Suppose the distribution of your dependent variable is binomial. In this case, the expected value of "Y" can be defined as:

E(Y) = P(Y=1) = exp(eta) / (1 + exp(eta))

The canonical link for a binomial response is

log(p/(1-p)) = eta

which is known as the logit link, where

p = P(Y=1)

*Note: P(Y=0) could also be modeled.
 
By employing a logit link function, exp(eta) / (1 + exp(eta)) is constrained between 0 and 1, which makes sense since E(Y) is a probability.
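
To make the relationship between eta and the probability concrete, here is a small Python sketch of the logit link and its inverse. The function names are my own, and `1 / (1 + exp(-eta))` is simply an algebraically equivalent way of writing `exp(eta) / (1 + exp(eta))`.

```python
import math

def inv_logit(eta):
    """Map a linear predictor eta to a probability strictly between 0 and 1."""
    # Same quantity as exp(eta) / (1 + exp(eta)), written in an equivalent form.
    return 1.0 / (1.0 + math.exp(-eta))

def logit(p):
    """Log-odds of a probability p; the inverse of inv_logit."""
    return math.log(p / (1.0 - p))

# However large or small eta gets, the implied probability stays inside (0, 1),
# and applying logit() recovers the original eta.
for eta in (-5.0, -1.0, 0.0, 1.0, 5.0):
    p = inv_logit(eta)
    print(f"eta={eta:+.1f}  p={p:.4f}  logit(p)={logit(p):+.4f}")
```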
 
A generalized linear model with a binomial response and link logit is known as "binomial logistic regression." My guess is that the OP has heard of "binomial logistic regression" previously. Still, I felt it important for the OP to see how one arrives at the name.

Let's continue with a simple example...
 
In binomial logistic regression with only one continuous predictor, X1, the binomial logistic regression equation can be written as:

logit(p) = b0 + b1*X1

By exponentiating the estimated coefficient, b1, one obtains the estimated odds ratio for X1. If b1 were 1.52, for example, then exp(b1) = exp(1.52) = 4.57. Assuming we are modeling P(Y=1), the following interpretation can be made: with a 1-unit increase in X1, the odds that Y=1 are multiplied by 4.57. If you had more than one independent variable in the model, the coefficient and exp(coefficient) of X1 would be interpreted holding the other variables constant.
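
The odds-ratio interpretation can be checked numerically. In the sketch below, b1 = 1.52 is taken from the example above, while the intercept b0 = -1.0 is an arbitrary assumed value chosen only so that the code runs.

```python
import math

b0 = -1.0   # hypothetical intercept, not from the thread
b1 = 1.52   # slope from the example above

def odds(x1):
    """Odds that Y=1 at a given value of X1: exp(b0 + b1*X1)."""
    return math.exp(b0 + b1 * x1)

def prob(x1):
    """Probability that Y=1 at a given value of X1."""
    o = odds(x1)
    return o / (1.0 + o)

odds_ratio = math.exp(b1)

# The ratio of odds for a 1-unit step is the same wherever the step starts...
ratio_low = odds(1.0) / odds(0.0)
ratio_high = odds(3.0) / odds(2.0)

# ...but the change in probability for that step is not.
diff_low = prob(1.0) - prob(0.0)
diff_high = prob(3.0) - prob(2.0)

print(f"exp(b1) = {odds_ratio:.2f}")
print(f"odds ratios: {ratio_low:.2f}, {ratio_high:.2f}")
print(f"probability changes: {diff_low:.3f}, {diff_high:.3f}")
```

This is why the coefficient is described as multiplying the odds rather than adding a fixed amount to the probability.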
 
My point? The scenario I provided is just one out of many possibilities. To start, the OP should think about the distribution of the dependent variable conditional upon the independent variables, along with the appropriate link function.
 
HTH,
 
Ryan

Re: Generalized Linear Model - Moderation Analysis

analyze28
In reply to this post by analyze28
Hi,

At present I am using binomial logistic regression, as my dependent variable is a yes/no response.  So for this aspect, I believe that I am ok.  I do have a rudimentary understanding of generalized linear models, and am able to appreciate what the model is trying to do.  It is more interpreting the output that I wanted to be sure I was doing correctly, as I am unable to find an example to compare my results to in order to ascertain whether I have completely gone off the track or not.

With the missing data analysis, I have been reviewing this and have conducted Little's MCAR test.  My results for the three age groups have all shown significance.  I understand that this means that the data are not missing completely at random; however, I have also been told that using the multiple imputation method is not advisable, and to do Expectation Maximization instead.  However, with the results of Little's test, am I right in understanding that this is not possible?

Re: Generalized Linear Model - Moderation Analysis

Poes, Matthew Joseph
In reply to this post by analyze28
Generalized Linear Model is the generic term that SPSS uses for a set of model types which includes ANOVA, ANCOVA, etc., but modeled with various distributional assumptions, links, and, more importantly, estimation methods.  Interpretation, however, remains basically the same as for a typical ANOVA, ANCOVA, regression, etc.  Remember, all of these models are in the same family, so interpretation of the estimates doesn't change.  There is a set of goodness-of-fit information criteria (IC) statistics.  Pay attention to those, loosely.  They are most important when running a series of models as you are.  When you run the model in steps, as you described, you will want to see the ICs go down each time.  Note, however, that this is not always the case: in most scenarios I find that the AIC goes down as I add complexity to the model, but sometimes the BIC and CAIC go up.
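
The reason AIC can fall while BIC and CAIC rise is visible in the usual textbook formulas, sketched below in Python. The log-likelihoods, parameter counts, and sample size are made-up numbers for two nested models, and the formulas are the standard definitions rather than anything copied from a specific SPSS output.

```python
import math

def information_criteria(log_likelihood, n_params, n_cases):
    """Standard AIC, BIC and CAIC from a model's log-likelihood."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_cases),
        "CAIC": deviance + n_params * (math.log(n_cases) + 1),
    }

# Hypothetical step from Model 3 to Model 4: six interaction terms are added
# and the log-likelihood improves from -250 to -242, with 500 cases.
model3 = information_criteria(log_likelihood=-250.0, n_params=8, n_cases=500)
model4 = information_criteria(log_likelihood=-242.0, n_params=14, n_cases=500)

for name in ("AIC", "BIC", "CAIC"):
    change = model4[name] - model3[name]
    print(f"{name}: {model3[name]:.1f} -> {model4[name]:.1f} ({change:+.1f})")
```

BIC and CAIC charge roughly ln(n) per extra parameter instead of 2, so with several hundred cases a modest gain in fit can be enough to lower AIC but not the other two.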

Keep in mind that categorical variables in the model can complicate interpretation of the coefficients, especially when used in interactions.  Just make sure that all continuous variables have a meaningful zero point, in order to ensure your intercept is meaningfully interpretable, and the categorical variables and interactions will be much easier.

Oh, I don't believe I mentioned this: I highly recommend plotting interactions.  With categorical by continuous interactions, I like to take the estimates at +/- 1 SD of the continuous variable for each category and plot those together on a graph.  It will make interpretation, and visualization of the interpretation, much easier for everyone.  For any categorical by categorical interaction, just take the point estimate for each combination of values.  For continuous by continuous, just take the point estimates at +/- 1 SD of each variable.
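
A minimal sketch of how the points for such a plot could be computed for a logistic model with one 0/1 risk factor and a centred continuous moderator. The coefficients and resilience scores are invented for illustration; in practice they would come from the fitted model and the data, and the resulting points would be passed to whatever plotting tool you prefer.

```python
import math
from statistics import stdev

def inv_logit(eta):
    """Convert a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients for:
# logit(p) = b0 + b_risk*risk + b_res*res_c + b_int*risk*res_c
coef = {"b0": -0.5, "b_risk": 0.9, "b_res": -0.10, "b_int": -0.15}

# Hypothetical resilience scores; only their SD is needed because the
# model uses the centred variable (0 = mean resilience).
resilience = [10.0, 12.0, 15.0, 18.0, 20.0, 21.0, 16.0, 14.0]
sd = stdev(resilience)

def predicted_probability(risk, res_c):
    """Predicted P(Y=1) for a risk group at a centred resilience value."""
    eta = (coef["b0"] + coef["b_risk"] * risk
           + coef["b_res"] * res_c + coef["b_int"] * risk * res_c)
    return inv_logit(eta)

# One point per category at -1 SD, the mean, and +1 SD of resilience.
points = {}
for risk in (0, 1):
    for label, res_c in (("-1 SD", -sd), ("mean", 0.0), ("+1 SD", sd)):
        points[(risk, label)] = predicted_probability(risk, res_c)

for (risk, label), p in points.items():
    print(f"risk={risk}  resilience {label}: p = {p:.3f}")
```

Joining the three points for each risk group gives one line per category; lines that are not parallel are the visual signature of the interaction.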

I try not to over-interpret the omnibus tests myself, but this may depend greatly on the nature of your collective variables.  I would say that once you have established a significant omnibus test (assuming it's in the first model you run), later models will also be significant, so there is no point in interpreting those.  The best thing to do then is to look for ways to quantify the change in explained variance.  Standardized betas (i.e., standardized coefficients) are the best way to do that, along with a discussion of improved model fit.

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]



Re: Generalized Linear Model - Moderation Analysis

Ryan
In reply to this post by analyze28
You state: "It is more interpreting the output that I wanted to be sure I was doing correctly, as I am unable to find an example to compare my results to in order to ascertain whether I have completely gone off the track or not."
It isn't clear to me what you're asking. Are you asking how to interpret the fit indices? The omnibus tests? The Classification Table? Are you asking how to interpret the regression coefficients or the tests associated with the coefficients? There's plenty of output that SPSS produces after fitting a logistic regression model. Without specific questions, it is not possible for me to help much beyond referring you to either a textbook that covers logistic regression or perhaps a reputable website such as this one:
 
 
Ryan

Re: Generalized Linear Model - Moderation Analysis

analyze28
Hi,

Many thanks for that Matthew and Ryan. That has helped to clear up in my head how to go about this as I now have a better understanding. The link also helped. It seems that I was searching with the wrong keywords to get an example of what I needed. The plotting of interactions is also genius in giving clear comprehension, thank you for that tip!

The other fun that I have been having is with my EM on my missing data.  I ran Little's MCAR test and my results rejected the null hypothesis.  I conducted EM to see what would occur, but the result was still significant.  Am I correct in assuming that my only alternative now is to run multiple imputation?  My data analysis is cross-sectional.