I am using a repeated measures, negative binomial generalized estimating equation (GEE) to analyze my study where I counted animals on forest transects that sample 2 habitats. The habitats were not evenly sampled (it wasn't possible) so I have an offset variable of area sampled.
This works fine for the GEE, but the estimated marginal means (EMM) contrast does not seem to incorporate the offset (area sampled): it appears to test for differences in the average number of individuals sighted per transect, with no consideration of area sampled. When I convert the average number of individuals to a density (EMM of average number of individuals / average area sampled), the relationship between the two habitats often reverses (the smaller EMM becomes the larger one when expressed as individuals/area). Is there a way to incorporate an offset variable into the SPSS GEE EMM statement? Or am I restricted to reporting the GEE results and the simple densities (EMMs/area) without doing pairwise contrasts between them? Thanks. |
Are you looking for the OFFSET keyword on the MODEL subcommand?
http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/topic/com.ibm.spss.statistics.help/syn_genlin_model.htm
Alex |
Offset wouldn't be area specific; it's a constant across everything. Basically, it's like shifting the base or intercept value.
Matthew J Poes
Research Data Specialist, Center for Prevention Research and Development, University of Illinois
|
In GENLIN, you can fix OFFSET at a number
that's constant across everything, *or* you can specify a variable that
has area-specific offsets.
Alex |
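For readers following along, the role of a per-observation offset under a log link can be written out. This is only a sketch with illustrative symbols that do not appear in the thread: y_i is the count on transect replicate i, A_i is the area sampled, and x_i is a 0/1 habitat indicator.

```latex
\log \mathrm{E}[y_i] = \log A_i + \beta_0 + \beta_1 x_i
\quad\Longleftrightarrow\quad
\frac{\mathrm{E}[y_i]}{A_i} = \exp(\beta_0 + \beta_1 x_i)
```

Under these assumptions the coefficient of log A_i is fixed at 1, so exp(beta_0) is a density per unit area and exp(beta_1) is the ratio of densities between the two habitats.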
Thank you for the responses.
I have a separate variable for the offset, as it varies on a sample-by-sample basis. I included it this way:
/MODEL Habitat INTERCEPT=YES OFFSET=LOGareasampled
The parameter Habitat is significant in the model effects, and the EMMs output then gives me a significant difference with:
Habitat A: 0.54 (SE 0.071)
Habitat B: 0.16 (SE 0.067)
Means are reported for the response, so this is the average number of individuals per transect replicate in each habitat, unless I am missing something. The problem is that A and B were not evenly sampled, so when I divide the average number of individuals per habitat above by the average area sampled per habitat to get individuals per square kilometer, I end up with:
Habitat A: 6.30 individuals/km2
Habitat B: 11.01 individuals/km2
In effect, the relationship between the habitats has been flipped. This makes me think that the EMMs test is not accounting for the offset. Any ideas? I am stuck. Thanks again. |
I can't say I really understand your study, but the general question as to whether the estimated means from the EMMEANS sub-command of the GENLIN procedure take into account an offset is worth exploring. Consequently, I decided to generate data from a negative binomial regression with a single categorical (dichotomous) predictor and an offset, along with a random intercept to incorporate within-subject correlation. I then fit a GEE negative binomial model using the GENLIN procedure on the simulated data. What did I find? The EMMEANS sub-command does in fact take into account the offset.
For those interested, the SPSS syntax used to perform this simulation experiment is provided below.
Ryan
--
*Generate Data.
set seed 895795432.
new file.
inp pro.
comp ID=-99.
comp mean_habitat1 = -99.
comp mean_habitat2 = -99.
comp habitat = -99.
comp offset_1=-99.
comp rand_eff = -99.
leave ID to rand_eff.
* One random intercept per subject induces within-subject correlation.
loop ID= 1 to 100000.
comp rand_eff = sqrt(0.025)*rv.normal(0,1).
loop time = 1 to 2.
comp mean_habitat1 = 10.0.
comp mean_habitat2 = 20.0.
comp b0 = ln(mean_habitat2).
comp b1 = ln(mean_habitat1) - ln(mean_habitat2).
comp habitat = rv.bernoulli(0.50).
comp offset_1=rnd(rv.uniform(10,100)).
comp ln_offset_1 = ln(offset_1).
* The natural log of the offset enters the linear predictor with coefficient 1.
comp lambda = exp(b0 + b1*(habitat=0) + ln_offset_1 + rand_eff).
* A gamma-mixed Poisson draw yields a negative binomial response.
comp shape = 0.8.
comp dispersion = 1 / shape.
comp scale = lambda / shape.
comp mean = rv.gamma(shape, 1/scale).
comp y = rv.poisson(mean).
end case.
end loop.
end loop.
end file.
end inp pro.
exe.
delete variables mean_habitat1 mean_habitat2 b0 b1 lambda shape scale dispersion mean rand_eff.
* Generalized Estimating Equations.
GENLIN y BY habitat (ORDER=ASCENDING)
  /MODEL habitat INTERCEPT=YES OFFSET=ln_offset_1 DISTRIBUTION=NEGBIN(MLE) LINK=LOG
  /EMMEANS TABLES=habitat SCALE=ORIGINAL COMPARE=habitat CONTRAST=PAIRWISE PADJUST=LSD
  /REPEATED SUBJECT=ID WITHINSUBJECT=time SORT=YES CORRTYPE=EXCHANGEABLE ADJUSTCORR=YES COVB=ROBUST
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).
* Raw rates (count per unit of offset) by habitat, for comparison with the EMMs.
COMPUTE y_offset=y/offset_1.
EXECUTE.
EXAMINE VARIABLES=y_offset BY habitat
  /PLOT NONE
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
|
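The data-generating step of the simulation above can be mirrored outside SPSS as a rough check on what "count per unit of offset" should look like. The Python sketch below is not Ryan's code and does not fit a GEE; it only generates gamma-mixed Poisson counts with a subject-level random intercept and compares raw rates by habitat. The densities (1.0 and 2.0), the area range, the sample size, and the helper names are all assumptions chosen to keep the example small.

```python
import math
import random

def poisson_draw(rng, lam):
    """Poisson variate via Knuth's method, applied in chunks of at most 30.

    A sum of independent Poisson draws is Poisson, so chunking avoids
    underflow of exp(-lam) when a gamma draw produces a large mean.
    """
    total = 0
    while lam > 0:
        chunk = min(lam, 30.0)
        limit, prod = math.exp(-chunk), rng.random()
        while prod > limit:
            total += 1
            prod *= rng.random()
        lam -= chunk
    return total

def simulate(n_subjects=5000, seed=12345):
    """Return the mean of y/area per habitat for simulated data."""
    rng = random.Random(seed)
    density = {0: 1.0, 1: 2.0}   # assumed true densities per unit area
    shape = 0.8                  # gamma shape, as in the SPSS syntax
    rates = {0: [], 1: []}
    for _ in range(n_subjects):
        rand_eff = rng.gauss(0.0, math.sqrt(0.025))  # subject random intercept
        for _ in range(2):                           # two time points
            habitat = rng.randint(0, 1)
            area = rng.randint(1, 10)                # assumed area sampled
            lam = area * density[habitat] * math.exp(rand_eff)
            mean = rng.gammavariate(shape, lam / shape)  # gamma with mean lam
            y = poisson_draw(rng, mean)
            rates[habitat].append(y / area)
    return {h: sum(v) / len(v) for h, v in rates.items()}

observed = simulate()
print(observed)
```

Under these assumptions the two printed rates should land near the assumed densities (slightly above them, because the random intercept inflates the marginal mean), which is the quantity the raw y/offset comparison in the SPSS run is meant to check against.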
Ryan,
Thanks for running a test to verify that it works properly.
I have solved my problem. My original data are in Excel, where I did my log transformations. I used "lg()" to transform the data, mistakenly thinking that it returned the natural log (which is "ln()"), when "lg()" actually returns the base-10 log in Excel. As I imagine most of you are already aware, for the offset to work it must be the natural log of the variable used as the offset (in this case, area sampled, to give me individuals per square kilometer).
Thanks again to everyone who read my post and especially to those who replied. |
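To see why the base of the logarithm matters: with a log link the fitted mean is exp(linear predictor + offset), so only a natural-log offset makes the mean proportional to area. The Python sketch below illustrates this with made-up numbers (the density of 5 per km2 and the areas are assumptions, not data from this thread).

```python
import math

def expected_count(eta, offset):
    """Mean of a log-link count model: exp(linear predictor + offset)."""
    return math.exp(eta + offset)

density = 5.0            # assumed true density, individuals per km2
eta = math.log(density)  # linear predictor on the log scale

for area in (0.05, 0.5, 5.0):  # assumed areas sampled, km2
    with_ln = expected_count(eta, math.log(area))       # equals density * area
    with_log10 = expected_count(eta, math.log10(area))  # density * area**(1/ln 10)
    print(f"area={area:5.2f}  ln offset: {with_ln:8.3f}  log10 offset: {with_log10:8.3f}")
```

With a base-10 offset the implied mean scales as area raised to the power 1/ln(10), roughly 0.43, rather than with area itself, which distorts comparisons between habitats that were sampled unevenly.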
As a general guideline, one should first estimate the coefficient of ln(offset); that is, you should enter ln(offset) as a covariate first. If the coefficient of ln(offset) is near 1.0 (taking into account the standard error) AND it makes intuitive sense to treat it as an offset, then it is probably safe to treat it as such. Put another way, by fixing the coefficient to 1.0, you are assuming that the response is directly proportional to the offset (McCullagh & Nelder, 1989).
Ryan
REFERENCE:
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. London: Chapman and Hall. |
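The guideline above (enter ln(offset) as a covariate and see whether its coefficient is near 1.0) can be tried on simulated data. The Python sketch below is a simplification: it uses a plain Poisson model rather than a negative binomial GEE, fits it with a hand-written Newton-Raphson routine, and uses made-up values (a density of 2 per unit area, areas between 1 and 20). The function names are illustrative, not part of any library.

```python
import math
import random

def poisson_draw(rng, lam):
    """Poisson variate via Knuth's method; adequate for the small means here."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def fit_poisson_loglinear(x, y, iters=25):
    """Newton-Raphson for log E[y] = b0 + b1*x; returns (b0, b1, se_b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)  # from the inverse information matrix
    return b0, b1, se_b1

rng = random.Random(2012)
areas = [rng.uniform(1.0, 20.0) for _ in range(1000)]  # assumed areas sampled
counts = [poisson_draw(rng, 2.0 * a) for a in areas]   # mean proportional to area
log_area = [math.log(a) for a in areas]

b0, b1, se_b1 = fit_poisson_loglinear(log_area, counts)
print(f"coefficient of ln(area): {b1:.3f} (SE {se_b1:.3f})")
```

Because the counts were generated with a mean directly proportional to area, the estimated coefficient should sit close to 1.0 relative to its standard error; in that situation fixing it at 1.0 by using ln(area) as an offset is consistent with the data.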