I've used SPSS logistic regression extensively. But I have a data set with a dichotomous outcome (lab test result) and some individual patient risk factors that could be predictors. I have about 100+ clinic sites and 30K test result records. Given that this is a sample of clinics, I believed the correct approach would be to use GENLINMIXED and include a random effect for clinic, instead of logistic regression.

I have 2 questions--and will start with the simpler one:

Should I be using GENLINMIXED and specifying clinic as a random effect, as follows (for looking at race/ethnicity (nominal) and test result)? Or should I be using a different procedure?

GENLINMIXED
  /DATA_STRUCTURE SUBJECTS=clinic*patientID
  /FIELDS TARGET=testresult TRIALS=NONE OFFSET=NONE
  /TARGET_OPTIONS DISTRIBUTION=BINOMIAL LINK=LOGIT
  /FIXED EFFECTS=racethnicity USE_INTERCEPT=TRUE
  /RANDOM USE_INTERCEPT=TRUE SUBJECTS=clinic COVARIANCE_TYPE=VARIANCE_COMPONENTS
  /BUILD_OPTIONS TARGET_CATEGORY_ORDER=DESCENDING INPUTS_CATEGORY_ORDER=DESCENDING MAX_ITERATIONS=100 CONFIDENCE_LEVEL=95 DF_METHOD=RESIDUAL COVB=MODEL
  /EMMEANS_OPTIONS SCALE=ORIGINAL PADJUST=LSD.

Here's the 2nd question. The data set and analysis are actually a bit more complicated. This is a multi-level analysis where, besides patient-level measures, I have some clinic characteristics as well as some areal measures for each patient based on their residence's zip code.

So when doing bivariate analyses to assess the impact of patient- or clinic-level characteristics, I have included a random effect for clinic. When looking at areal measures (SES, pop density, etc.) in bivariate analyses, I have used ZIPcode as the random effect. I believe this is the correct way to go (but welcome feedback).

But now that I want to do multivariate runs looking at individual, clinic and areal predictors, I do not see how I can include 2 random effect measures (clinic and zip) using GENLINMIXED. Should I be using a different SPSS procedure, or what am I missing in terms of syntax within GENLINMIXED?

Thanks
dave

PS--this is my first post (and a long one). If I've done something wrong in terms of posting, please let me know and I will modify re: future posts.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
|
David,

Generalized linear mixed model procedures, such as GENLINMIXED, are designed to model data whose distribution, conditional upon normally distributed random effects, belongs to the exponential family. Your data appear to fall under that umbrella. Admittedly, some of your terminology (e.g., where you use the terms bivariate and multivariate) is a bit confusing to me, but I will simply chalk it up to semantics.
I see merit in treating both clinic and patient zip code as random effects. Whether it is possible to do so using the GENLINMIXED procedure is something you will have to test. (I would use the GLIMMIX procedure in SAS for this type of analysis due to its ability to employ a superior estimation method, and if that proved too computationally intensive, I would likely use the MCMC procedure in SAS.) If GENLINMIXED permits it, you should add a second RANDOM subcommand where the subject is "zipcode."
Side note: Since I can't see your dataset, let me just say that your subject identification specification should give each subject a unique identifier. For example, you don't want GENLINMIXED to treat the first person in the first zipcode as the same person as the first person in the second zipcode. I don't think that will happen with your clinic*patientID subject specification, but make sure that is true. If I were you, I would just create a unique identifier for each subject that starts at 1 and ends at N, the number of subjects.
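The running identifier suggested above can be sketched in a few lines. This is a minimal Python illustration rather than SPSS syntax. The field names `clinic` and `patientID` come from the original post; the function name `assign_unique_ids` and the sample records are hypothetical. It assumes that each distinct (clinic, patientID) pair is one person, which would not hold if the same patient carries different IDs at different clinics.

```python
def assign_unique_ids(records):
    """Return copies of the records with a 'subject_id' running from 1 to N.

    Assumption: each distinct (clinic, patientID) pair is one subject.
    """
    ids = {}  # maps (clinic, patientID) -> running subject number
    out = []
    for rec in records:
        key = (rec["clinic"], rec["patientID"])
        if key not in ids:
            ids[key] = len(ids) + 1  # next unused number
        out.append({**rec, "subject_id": ids[key]})
    return out


# Hypothetical records: patientID 1 is reused by two different clinics.
records = [
    {"clinic": "A", "patientID": 1},
    {"clinic": "B", "patientID": 1},
    {"clinic": "A", "patientID": 1},  # repeat test for the same patient
    {"clinic": "A", "patientID": 2},
]

for row in assign_unique_ids(records):
    print(row["clinic"], row["patientID"], row["subject_id"])
```

The repeat test keeps the number already assigned to that patient, while the reused patientID at clinic B gets a new one, so no two people share an identifier.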
With the inclusion of the second RANDOM statement, it would be acceptable to add fixed-effect predictors at the patient level, the clinic level, and now the patient zip code level.
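For reference, the model described in this reply (a random intercept for clinic plus a random intercept for patient zip code) can be written out as below. This is a sketch in generic notation: the symbols are not from the thread, and the two random effects are assumed to be independent and crossed rather than nested.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(y_{ijk} = 1 \mid u_j, v_k)
  &= \beta_0 + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + u_j + v_k, \\
u_j &\sim N(0, \sigma_u^2), \qquad v_k \sim N(0, \sigma_v^2)
\end{aligned}
```

Here $i$ indexes test records, $j$ clinics, and $k$ zip codes; $\mathbf{x}_{ijk}$ collects the patient-level, clinic-level, and zip-code-level fixed-effect predictors, and $\sigma_u^2$ and $\sigma_v^2$ are the two variance components.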
Good luck.
Ryan

On Sat, Apr 6, 2013 at 1:05 PM, David Fine <[hidden email]> wrote:
I've used spss logistic regression extensively. But, I have a data set |
It just hit me... When speaking about unique identification, I was actually referring to the possibility of 2 or more subjects in the same zipcode inadvertently having the same ID #. Bottom line is to make sure subjects are not being mixed up.
Best,
Ryan
|
In reply to this post by David Fine
Yeah, multilevel analysis with

Level 1: Y = b0 + b1*race + e
Level 2: b0 = g0 + ??? ... ?? + r

Is the patient ID specific to the patient across clinics, or might patients (enough of them to matter) have gone to multiple clinics, thereby acquiring multiple patient IDs? Would you know if they did?

Race, coded as percent of clinic patients, could be a level 2 predictor. You could evaluate whether the level 1 b1 coefficient is fixed or random, i.e., whether the odds ratio of race and test result varies across clinics.

Zipcodes make for a much more complicated model, I think, because clinics might well serve patients from multiple zipcodes, and patients from a given zipcode share the zipcode-level data.

I'm curious: what is (are) the question(s) you want to answer using this dataset?

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Fine
Sent: Saturday, April 06, 2013 1:06 PM
To: [hidden email]
Subject: Genlinmixed
|
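Gene's suggestion of race coded as a percent of clinic patients amounts to aggregating within clinic and attaching the result to every record. The sketch below is a hedged Python illustration, not SPSS syntax. The field names `clinic` and `racethnicity` come from the original post; the function name `add_clinic_proportion`, the output field `clinic_prop`, and the category labels are hypothetical. It treats each record as one patient, so data with repeat tests per patient would need de-duplicating first.

```python
from collections import defaultdict


def add_clinic_proportion(records, category):
    """Attach to each record the share of its clinic's records in `category`.

    Assumption: one record per patient (no de-duplication is done here).
    """
    totals = defaultdict(int)  # records per clinic
    hits = defaultdict(int)    # records per clinic in the chosen category
    for rec in records:
        totals[rec["clinic"]] += 1
        if rec["racethnicity"] == category:
            hits[rec["clinic"]] += 1
    return [
        {**rec, "clinic_prop": hits[rec["clinic"]] / totals[rec["clinic"]]}
        for rec in records
    ]


# Hypothetical records with made-up category labels.
records = [
    {"clinic": "A", "racethnicity": "group1"},
    {"clinic": "A", "racethnicity": "group2"},
    {"clinic": "A", "racethnicity": "group1"},
    {"clinic": "B", "racethnicity": "group1"},
    {"clinic": "B", "racethnicity": "group2"},
]

for row in add_clinic_proportion(records, "group1"):
    print(row["clinic"], row["racethnicity"], round(row["clinic_prop"], 3))
```

Every record in a clinic carries the same `clinic_prop` value, which is what makes it a level 2 (clinic-level) predictor alongside the individual-level race indicator.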
Gene et al.,

Ans: it depends. Some agencies with multiple clinics use a unique patient ID across sites. But in general we cannot determine patient mobility. The record of interest in these data is an STD test (chlamydia) rather than a patient-level record. Also, in general there are relatively few patients with multiple test visits per year (based on some other work).

Yes. Similarly, there are individual and clinic-level analogues for a few other measures, e.g., age. |