I am interested in testing for change in proportion of a sample reporting one binary outcome, across four measurement occasions. There is no grouping independent variable - the only IV is change over time. Searching has thus far suggested that I am likely to need some form of Generalized Estimating Equation model, but all of the examples I see include a grouping factor as well as the repeated measures factor. Suggestions welcome.
Thank you,
George Tremblay
Antioch University New England
Have you tried adapting one of the examples you've found, including removal of any between-Ss variables? Does something like the following do (more or less) what you want?
* Generate a small data set to illustrate.
NEW FILE.
DATASET CLOSE all.
DATA LIST list / Y1 to Y4 (4F1).
BEGIN DATA
1 1 1 1
0 0 0 0
1 0 1 0
0 1 0 1
1 0 0 1
0 1 1 0
1 1 1 0
1 1 0 1
1 0 1 1
0 1 1 1
1 1 0 0
1 0 1 0
END DATA.

DESCRIPTIVES Y1 to Y4.
* With 1-0 variables, MEAN = proportion equal to 1.

* GENLIN wants a long file for repeated measures.
VARSTOCASES /ID=id
 /MAKE Y FROM Y1 Y2 Y3 Y4
 /INDEX=Time(4)
 /KEEP=
 /NULL=KEEP.

GENLIN Y (REFERENCE=FIRST) BY Time (ORDER = DESCENDING)
 /MODEL TIME INTERCEPT=YES
  DISTRIBUTION=BINOMIAL LINK=LOGIT
 /REPEATED SUBJECT=id WITHINSUBJECT=Time SORT=YES
  CORRTYPE=unstructured ADJUSTCORR=YES COVB=ROBUST
  MAXITERATIONS=100 PCONVERGE=1e-006(ABSOLUTE) UPDATECORR=1
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED)
 /SAVE MeanPred.

* Predicted probabilities from this model should match
* proportions = 1 for each time point, I think.
MEANS MeanPredicted by Time.

Note that I treated time as a categorical variable. You may be able to treat it as continuous, depending on how it was measured (so WITH Time rather than BY Time).

Perhaps this will help get you started.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
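The comment in the syntax above, that the mean of a 1-0 variable equals the proportion of 1s, can be checked by hand. The following is a minimal Python sketch, not part of the original SPSS syntax. It reuses the 12 toy cases from the DATA LIST block, and the function names (`proportions_by_time`, `wide_to_long`) are hypothetical helpers introduced only for illustration. The second helper loosely mimics what VARSTOCASES does when it restructures the file from wide to long.

```python
# Toy data copied from the DATA LIST block above: 12 cases, columns Y1..Y4.
rows = [
    (1, 1, 1, 1),
    (0, 0, 0, 0),
    (1, 0, 1, 0),
    (0, 1, 0, 1),
    (1, 0, 0, 1),
    (0, 1, 1, 0),
    (1, 1, 1, 0),
    (1, 1, 0, 1),
    (1, 0, 1, 1),
    (0, 1, 1, 1),
    (1, 1, 0, 0),
    (1, 0, 1, 0),
]


def proportions_by_time(data):
    """Mean of each 0/1 column, which is the proportion of 1s at that occasion."""
    n_cases = len(data)
    return [sum(column) / n_cases for column in zip(*data)]


def wide_to_long(data):
    """One (id, time, y) record per case and occasion, similar to VARSTOCASES."""
    return [
        (case_id, time, y)
        for case_id, row in enumerate(data, start=1)
        for time, y in enumerate(row, start=1)
    ]


props = proportions_by_time(rows)
long_rows = wide_to_long(rows)

for time, p in enumerate(props, start=1):
    print(f"Time {time}: proportion of 1s = {p:.3f}")
```

If the GENLIN model behaves as suggested in the post, the saved predicted means by Time should agree with these per-occasion proportions, but that is something to confirm in the SPSS output rather than take from this sketch.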
Thanks very much, Bruce. Your note that mean=proportion equal to 1 for binary (1 vs. 0) variables strikes me as brilliant!

An additional wrinkle is that I do not expect to be able to match cases across measurement occasions - I'll have N observations (1 vs. 0 counts) at each occasion, but the data is being collected by observers who do not know and therefore cannot identify the individuals involved. To illustrate with one scenario, observers will note how many students at a particular school are arriving on foot or bike, versus those arriving in a vehicle. The counting is feasible (these are small schools), but identifying the kids involved is not. So I'll have only aggregate proportions, rather than the case-level data matrix you envisioned.

On Sat, Mar 28, 2015 at 3:21 PM, Bruce Weaver [via SPSSX Discussion] wrote:
Have you tried adapting one of the examples you've found, including removal of any between-Ss variables? Does something like the following do (more or less) what you want?
George, I don't have time to respond right now. But I see that Nabble still shows "This post has NOT been accepted by the mailing list yet" for both of your posts.* So I am just responding now to try to ensure that all list members see your second post (in case others have time to jump in with suggestions).
* Have you subscribed to the actual mailing list? If not, you can do so via this link: http://spssx-discussion.1045642.n5.nabble.com/mailing_list/MailingListOptions.jtp?forum=1068821
--
Bruce Weaver
Well...I don't have a LOT of time, at least. You still have repeated observations from the same school, which means they are not independent (even if the observers are different on different occasions). I think the main thing you are getting at is that the data are of the X of N form--i.e., for each school and time point, you have X (the numerator) and N (the denominator). So perhaps you need to do something similar to what is shown in this (very off the cuff and not well tested) syntax:
NEW FILE.
DATASET CLOSE all.
DATA LIST list / School Time X N (4F5.0).
BEGIN DATA
1 1 25 100
1 2 30 100
1 3 21 99
1 4 20 97
2 1 50 200
2 2 55 198
2 3 68 201
2 4 59 200
3 1 10 50
3 2 12 54
3 3 14 53
3 4 11 49
END DATA.

GENLIN X of N BY Time (ORDER = DESCENDING)
 /MODEL TIME INTERCEPT=YES
  DISTRIBUTION=BINOMIAL LINK=LOGIT
 /REPEATED SUBJECT=School WITHINSUBJECT=Time SORT=YES
  CORRTYPE=unstructured ADJUSTCORR=YES COVB=ROBUST
  MAXITERATIONS=100 PCONVERGE=1e-006(ABSOLUTE) UPDATECORR=1
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED) .

The Exp(B) values in the table of coefficients give odds ratios for Times 4, 3 and 2 relative to Time 1.

HTH.

p.s. - When using X of N like this, I would call it a binomial (rather than binary) logistic regression.
--
Bruce Weaver
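To make the odds-ratio reading of Exp(B) concrete, here is a rough Python sketch that pools the X and N counts from the toy data above across schools and computes the odds at each time point relative to Time 1. It is an illustration only, and the helper names (`pooled_counts`, `odds`, `odds_ratios_vs_reference`) are hypothetical. Pooling ignores the within-school correlation that the GEE model accounts for, so the Exp(B) values from GENLIN need not match these descriptive figures exactly.

```python
# (School, Time, X, N) rows copied from the DATA LIST block above.
records = [
    (1, 1, 25, 100), (1, 2, 30, 100), (1, 3, 21, 99), (1, 4, 20, 97),
    (2, 1, 50, 200), (2, 2, 55, 198), (2, 3, 68, 201), (2, 4, 59, 200),
    (3, 1, 10, 50), (3, 2, 12, 54), (3, 3, 14, 53), (3, 4, 11, 49),
]


def pooled_counts(rows):
    """Sum events (X) and trials (N) over schools, separately for each time."""
    totals = {}
    for _school, time, x, n in rows:
        sum_x, sum_n = totals.get(time, (0, 0))
        totals[time] = (sum_x + x, sum_n + n)
    return totals


def odds(x, n):
    """Odds of the event: events divided by non-events."""
    return x / (n - x)


def odds_ratios_vs_reference(totals, reference=1):
    """Odds at each time divided by the odds at the reference time."""
    ref_odds = odds(*totals[reference])
    return {
        time: odds(x, n) / ref_odds
        for time, (x, n) in totals.items()
        if time != reference
    }


totals = pooled_counts(records)
ratios = odds_ratios_vs_reference(totals, reference=1)

for time in sorted(totals):
    x, n = totals[time]
    print(f"Time {time}: X = {x}, N = {n}, proportion = {x / n:.3f}")
for time in sorted(ratios):
    print(f"Time {time} vs Time 1: pooled odds ratio = {ratios[time]:.3f}")
```

An odds ratio above 1 means the odds of the outcome are higher at that time than at Time 1, and a value below 1 means they are lower, which is the same reading that applies to the Exp(B) column.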
Thank you again, Bruce - this is super helpful. You are exactly right, concerning the form of data I will have, and this syntax helps me envision a suitable framework for the analysis. Very gracious of you, especially given your time pressures.

On Sun, Mar 29, 2015 at 12:15 PM, Bruce Weaver [via SPSSX Discussion] wrote:
Well...I don't have a LOT of time, at least. You still have repeated observations from the same school, which means they are not independent (even if the observers are different on different occasions). I think the main thing you are getting at is that the data are of the X of N form--i.e., for each school and time point, you have X (the numerator) and N (the denominator). So perhaps you need to do something similar to what is shown in this (very off the cuff and not well tested) syntax: