Hi!
I want to know whether it makes sense to use PSM to estimate the effect of an intervention *given* the variables I have available to model the probability of receiving the treatment. The intervention is a program intended to increase college-going success, similar to an Advanced Placement course. I have data for gender, race, SES, special ed and limited English proficiency designations, and math and reading achievement scores before the treatment. I also have data for school, school district, and urbanicity. Data for outcome variables of interest include enrollment in college and college GPA.
One goal of PSM is to find a matched control group that can be compared to the treatment group. I have read that the best way to do this is to have variables that are associated with both the selection criteria for the treatment and the outcomes of interest. In my case, while prior achievement scores are likely to predict the outcome, I do not know the selection criteria and therefore I don't know the relationship between the available data and the selection criteria. The reason some students find themselves in the treatment is unknown, and might or might not be related to the variables I have available. I believe that matching on the available demographics and achievement scores is better than not matching. But I don't know whether it makes sense to match on variables whose relationship to the selection criteria is unknown.
Put another way, imagine I had data on ice cream consumption and that this consumption is not associated with receiving the treatment aimed at improving college success. Also imagine, albeit a bit far-fetched, that consuming ice cream is associated with the outcome variables (maybe sugar increases college success?). Would it still make sense to match on ice cream consumption even though it is not related to the selection process for receiving the treatment? Any and all ideas would be appreciated! |
It sounds like you may not have been informed as to whether the data are a) for those students who were selected to receive an offer of participation, some of whom subsequently elected to participate, or b) for all students, some of whom received a participation offer. Apparently you have a variable recording which students participated in the intervention. You could run a model for intervention participation. If a) applies, I'd expect that the resulting comparison group would be (much) more comparable to the intervention group than if b) applies. Comparable, meaning that the two groups differ on few or none of the available covariates.
Gene Maguin
-----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of dcaudio Sent: Wednesday, April 15, 2015 9:36 PM To: [hidden email] Subject: Ideal variables to estimate propensity scores
-- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Ideal-variables-to-estimate-propensity-scores-tp5729234.html Sent from the SPSSX Discussion mailing list archive at Nabble.com.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Hi Maguin, Thanks for your response. A couple of follow-up thoughts: I do not know whether particular participants received an offer. Nor do I know whether there were criteria for making that offer (e.g., above a test score threshold, GPA, or an application process). Maybe all students were made offers. I just don't know whether your A or B was indeed the case. I can see that A would make participation somewhat closer to random, and therefore participants and non-participants would be more similar than in the B scenario. But is it possible to look at the observed data to see which scenario was more likely? I also presume you mean that comparing covariates before matching on propensity scores would give an (albeit loose) indication of selection criteria, right?
On Thu, Apr 16, 2015 at 9:30 AM, Maguin, Eugene [via SPSSX Discussion] <[hidden email]> wrote: |
>>… possible to look at the observed data to see which scenario was more likely?
If all you have are the data elements you listed and the intervention status variable scored as “in” vs. “not in”, I think the answer is probably yes. You could do t-tests and crosstabs to compare the two groups across the variable set. You could also do a logistic regression for a multivariate look (notice that this is PSM step 1). If there were no or only small between-group differences, I’d guess that the data were for a selected sample. If not, then probably a total student sample.
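The group comparison described above could be sketched roughly as follows. This is only an illustration in plain Python (not SPSS syntax), and every number and variable name in it is made up: `treated` and `untreated` stand in for prior math scores of participants and non-participants. Alongside Welch's t statistic it computes a standardized mean difference, which is a commonly used balance measure in PSM work but was not mentioned in the thread, and a simple count-based crosstab for a categorical covariate.

```python
import math
import statistics
from collections import Counter

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

def crosstab(group, category):
    """Cell counts for each (group, category) combination."""
    return Counter(zip(group, category))

# Hypothetical prior math scores, for illustration only.
treated = [52.0, 61.0, 58.0, 66.0, 70.0, 63.0]
untreated = [48.0, 55.0, 50.0, 59.0, 47.0, 53.0, 51.0, 56.0]

t_stat = welch_t(treated, untreated)
smd = standardized_mean_difference(treated, untreated)
print(f"Welch t = {t_stat:.2f}, SMD = {smd:.2f}")

# Hypothetical participation status and gender codes, for illustration only.
status = ["in", "in", "in", "not in", "not in", "not in", "not in"]
gender = ["F", "M", "F", "M", "M", "F", "M"]
table = crosstab(status, gender)
print(dict(table))
```

Large differences on many covariates would point toward a total student sample; small differences across the board would be more consistent with a pre-selected sample, as suggested above.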
>>I also presume you mean comparing covariates before matching on propensity scores would give a (albeit loose) indication of selection criteria, right?
As I understand the information and data that you have, you can either compare treated to all not-treated students or use some sort of matching method, of which PSM is the leading candidate. The better the prediction equation, the more similar the treated group will be to the matched comparison group on the covariates.
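For readers who want to see the two PSM steps concretely (a logistic model for participation, then matching on the predicted probability), here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the procedure anyone in this thread used: the data are simulated, the covariate names (`math_z`, `low_ses`) and coefficients are invented, the logistic model is fit by simple gradient descent rather than a statistics package, and the matching is greedy 1:1 nearest-neighbor without replacement using an arbitrary caliper of 0.1.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear(w, x):
    """Intercept w[0] plus the weighted sum of the covariates."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient descent."""
    n = len(X)
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, yi in zip(X, y):
            err = sigmoid(linear(w, x)) - yi
            grad[0] += err
            for j, xj in enumerate(x, start=1):
                grad[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def match_nearest(scores, treated, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on the score, without replacement."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not controls:
            continue
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)  # each control is used at most once
    return pairs

# Simulated students (hypothetical): standardized math score and a low-SES flag.
random.seed(0)
X, treated = [], []
for _ in range(200):
    math_z = random.gauss(0, 1)
    low_ses = 1.0 if random.random() < 0.4 else 0.0
    p = sigmoid(-1.0 + 0.8 * math_z - 0.5 * low_ses)  # assumed selection process
    X.append([math_z, low_ses])
    treated.append(1 if random.random() < p else 0)

w = fit_logistic(X, treated)                  # step 1: model participation
scores = [sigmoid(linear(w, x)) for x in X]   # estimated propensity scores
pairs = match_nearest(scores, treated)        # step 2: match on the scores
print(f"{sum(treated)} treated, {len(pairs)} matched pairs")
```

After matching, the covariate comparison (t-tests, crosstabs) would normally be repeated on the matched pairs to check whether the groups have actually become more similar.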
Gene Maguin |
Hi again Gene, Yes, I plan to use PSM to compare those treated to a matched group. My basic question from the original thread was: what makes for ideal or good variables in the equation used to form the comparison group? Thanks again for your help. On Thu, Apr 16, 2015 at 1:00 PM, Maguin, Eugene [via SPSSX Discussion] <[hidden email]> wrote:
|
I encourage you to do some background reading. One recent book is by Shenyang Guo. There are others, as well as articles. PSM is an evolving area with quite a few moving parts, and there are software decisions to consider. That said, more variables are better than fewer because you are not trying to find the best set of predictors but, rather, the best matches.
The project is particularly difficult because you don’t know anything about its design, in particular whether there were eligibility criteria and whether all or only some eligible students were offered participation. Furthermore, as in most, maybe all, PSM analyses, you don’t know anything about the participation decision process itself.
Every set of decisions you make about which variables to include, whether to include interactions, how to do the matching, whether to include the propensity score as a predictor in the analysis, etc., defines A result, not THE result. Consistency in results across defensible variations in the PSM process would increase confidence in your results.
Gene Maguin
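The point about each set of decisions defining "A result, not THE result" can be made concrete with a small sensitivity loop. The sketch below is a hypothetical illustration in plain Python: the propensity scores, treatment flags, and GPA-like outcomes are all simulated, and the only decisions varied are the caliper width and whether controls may be reused. A real analysis would also vary the covariate set and the model form.

```python
import random
import statistics

def match_pairs(scores, treated, caliper, replace):
    """Greedy nearest-neighbor matching on the score, with or without replacement."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not controls:
            continue
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            if not replace:
                controls.remove(j)
    return pairs

def att(pairs, outcome):
    """Mean treated-minus-control outcome difference over the matched pairs."""
    if not pairs:
        return float("nan")
    return statistics.mean([outcome[i] - outcome[j] for i, j in pairs])

# Simulated data (hypothetical): scores stand in for estimated propensity scores,
# and the outcome is built with an assumed treatment effect of 0.3.
random.seed(1)
n = 300
scores = [random.uniform(0.05, 0.95) for _ in range(n)]
treated = [1 if random.random() < s else 0 for s in scores]
outcome = [2.5 + 1.0 * s + 0.3 * t + random.gauss(0, 0.4)
           for s, t in zip(scores, treated)]

results = {}
for caliper in (0.01, 0.05, 0.10):
    for replace in (False, True):
        pairs = match_pairs(scores, treated, caliper, replace)
        results[(caliper, replace)] = (len(pairs), att(pairs, outcome))

for (caliper, replace), (n_pairs, estimate) in results.items():
    print(f"caliper={caliper:.2f} replace={replace}: {n_pairs} pairs, ATT={estimate:.3f}")
```

If the estimates stay close to one another across these variations, that is the kind of consistency that would increase confidence; if they swing widely, the conclusion depends heavily on the matching choices.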