Dear more experienced than me SPSS users,
I'm currently working on a research project in which I've hit a wall. I have three groups which I want to compare in a number of analyses. One of these groups, the control group, is much larger than the other two, which are drawn from a rare demographic. As such, I want to reduce my control group from 65 to 25-30 participants while matching them on certain demographic variables (namely age, IQ and education). The other two groups don't differ significantly from one another on the means of these variables. My control group does differ significantly. I want my control group sample to fall within the same range as the means of my other two groups.

Obviously (otherwise I wouldn't be posting), I am completely lost as to how to do this. I assume I'll be using syntax, but I am unsure what commands I should be using. I've searched around and found some cases similar to mine, except they wanted to take a random sample based on propensity scores (which I don't believe works for me) or they wanted to take a sample out of two groups (while I want a sample out of one).

If anyone could assist me, I would be very grateful. Thanks for your time!
From what you've said, it sounds as if N=65 for control versus 25-30 for the other two groups. That is not what I would describe as a large discrepancy in sample sizes.
What type of study design is it? (E.g., see http://www.med.uottawa.ca/sim/data/Study_Designs_e.htm.) What kinds of outcome (dependent) variables do you have, and what types of models are you using?
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
It's an observational study with three research questions pertaining to the ability of implicit measures in diagnosing two different groups of pedophiles and the ability of these implicit measures to predict risk of recidivism. You're right that it's not a huge difference in N, but the demographic difference warrants action.
The sample I want to pull from the control group is to be used in three analyses: a 1-factor ANOVA, a repeated measures ANOVA and a ROC analysis. I hope this is enough information.
In reply to this post by Koen N
Koen,
In this situation I would rather weight on the propensity score than match on these variables (the propensities can be obtained by logistic regression). With that approach all data in the control group could be included, while the weighted control group takes on the demographic characteristics of the (combined) intervention groups. To turn this into a script, the frequency distributions of the demographics in the three groups would be needed. Whether this will work depends on the discrepancies between the groups.

Regards,
Paul Oosterveld

On Sun, 11 Nov 2012 05:55:17 -0800, Bruce Weaver <[hidden email]> wrote:
>From what you've said, it /sounds/ as if N=65 for control versus 25-30 for
>the other two groups. That is not what I would describe as a large
>discrepancy in sample sizes.
>
>What type of study design is it? (E.g., see
>http://www.med.uottawa.ca/sim/data/Study_Designs_e.htm.)
>
>What kinds of outcome (dependent) variables do you have, and what types of
>models are you using?
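The weighting idea in Paul's post can be sketched roughly as follows. This is a minimal illustration in plain Python rather than SPSS syntax, and it rests on assumptions not stated in the thread: that each control already has a propensity score saved from a logistic regression, and that "weighting on the propensity score" means the common odds weighting, where cases get weight 1 and controls get p / (1 - p). The function names and the toy numbers are hypothetical.

```python
def odds_weights(propensity, is_case):
    """Return one weight per participant.

    Cases keep weight 1; controls get p / (1 - p), so controls that
    resemble the cases (high propensity) count more heavily.
    This is an assumed weighting scheme, not one confirmed by the thread.
    """
    weights = []
    for p, case in zip(propensity, is_case):
        weights.append(1.0 if case else p / (1.0 - p))
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: ages and made-up propensity scores for five controls.
control_age = [25, 30, 35, 40, 45]
control_ps = [0.10, 0.20, 0.30, 0.50, 0.60]

w = odds_weights(control_ps, [False] * len(control_ps))
unweighted = sum(control_age) / len(control_age)
reweighted = weighted_mean(control_age, w)
print(unweighted, reweighted)
```

In this toy example the older controls have the higher propensity scores, so the weighted mean age moves upward, toward what the case groups would look like. No control is discarded, which is the point of weighting over matching with a control group of only 65.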
In reply to this post by Koen N
Using propensity scores rather than matching
on specific demographics is likely to produce less biased results. For
a start check this:
http://en.wikipedia.org/wiki/Propensity_score_matching

From: Koen N <[hidden email]>
To: [hidden email]
Date: 11/12/2012 09:58 AM
Subject: Random sample matched on variables
Sent by: "SPSSX(r) Discussion" <[hidden email]>
Both Paul and Martha,
thank you for your considerations. I hadn't delved too far into propensity scores yet, but from what I could tell earlier they were only to be used when there's a 'treatment' involved. If I understand correctly, though, the 'treatment' can be anything that separates two groups (which in this case would be pedophilia). As such, the propensity score could be defined as "the probability of being a pedophile based on measured covariates". Am I correct in this understanding?

If so, I am to run a logistic regression with my group variable (which I first recode into two groups instead of three, which would be theoretically fine) as the dependent variable and my selected covariates as predictors, and save the predicted values. After this the actual matching begins, and nearest neighbour matching would make the most sense. It seems as if nearest neighbour matching is going to be quite some work (manually finding, for every pedophilic participant, the control participant whose propensity score is nearest), although I think I can cut down a lot of the time by using some simple commands.

Is my understanding correct, or am I mistaken about any of the steps I should take? Thanks again everyone!
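The matching step described above does not have to be done by hand. The following is a minimal sketch in plain Python (not SPSS syntax) of greedy 1:1 nearest neighbour matching without replacement. It assumes the propensity scores have already been saved from the logistic regression; the participant ids, the scores, the optional caliper and the function name are all hypothetical, chosen only for illustration.

```python
def nearest_neighbour_match(case_ps, control_ps, caliper=None):
    """Greedy 1:1 nearest neighbour matching without replacement.

    case_ps, control_ps: dicts mapping participant id -> propensity score.
    caliper: optional maximum allowed score difference for a match.
    Returns a dict mapping each matched case id to a control id.
    """
    available = dict(control_ps)  # controls not yet used
    matches = {}
    # Process cases from highest to lowest score, since high-propensity
    # cases usually have the fewest comparable controls.
    for case_id, p in sorted(case_ps.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        best_id = min(available, key=lambda cid: abs(available[cid] - p))
        if caliper is not None and abs(available[best_id] - p) > caliper:
            continue  # no acceptable control for this case
        matches[case_id] = best_id
        del available[best_id]  # without replacement
    return matches


# Hypothetical toy data: made-up propensity scores.
cases = {"P1": 0.70, "P2": 0.55, "P3": 0.40}
controls = {"C1": 0.15, "C2": 0.42, "C3": 0.52, "C4": 0.66, "C5": 0.30}

matched = nearest_neighbour_match(cases, controls, caliper=0.10)
print(matched)
```

Greedy matching is order dependent, so a different processing order can give different pairs. The caliper is a design choice: it leaves a case unmatched rather than pairing it with a poorly comparable control.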
I haven't been following this thread but this particular post jogged my memory; I came across an interesting SUGI paper a while back that you might find useful: I have seen other, more sophisticated approaches as well. Again, not sure if this is related to what you're discussing. If not, feel free to disregard. Ryan