Hi,
I have two samples to compare: one is smaller (let's say clinical), and the other is four times bigger (let's say non-clinical). They differ on several characteristics, such as gender, sexual orientation, and age. Does anybody know how to resample the bigger one based on the smaller sample's gender, sexual orientation, and age characteristics?

Thanks!
Andrea
Although you do not say, let's suppose you wanted a 1-to-1 sample from the larger sample. A key issue is whether age is coded as ranges (e.g., 18-21, 22-28) or as integer age (e.g., 23, 24, 26). If age is an integer, then there is a chance, and maybe a good chance, of not finding any matching cases for some gender-by-age combinations.
I think this is how I'd do it. Let Smaller be the smaller sample and Larger be the larger sample. I assume there are no missing data on either age or gender in either file, and that both Larger and Smaller contain only two variables: gender and age. I also haven't tried this out.

* Count the cases in each gender-by-age cell of the smaller sample.
Get file='Smaller'.
Aggregate outfile='AggSmaller'/break=gender age/nn=nu.

* Put the larger sample in random order within each gender-by-age cell.
Get file='Larger'.
Compute seq=rv.uniform(0,1).
Sort cases by gender age seq.

* Number the cases 1, 2, 3, ... within each cell.
Do if ($casenum eq 1).
+  compute rec=1.
Else if (gender ne lag(gender) or age ne lag(age)).
+  compute rec=1.
Else.
+  compute rec=lag(rec)+1.
End if.

* Attach the cell counts and keep the first nn cases of each cell.
Match files file=*/table='AggSmaller'/by gender age.
Select if (rec le nn).
Execute.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of andrea1977
Sent: Wednesday, March 06, 2013 3:31 AM
To: [hidden email]
Subject: post stratification based on statistic criteria

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/post-stratification-based-on-statistic-criteria-tp5718402.html
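For readers who are not working in SPSS, the same idea as Gene Maguin's syntax (count the cases in each cell of the smaller sample, then draw that many cases at random from the matching cell of the larger sample) can be sketched in Python. This is only an illustration under stated assumptions, not a tested replacement for the syntax above: the record layout (dicts with `gender`, `orientation`, and `age` keys), the 5-year age bands, and the names `age_band`, `stratum`, and `match_sample` are all hypothetical. Banding age is one possible way around the empty-cell problem raised above for integer ages.

```python
import random
from collections import Counter, defaultdict


def age_band(age, width=5):
    """Collapse an integer age into a band, e.g. 23 -> (20, 24) for width 5."""
    low = (age // width) * width
    return (low, low + width - 1)


def stratum(case):
    # Assumed record layout: a dict with 'gender', 'orientation' and 'age'.
    return (case["gender"], case["orientation"], age_band(case["age"]))


def match_sample(smaller, larger, ratio=1, seed=None):
    """Draw up to ratio * n cases from `larger` for each stratum of `smaller`.

    Returns (selected, shortfall). `shortfall` maps each stratum to the
    number of requested cases that `larger` could not supply.
    """
    rng = random.Random(seed)

    # Cell counts of the smaller sample (the AGGREGATE step).
    wanted = Counter(stratum(c) for c in smaller)

    # Group the larger sample by the same cells.
    pool = defaultdict(list)
    for c in larger:
        pool[stratum(c)].append(c)

    selected, shortfall = [], {}
    for key, n in wanted.items():
        need = n * ratio
        candidates = pool.get(key, [])
        take = min(need, len(candidates))
        # Random draw without replacement within the cell.
        selected.extend(rng.sample(candidates, take))
        if take < need:
            shortfall[key] = need - take
    return selected, shortfall
```

Calling `match_sample(smaller, larger, ratio=1)` gives a 1-to-1 draw; a non-empty `shortfall` shows which cells of the smaller sample have too few counterparts in the larger one, which is worth checking before any comparison is run.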
In reply to this post by andrea1977
This time, I'll be the one to throw in the observation that it is generally a bad idea to discard data.

The exception might be when you ought to match the observed ranges. Thus, if one sample has no cases outside the ages of (say) 20 to 40, it could be appropriate to restrict both samples to that range, especially if you expect nonlinear age effects outside of it. (On the other hand, it *might* still be useful to keep the whole range, to get a more precise estimate of the simple, linear age regression. One answer does not fit all data.)

And if you have only a few transgender subjects in the smaller, clinical sample, you get a more powerful comparison by using as many from the larger, non-clinical sample as possible.

"Equal N" is a useful *starting* point for *designing* data collection of some sorts, since that gives the best power for comparisons where the variances are equal in the two groups. But even for those "paired designs," it is often better data analysis to use regression or covariance procedures to do the actual testing, instead of using any form of paired testing.

--
Rich Ulrich
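As a rough illustration of the regression approach suggested above, the sketch below keeps every case from both samples and estimates the clinical versus non-clinical difference adjusted for age, instead of discarding cases to force a match. It is a minimal least-squares fit written with the Python standard library only. The record layout (dicts with `clinical` coded 0/1, `age`, and `outcome`) and the function names are hypothetical, and a real analysis would use a statistics package that also reports standard errors and tests.

```python
def ols(X, y):
    """Least-squares coefficients for design rows X and outcomes y,
    found by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def adjusted_group_effect(cases):
    """Coefficient of `clinical` in the model outcome ~ 1 + clinical + age.

    Assumed record layout: dicts with 'clinical' (0 or 1), 'age', 'outcome'.
    """
    X = [[1.0, float(c["clinical"]), float(c["age"])] for c in cases]
    y = [float(c["outcome"]) for c in cases]
    intercept, group_effect, age_slope = ols(X, y)
    return group_effect
```

Further covariates, such as indicator variables for gender or sexual orientation, would be added as extra columns of `X` in the same way, and the group sizes do not need to be equal.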