post stratification based on statistic criteria

post stratification based on statistic criteria

andrea1977
Hi,

I have two samples to compare: a smaller one (let's say clinical) and one four times bigger (let's say non-clinical). They differ on several characteristics, such as gender, sexual orientation, and age.
Does anybody know how to resample the bigger one so that it matches the smaller sample's gender, sexual orientation, and age distribution?

Thanx!
Andrea

Re: post stratification based on statistic criteria

Maguin, Eugene
Although you do not say, let's suppose you want a 1-to-1 match drawn from the larger sample. A key issue is whether age is coded as ranges (e.g., 18-21, 22-28, etc.) or as integer age (e.g., 23, 24, 26, etc.). If age is integer, there is a chance, and maybe a good chance, of finding no matching cases in the larger sample for some combinations of gender and age.
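
To illustrate the point about age coding, here is a minimal sketch in Python (not SPSS). The band edges and the name `age_band` are made up for illustration; the idea is only that exact integer ages can share no values across two samples while coarser bands still overlap.

```python
from bisect import bisect_right

# Hypothetical lower edges of the age bands: 18-21, 22-28, 29-39, 40+.
BAND_EDGES = [18, 22, 29, 40]

def age_band(age):
    """Return the index of the band an integer age falls in (0 = below 18)."""
    return bisect_right(BAND_EDGES, age)

clinical_ages = [23, 24, 26]
control_ages = [22, 25, 27, 28]

# No integer age occurs in both lists, so exact matching finds nothing.
exact_matches = set(clinical_ages) & set(control_ages)

# All of these ages fall in the 22-28 band, so banded matching succeeds.
band_matches = ({age_band(a) for a in clinical_ages}
                & {age_band(a) for a in control_ages})
```

The price of wider bands is a looser match on age, so the band width is a judgment call that depends on how strongly age relates to the outcome.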

I think this is how I'd do this. Let Smaller be the smaller sample and Larger be the larger sample. I assume no missing data on age or gender in either file, and that both Larger and Smaller contain only two variables: gender and age. I also haven't tried this out.

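* Count the cases in each gender-by-age cell of the smaller sample.
Get file='Smaller'.
Aggregate outfile='AggSmaller'/break=gender age/nn=nu.

* Put the larger sample in random order within each cell.
Get file='Larger'.
Compute seq=rv.uniform(0,1).
Sort cases by gender age seq.
* Number the cases consecutively within each cell.
Do if ($casenum eq 1).
+   compute rec=1.
Else if (gender ne lag(gender) or age ne lag(age)).
+   compute rec=1.
Else.
+   compute rec=lag(rec)+1.
End if.
* Attach the cell counts and keep the first nn cases of each cell.
Match files file=*/table='AggSmaller'/by gender age.
Select if (rec le nn).
Execute.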
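
For readers who want to see the logic outside SPSS, the same idea can be sketched in Python. This is an illustration under assumptions, not a translation anyone on the list has tested: the function name `draw_matched`, the dictionary layout of the cases, and the toy data are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def draw_matched(smaller, larger, keys=("gender", "age"), seed=None):
    """Draw from `larger` as many cases per stratum as `smaller` contains."""
    rng = random.Random(seed)
    # Cases needed per stratum (the role nn plays in the aggregated file).
    needed = Counter(tuple(case[k] for k in keys) for case in smaller)
    # Group the larger sample by the same stratum key.
    pools = defaultdict(list)
    for case in larger:
        pools[tuple(case[k] for k in keys)].append(case)
    drawn = []
    for stratum, nn in needed.items():
        pool = pools.get(stratum, [])
        # If the pool has fewer than nn cases, take all that are available.
        drawn.extend(rng.sample(pool, min(nn, len(pool))))
    return drawn

# Toy data: 3 clinical cases, 24 non-clinical cases (4 per stratum).
smaller = [{"gender": 1, "age": 23}, {"gender": 1, "age": 23},
           {"gender": 2, "age": 30}]
larger = [{"gender": g, "age": a}
          for g in (1, 2) for a in (23, 30, 41) for _ in range(4)]
matched = draw_matched(smaller, larger, seed=1)
```

Adding sexual orientation as a third matching variable would only mean extending `keys`, at the cost of more strata with few or no available cases.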

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of andrea1977
Sent: Wednesday, March 06, 2013 3:31 AM
To: [hidden email]
Subject: post stratification based on statistic criteria

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/post-stratification-based-on-statistic-criteria-tp5718402.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


Re: post stratification based on statistic criteria

Rich Ulrich
This time, I'll be the one to throw in the observation that it is generally
a bad idea to discard data. The exception to this might be when you
ought to match the observed ranges. Thus, if one sample has no cases
outside the ages of (say) 20 to 40, it could be appropriate to restrict
both samples to that range, especially if you expect nonlinear age
effects outside of it. (On the other hand, it *might* still be
useful to keep the whole range, to get a more precise estimate of
the simple, linear age regression. One answer does not fit all data.)
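
Restricting both samples to the age range they share can be sketched as follows, in Python rather than SPSS. The helper names `common_range` and `restrict` and the ages used are hypothetical; only the 20-to-40 example comes from the paragraph above.

```python
def common_range(ages_a, ages_b):
    """Return (low, high) of the age range covered by both samples."""
    return max(min(ages_a), min(ages_b)), min(max(ages_a), max(ages_b))

def restrict(cases, low, high):
    """Keep only the cases whose age lies within [low, high]."""
    return [c for c in cases if low <= c["age"] <= high]

# Toy data: the clinical sample spans 20-40, the non-clinical one 18-65.
clinical = [{"age": a} for a in (20, 27, 33, 40)]
nonclinical = [{"age": a} for a in (18, 19, 25, 31, 38, 44, 52, 65)]

low, high = common_range([c["age"] for c in clinical],
                         [c["age"] for c in nonclinical])
clinical_kept = restrict(clinical, low, high)
nonclinical_kept = restrict(nonclinical, low, high)
```

Unlike cell-by-cell matching, this drops only the cases that have no counterpart at all in the other sample.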

And if you have only a few transgender subjects in the smaller, clinical
sample, you get your more powerful comparison by using as many from
the larger, non-clinical sample as possible.
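
The gain from keeping all the non-clinical cases can be made concrete with the standard error of a difference between two means. The sketch below assumes a common standard deviation in both groups; the sample sizes 50 and 200 are invented to mirror the 1-to-4 ratio in the original question, and `se_diff` is a hypothetical helper name.

```python
from math import sqrt

def se_diff(sd, n1, n2):
    """Standard error of a difference between two means, common SD assumed."""
    return sd * sqrt(1.0 / n1 + 1.0 / n2)

se_equal = se_diff(1.0, 50, 50)    # non-clinical sample cut down to n = 50
se_full = se_diff(1.0, 50, 200)    # all 200 non-clinical cases kept
ratio = se_full / se_equal         # below 1: the full sample is more precise
```

With the clinical N fixed, every additional comparison case shrinks the standard error, although with diminishing returns.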

"Equal N" is a useful *starting* point for *designing* data collection
of some sorts, since it gives the best power for a fixed total N when
the variances are equal in the two groups. But even for those "paired
designs," it is often better data analysis to use regression or
covariance procedures to do the actual testing, instead of using any
form of paired testing.
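
As a rough illustration of the regression alternative, the sketch below fits score = b0 + b1*group + b2*age by ordinary least squares on all cases, so no data are discarded. It is plain Python with invented, noise-free data; the solver `ols` and the variable names are hypothetical, and a real analysis would use a statistics package that also reports standard errors.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Invented data: 3 clinical cases (group = 1), 6 non-clinical (group = 0).
cases = [(1, 20), (1, 25), (1, 30),
         (0, 22), (0, 28), (0, 35), (0, 40), (0, 50), (0, 60)]
scores = [10 + 2 * g + 0.5 * age for g, age in cases]

rows = [[1.0, g, age] for g, age in cases]   # intercept, group, age
b0, b_group, b_age = ols(rows, scores)

# Unadjusted difference in group means, for comparison.
clin = [s for (g, _), s in zip(cases, scores) if g == 1]
ctrl = [s for (g, _), s in zip(cases, scores) if g == 0]
raw_diff = sum(clin) / len(clin) - sum(ctrl) / len(ctrl)
```

In this toy data the non-clinical group is older, so the raw difference in means is negative even though the age-adjusted group effect is positive, which is the kind of confounding the covariate is meant to remove.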

--
Rich Ulrich
