random sample from a variable's category

Greg
Hi everyone,

I was wondering if someone can help me in trying to select a random sample of a category within a variable.

For example, the original variable is this:

Var A
1 = 50,000
2 = 7,000
3 = 5,000

I want to create a new variable (Var B) in which the 2nd and 3rd categories remain the same, while a 10% random sample of cases is drawn from the 1st category of Var A.

The new variable will look something like this:

Var B
1 = 5,000 (the category from which a 10% sample was drawn from the original variable)
2 = 7,000
3 = 5,000

Any advice/suggestion is greatly appreciated!!

Thanks,
Greg

Re: random sample from a variable's category

Rich Ulrich
Compute a new variable, RANDX, as a uniform random number
between 0 and 1, minus VarA.  Something like,

COMPUTE  randx= UNIFORM(1) - VarA.

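Thus, RANDX will always be negative, and its values fall in a separate
range for each value of VarA: between -1 and 0 for VarA = 1, between
-2 and -1 for VarA = 2, and between -3 and -2 for VarA = 3.

Sort the cases by RANDX in ascending order:

SORT CASES BY randx.

Categories 3 and 2 (12,000 cases) now come first, followed by the
category 1 cases in random order.

Use the first 17,000 cases.  I think you can use
N OF CASES 17000.   Or, better yet, you can set a "filter"
to select the first 17,000. That will make it easy
to compare the "used" versus "dropped" cases for
VarA = 1.    Read about using FILTER if this is new to you.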

COMPUTE afilter= $CASENUM le 17000.
--
Rich Ulrich
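
The steps in the reply above can be combined into one block of syntax. This is only a minimal sketch: the names VarA, randx and afilter follow the thread, while the seed value, the system-missing coding of dropped cases in VarB, and the closing CROSSTABS and FREQUENCIES checks are assumptions added for illustration.

```spss
* Sketch only.
* Assumes VarA is coded 1, 2, 3 with 50,000 / 7,000 / 5,000 cases.
* The seed is an arbitrary assumption that makes the draw repeatable.
SET SEED=12345.

* Random key: lands in (-1,0), (-2,-1) or (-3,-2) for VarA = 1, 2, 3.
COMPUTE randx = UNIFORM(1) - VarA.

* Ascending sort puts categories 3 and 2 first, then category 1 shuffled.
SORT CASES BY randx (A).

* Flag the first 17,000 cases: 12,000 from categories 2 and 3 plus 5,000 from category 1.
COMPUTE afilter = ($CASENUM LE 17000).

* Assumption: VarB copies VarA and is system-missing for dropped cases.
COMPUTE VarB = VarA.
IF (afilter = 0) VarB = $SYSMIS.

* Compare used and dropped cases, then analyse only the sampled ones.
CROSSTABS /TABLES=VarA BY afilter.
FILTER BY afilter.
FREQUENCIES VARIABLES=VarB.
```

Running FILTER OFF afterwards brings the unsampled cases back into later analyses.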

> Date: Fri, 18 May 2012 17:01:20 -0700


Re: random sample from a variable's category

Ruben Geert van den Berg
In reply to this post by Greg
Hi Grigoris,

What exactly do you want to do with category 1? For instance, if you have 100 cases of which 5 have category 1, then you'll need to replace only those 5 values (in this case 5% of your sample), right? So where exactly does the "10%" you mention come from?

And how exactly do you wish to sample those 5 values? From all 100 original values or only from the 95 category 2 and 3 responses? And do you want to sample those with or without replacement?

Best,

Ruben

> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/random-sample-from-a-variable-s-category-tp5712164.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.