Hi everyone,
I was wondering if someone can help me select a random sample from one category of a variable.

For example, the original variable is this:

Var A
1 = 50,000
2 = 7,000
3 = 5,000

I want to create a new variable (Var B) in which the 2nd and 3rd categories remain the same, while a 10% random sample of cases is drawn from the 1st category of Var A.

The new variable will look something like this:

Var B
1 = 5,000 (the category from which a 10% sample was drawn from the original variable)
2 = 7,000
3 = 5,000

Any advice/suggestion is greatly appreciated!!

Thanks,
Greg
Compute a new variable, RANDX, that is a uniform random number between 0 and 1, minus VarA. Something like:

COMPUTE randx = UNIFORM(1) - VarA.

Thus, RANDX will always be negative, and it falls in a separate range for each value of VarA: -1 to 0 for VarA = 1, -2 to -1 for VarA = 2, and -3 to -2 for VarA = 3.

Sort the cases by RANDX in ascending order. The 12,000 cases with VarA = 2 or 3 then come first, followed by the VarA = 1 cases in random order, so the first 17,000 cases contain every case from categories 2 and 3 plus a random 5,000 (10%) of category 1.

Use the first 17,000 cases. I think you can use N OF CASES 17000. Or, better yet, you can set a filter that flags the first 17,000 cases. That makes it easy to compare the "used" versus "dropped" cases for VarA = 1. Read about using FILTER if this is new to you.

COMPUTE afilter = ($CASENUM LE 17000).

--
Rich Ulrich

> Date: Fri, 18 May 2012 17:01:20 -0700
> Subject: random sample from a variable's category
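For readers who want to check the logic outside SPSS, here is a minimal Python sketch of the same idea: subtract the category code from a uniform draw, sort, and keep the first 17,000 cases. The function name `keep_flags`, its parameters, and the seed are hypothetical, chosen only for illustration. Like the suggestion above, it assumes the category being sampled has the lowest code, so that it sorts last.

```python
import random
from collections import Counter

def keep_flags(var_a, fraction=0.10, seed=2012):
    """Flag cases to keep: every case in the higher-coded categories plus a
    random `fraction` of the lowest-coded category (category 1 in the thread)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is repeatable
    lowest = min(var_a)
    # Key = uniform draw in [0, 1) minus the category code, so each category
    # occupies its own interval and the lowest code sorts last.
    keys = [rng.random() - a for a in var_a]
    order = sorted(range(len(var_a)), key=keys.__getitem__)
    n_lowest = sum(1 for a in var_a if a == lowest)
    n_keep = (len(var_a) - n_lowest) + round(fraction * n_lowest)
    flags = [False] * len(var_a)
    for i in order[:n_keep]:  # the first n_keep cases after sorting
        flags[i] = True
    return flags

# Counts from the thread: 50,000 / 7,000 / 5,000 cases in categories 1 / 2 / 3.
var_a = [1] * 50_000 + [2] * 7_000 + [3] * 5_000
flags = keep_flags(var_a)
kept = Counter(a for a, keep in zip(var_a, flags) if keep)
print(sorted(kept.items()))  # [(1, 5000), (2, 7000), (3, 5000)]
```

The flag list plays the role of the filter variable: cases are marked rather than deleted, so the kept and dropped cases in category 1 can still be compared.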
In reply to this post by Greg
Hi Grigoris,
What exactly do you want to do with category 1? For instance, if you have 100 cases, of which 5 are in category 1, then you'll need to replace only those 5 values (in this case 5% of your sample), right? So where exactly does the "10%" you mention come from?

And how exactly do you wish to sample those 5 values? From all 100 original values or only from the 95 category 2 and 3 responses? And do you want to sample those with or without replacement?

Best,
Ruben