Hi,

Is it possible to select a random sample of cases by group? I have an “age” variable and I would like to use it to select a random sample from each of the following age bands:

1: 16-24
2: 25-34
3: 35-44
4: 45-54
5: 55-64
6: 65+

I would also like a different sample size in each group. Let’s say:

1: 16-24 (48 cases)
2: 25-34 (100 cases)
3: 35-44 (150 cases)
4: 45-54 (125 cases)
5: 55-64 (86 cases)
6: 65+ (15 cases)

Thanks in advance!!!

Joan Casellas Vega
Media Research Analyst
Phone: +44 20 7593 1585 |
Hi Joan,
Do this (untested).
* Assumes agegrp is a string variable holding the age-band labels.
compute rn=uniform(1).
* RANK stores the within-group rank of rn in a new variable named rrn.
rank variables=rn by agegrp.
compute pick=0.
do if (agegrp eq '16-24').
+ if (rrn le 48) pick=1.
else if (agegrp eq '25-34').
+ if (rrn le 100) pick=1.
else if (agegrp eq '35-44').
+ if (rrn le 150) pick=1.
else if (agegrp eq '45-54').
+ if (rrn le 125) pick=1.
else if (agegrp eq '55-64').
+ if (rrn le 86) pick=1.
else if (agegrp eq '65+').
+ if (rrn le 15) pick=1.
end if.
select if (pick eq 1).
frequencies agegrp.
Gene Maguin
|
In reply to this post by joan casellas
Something like this.
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY AGE SCRAMBLE.
IF $CASENUM=1 OR (LAG(age) NE age) Counter=1.
IF MISSING(Counter) Counter=LAG(Counter)+1.
COMPUTE Keeper=Age.
RECODE Keeper (1=48)(2=100)(3=150)(4=125)(5=86)(6=15).
*EXECUTE . /* Probably DON'T need EXE here. If you get odd results then remove *.
SELECT IF (Counter LE Keeper).
FREQ Age.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by joan casellas
Use RECODE or a nested DO-IF to create your age group variable if you don't already have it. Then compute a random variable using one of the RV functions, such as RV.UNIFORM. After that, sort by Age Group and the random variable--this will (pseudo) randomly order the records within each age group. Then number the records within each age group (using LAG).
do if ($casenum EQ 1) OR (AgeGroup NE Lag(AgeGroup)).
- compute recnum = 1.
else.
- compute recnum = LAG(recnum) + 1.
end if.

And finally, compute a filter variable that flags the cases you want to use. A DO-REPEAT would work well here, I think. Something like:

do repeat g = 1 to 6 / n = 48 100 150 125 86 15 .
- if AgeGroup EQ g flag = recnum LE n.
end repeat.

Then filter by FLAG, and do whatever it is you want to do. HTH.
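Put together, the steps described in this post might look roughly like the sketch below. It is untested and rests on assumptions that are not stated in the thread: a numeric variable named age holding whole years, every case having a valid age of 16 or more (a missing age group would break the LAG comparison), and no weighting in effect. The names rnd, recnum and flag and the seed value are arbitrary choices for illustration.

```spss
* Sketch only, under the assumptions listed above.
* An arbitrary fixed seed makes the random draw repeatable.
SET SEED=20110812.
* Build the age group variable from age in whole years.
RECODE age (16 thru 24=1) (25 thru 34=2) (35 thru 44=3) (45 thru 54=4)
           (55 thru 64=5) (65 thru HI=6) INTO AgeGroup.
* Random order within each age group.
COMPUTE rnd = RV.UNIFORM(0,1).
SORT CASES BY AgeGroup rnd.
* Number the records within each age group.
DO IF ($CASENUM EQ 1) OR (AgeGroup NE LAG(AgeGroup)).
- COMPUTE recnum = 1.
ELSE.
- COMPUTE recnum = LAG(recnum) + 1.
END IF.
* Flag the first n records of each group as the sample.
COMPUTE flag = 0.
DO REPEAT g = 1 TO 6 / n = 48 100 150 125 86 15.
- IF (AgeGroup EQ g) flag = (recnum LE n).
END REPEAT.
* Filter rather than delete, so the unsampled cases stay in the file.
FILTER BY flag.
FREQUENCIES VARIABLES=AgeGroup.
```

Using FILTER instead of SELECT IF keeps the full file available; FILTER OFF restores all cases.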
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by David Marso
Hi David,
Thanks for your email. So far this syntax is the closest to what I'm looking for, but I'm not sure whether it could be improved. Normally, when selecting a random sample of cases in SPSS, you have two options:
Approximately XX % of all cases
Exactly XX cases from the first XXXX cases
Using your syntax I don't get the exact number of cases, only approximations. Could you advise on how to get the exact number of cases for each group?
Thanks in advance!
Joan Casellas Vega |
Joan,
The syntax I posted *SHOULD* give you the exact desired numbers from each age stratum. Please post the frequencies of 'age' before the data selection, the *EXACT* syntax you are running, and the frequencies of 'age' after the data selection or filter.
David
|
For example:
I run this exactly and get precisely the requested freqs. Maybe you *DON'T* have enough cases in one or more age categories?

INPUT PROGRAM.
LOOP CASENUM=1 TO 1000.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
COMPUTE Age=Trunc(UNIFORM(6))+1.
FREQ Age.
-----
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY AGE SCRAMBLE.
IF $CASENUM=1 OR (LAG(age) NE age) Counter=1.
IF MISSING(Counter) Counter=LAG(Counter)+1.
COMPUTE Keeper=Age.
RECODE Keeper (1=48)(2=100)(3=150)(4=125)(5=86)(6=15).
*EXECUTE . /* Probably DON'T need EXE here. If you get odd results then remove *.
SELECT IF (Counter LE Keeper).
FREQ Age.

BEFORE:
AGE      Frequency
1.00     171
2.00     158
3.00     169
4.00     168
5.00     173
6.00     161
Total    1000

AFTER:
AGE      Frequency
1.00     48
2.00     100
3.00     150
4.00     125
5.00     86
6.00     15
Total    524
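One way to check the "not enough cases" possibility before sampling is to attach each group's unweighted case count to the file and compare it with the quota. The sketch below is untested; GroupN, Quota and Short are hypothetical variable names, and Age is assumed to be coded 1 to 6 as in the syntax above.

```spss
* Sketch only: GroupN, Quota and Short are made-up names.
* NU gives the unweighted number of cases in each Age group.
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=Age /GroupN=NU.
* Same quotas as the Keeper recode in the sampling syntax.
RECODE Age (1=48)(2=100)(3=150)(4=125)(5=86)(6=15) INTO Quota.
* Short is 1 where a group holds fewer cases than its quota.
COMPUTE Short = (GroupN LT Quota).
CROSSTABS /TABLES=Age BY Short.
```

Any age group that appears under Short = 1 cannot reach its requested sample size.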
|
In reply to this post by David Marso
I had the weighting on!!!
Thanks, your syntax works like a charm!!!
Joan Casellas Vega |
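For anyone who hits the same symptom: while a weight is in effect, FREQUENCIES reports weighted counts, so a group holding exactly 48 selected cases can show a different figure. A minimal sketch of checking and switching off the weight before counting follows; it is untested, and wgt is a hypothetical name standing in for whatever weight variable the file actually uses.

```spss
* Sketch only: wgt is a hypothetical weight variable name.
* Show which weight variable, if any, is currently in effect.
SHOW WEIGHT.
* Count the sampled cases without weighting.
WEIGHT OFF.
FREQUENCIES VARIABLES=Age.
* Restore the weight afterwards if later analyses need it.
WEIGHT BY wgt.
```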