Re: random sample of cases by groups

joan casellas

Hi,

Is it possible to select a random sample of cases by group? I have an “age” variable and I would like to use it to select a random sample from each of the following age bands:

1: 16-24
2: 25-34
3: 35-44
4: 45-54
5: 55-64
6: 65+

I would also like a different sample size in each group. Let’s say:

1: 16-24 (48 cases)
2: 25-34 (100 cases)
3: 35-44 (150 cases)
4: 45-54 (125 cases)
5: 55-64 (86 cases)
6: 65+ (15 cases)

Thanks in advance!

Joan Casellas Vega
Media Research Analyst
Phone: +44 20 7593 1585


Maguin, Eugene
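Hi Joan,

Do this (untested).

* Random number per case, then rank it within each age group.
compute rn=uniform(1).
* RANK saves the ranks in a new variable that is named rrn by default.
rank variables=rn by agegrp.
compute pick=0.
* agegrp is assumed here to be a string variable holding the band labels.
do if (agegrp eq '16-24').
+  if (rrn le 48) pick=1.
else if (agegrp eq '25-34').
+  if (rrn le 100) pick=1.
else if (agegrp eq '35-44').
+  if (rrn le 150) pick=1.
else if (agegrp eq '45-54').
+  if (rrn le 125) pick=1.
else if (agegrp eq '55-64').
+  if (rrn le 86) pick=1.
else if (agegrp eq '65+').
+  if (rrn le 15) pick=1.
end if.
* Keep only the picked cases and check the group counts.
select if (pick eq 1).
frequencies agegrp.
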
Gene Maguin



David Marso
Administrator
In reply to this post by joan casellas
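Something like this.

* Random sort key, so that sorting shuffles the cases within each age group.
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY AGE SCRAMBLE.
* Counter numbers the cases from 1 upward within each age group.
IF $CASENUM=1 OR (LAG(age) NE age) Counter=1.
IF MISSING(Counter) Counter=LAG(Counter)+1.
* Keeper holds the number of cases wanted from the group.
COMPUTE Keeper=Age.
RECODE Keeper (1=48)(2=100)(3=150)(4=125)(5=86)(6=15).
*EXECUTE .  /* Probably DON'T need EXE here.  If you get odd results then remove *.
SELECT IF (Counter LE Keeper).
FREQ Age.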

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Bruce Weaver
Administrator
In reply to this post by joan casellas
Use RECODE or a nested DO-IF to create your age group variable if you don't already have it.  Then compute a random variable using one of the RV functions, such as RV.UNIFORM.  After that, sort by Age Group and the random variable--this will (pseudo) randomly order the records within each age group.  Then number the records within each age group (using LAG).

do if ($casenum EQ 1) OR (AgeGroup NE Lag(AgeGroup)).
- compute recnum = 1.
else.
- compute recnum = LAG(recnum) + 1.
end if.

And finally, compute a filter variable that flags the cases you want to use.  A DO-REPEAT would work well here, I think.  Something like:

do repeat g = 1 to 6 / n = 48 100 150 125 86 15 .
- if AgeGroup EQ g flag = recnum LE n.
end repeat.

Then filter by FLAG, and do whatever it is you want to do.
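
Put together, the steps above might look like the following. This is only a sketch: it assumes a numeric variable named age that holds age in whole years, with every case aged 16 or older and no missing values, and the names rand, recnum and flag are placeholders. If the age variable is already coded 1 to 6, skip the RECODE and use that variable in place of AgeGroup.

```spss
* Sketch only, assuming age is numeric, in whole years, and never missing.
RECODE age (16 thru 24=1) (25 thru 34=2) (35 thru 44=3) (45 thru 54=4)
  (55 thru 64=5) (65 thru HI=6) INTO AgeGroup.
* Random sort key, then shuffle the cases within each age group.
COMPUTE rand = RV.UNIFORM(0,1).
SORT CASES BY AgeGroup rand.
* Number the cases within each age group.
DO IF ($CASENUM EQ 1) OR (AgeGroup NE LAG(AgeGroup)).
- COMPUTE recnum = 1.
ELSE.
- COMPUTE recnum = LAG(recnum) + 1.
END IF.
* Flag the first n cases of each group.
DO REPEAT g = 1 TO 6 / n = 48 100 150 125 86 15.
- IF (AgeGroup EQ g) flag = (recnum LE n).
END REPEAT.
* Analyse the flagged cases only, then switch the filter off again.
FILTER BY flag.
FREQUENCIES VARIABLES=AgeGroup.
FILTER OFF.
```

Unlike SELECT IF, FILTER leaves the unselected cases in the file, so a different sample can be drawn later without reloading the data.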

HTH.



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

joan casellas
In reply to this post by David Marso
Hi David,

Thanks for your email. So far this syntax is the closest to what I'm
looking for, but I'm not sure whether it could be improved. Normally in
SPSS, when selecting a random sample of cases, you have two options:

        Approximately XX % of all cases
        Exactly XX cases from the first XXXX cases

Using your syntax I don't get the exact number of cases, only approximations.
Could you advise on how to get the exact number of cases for each group?

Thanks in advance!


Joan Casellas Vega
Media Research Analyst
Phone: +44 20 7593 1585


David Marso
Administrator
Joan,
The syntax I posted *SHOULD* give you the exact desired number of cases from each age stratum.
Please post the frequencies of 'age' before the data selection, the *EXACT* syntax you are running, and the frequencies of 'age' after the data selection or filter.
David

David Marso
Administrator
For example:
I run this exactly and get precisely the requested freqs.  
Maybe you *DON'T* have enough cases in one or more age categories?

INPUT PROGRAM.
LOOP CASENUM=1 TO 1000.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
COMPUTE Age=Trunc(UNIFORM(6))+1.
FREQ Age.
* Sampling syntax as posted earlier in the thread.
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY AGE SCRAMBLE.
IF $CASENUM=1 OR (LAG(age) NE age) Counter=1.
IF MISSING(Counter) Counter=LAG(Counter)+1.
COMPUTE Keeper=Age.
RECODE Keeper (1=48)(2=100)(3=150)(4=125)(5=86)(6=15).
*EXECUTE .  /* Probably DON'T need EXE here.  If you get odd results then remove *.
SELECT IF (Counter LE Keeper).
FREQ Age.





BEFORE:
AGE
                 Frequency
Valid    1.00          171
         2.00          158
         3.00          169
         4.00          168
         5.00          173
         6.00          161
         Total        1000

AFTER:
AGE
                 Frequency
Valid    1.00           48
         2.00          100
         3.00          150
         4.00          125
         5.00           86
         6.00           15
         Total         524
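
If a group holds fewer cases than requested, the selection can only keep what is there. One way to check this before sampling is sketched below. It assumes Age is coded 1 to 6 as in the example, and GroupN, Wanted and Short are hypothetical names introduced for illustration. NU is used rather than N so that the count stays unweighted even if a weight is active.

```spss
* Run this before the SELECT IF step.
* Add the unweighted number of cases in each age group to every case.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Age
  /GroupN=NU.
* Requested sample size per group.
RECODE Age (1=48)(2=100)(3=150)(4=125)(5=86)(6=15) INTO Wanted.
* Short is 1 for cases in a group that is too small for its request.
COMPUTE Short = (GroupN LT Wanted).
FREQUENCIES VARIABLES=Short.
```

If Short is 0 for every case, each group is large enough to supply its requested sample.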

               



joan casellas
In reply to this post by David Marso
I had the weighting on!!!

Thanks, your syntax works like a charm!!!
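
For anyone who hits the same symptom: FREQUENCIES reports weighted counts while a weight is active, so an exact sample can look approximate. A minimal sketch of the check, assuming Age is the grouping variable, with wgt as a hypothetical name for whatever weight variable the file uses:

```spss
* Turn case weighting off, then count the selected cases.
WEIGHT OFF.
FREQUENCIES VARIABLES=Age.
* To restore the weight afterwards, remove the asterisk from the next line.
* WEIGHT BY wgt.
```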



Joan Casellas Vega
Media Research Analyst
Phone: +44 20 7593 1585

