PPS sampling



Cathal McCrory
I have the enrolment numbers for 3200 schools and I want to select 13,000
pupils from within these schools on a PPS basis. For example:

                        Pupil                   Cumulative
School                  Enrolment               Population
1                       116                     116
2                       163                     279
3                       232                     511
4                       204                     715
5                       274                     989
6                       188                     1177
7                       210                     1387
8                       407                     1794
9                       298                     2092

I want to generate a random start within SPSS and select every 4th pupil
from within the cumulative population total (rather than a set number of
cases) and have the program iterate until I have drawn my sample of 13,000.
I was wondering whether anyone could give me any pointers (to existing
routines) or offer any guidance on this. Many thanks.

Cathal Mc Crory
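
The procedure described in this post (a random start, then every 4th pupil along the cumulative enrolment) can be sketched outside SPSS as follows. This is only an illustration in Python using the nine example schools; the function name `systematic_sample` is made up for this sketch, and the interval of 4 is taken from the post as written. To land on exactly 13,000 pupils the interval would presumably have to be derived from the full enrolment total, which the post does not give.

```python
import bisect
import random
from itertools import accumulate

# Enrolments for the nine example schools in the post.
enrolment = [116, 163, 232, 204, 274, 188, 210, 407, 298]
cumulative = list(accumulate(enrolment))  # 116, 279, 511, ..., 2092


def systematic_sample(cumulative, interval, start):
    """Count pupils selected per school in a 1-in-`interval` pass.

    `start` is the 1-based position of the first selected pupil.
    Returns a dict mapping school number to number of pupils selected.
    """
    counts = {}
    for position in range(start, cumulative[-1] + 1, interval):
        # The first school whose cumulative total reaches `position`
        # is the school that pupil belongs to.
        school = bisect.bisect_left(cumulative, position) + 1
        counts[school] = counts.get(school, 0) + 1
    return counts


start = random.randint(1, 4)  # random start within the first interval
print(start, systematic_sample(cumulative, 4, start))
```

Every pupil has the same 1-in-4 chance of selection here, so each school contributes roughly a quarter of its enrolment.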

Re: PPS sampling

Joseph Teitelman temp2
You are describing a systematic sampling design. A PPS sampling design
means a sample with probabilities proportional to size.
They are entirely distinct sampling designs. Which of these sampling
methods you should choose depends on various factors. I would suspect
that in your case you can use systematic sampling.
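
To make the two definitions concrete, here is a small Python sketch using the nine example schools from the original post. It assumes, purely for illustration, that 2 schools are drawn with probability proportional to size; the function names are hypothetical. The PPS formula used (number of units drawn times size divided by total size) is the standard one and holds as long as no school is larger than the total divided by the number drawn.

```python
# Enrolments for the nine example schools from the original post.
enrolment = {1: 116, 2: 163, 3: 232, 4: 204, 5: 274,
             6: 188, 7: 210, 8: 407, 9: 298}


def pps_school_probabilities(sizes, n_schools):
    """Inclusion probability of each school when n_schools are drawn PPS."""
    total = sum(sizes.values())
    return {school: n_schools * size / total for school, size in sizes.items()}


def systematic_pupil_probability(interval):
    """Inclusion probability of any one pupil under 1-in-`interval` sampling."""
    return 1 / interval


probs = pps_school_probabilities(enrolment, 2)  # 2 schools is an assumption
for school, p in probs.items():
    print(f"school {school}: {p:.3f}")
print(f"any pupil, 1-in-4 systematic: {systematic_pupil_probability(4):.3f}")
```

Under PPS the probability differs by school size; under a plain systematic pass over pupils it is the same for every pupil.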


Re: PPS sampling

Jamie Burnett
In reply to this post by Cathal McCrory
You can use Complex Samples in SPSS to draw the sample for you, or alternatively, for something like this, you could draw it in Excel. As long as you take a 1 in N sample across all pupils, schools with more pupils will have proportionately more pupils selected.
If you did it in SPSS then it can work out the standard errors of your point estimates, taking account of the complex nature of the sampling design, if that is of use to you.
 
Thanks
 
Jamie


Re: PPS sampling

Reutter, Alex
In reply to this post by Cathal McCrory
Using the CSPLAN and CSSELECT procedures in the Complex Samples option is a fairly painless way to handle this in SPSS.  If I correctly understand what you want to do, you will need a "measure of size" variable.  For example:

Case/Pupil      School          SchoolSize
1               1               116
2               1               116
...
116             1               116
117             2               163
118             2               163
...

Then work through the Sampling Wizard.  You should end up with something like:

CSPLAN SAMPLE
  /PLAN FILE='mysample.csplan'
  /METHOD TYPE=PPS_SYSTEMATIC
  /MOS VARIABLE= SchoolSize
  /RATE VALUE=0.25.
CSSELECT
  /PLAN FILE='mysample.csplan'.

The sampling plan is defined by CSPLAN (systematic sampling of 1/4 pupils with probability proportional to school size) and the specifications are saved to the external file mysample.csplan.  CSSELECT uses the information in this file to carry out the actual sampling.  Analysis procedures in the Complex Samples option use the information in this file to ensure correct computation of statistics according to the complex sampling plan.

Cheers,
Alex
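
For readers unfamiliar with what a systematic PPS draw does in general, the following Python sketch selects whole schools with probability proportional to enrolment. It is not a description of how CSSELECT works internally; the function name `pps_systematic`, the choice of 3 schools, and the fixed start of 100 are assumptions made for the example.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, start=None):
    """Return 1-based indices of n units drawn by systematic PPS.

    sizes : measure of size for each unit (here, school enrolment)
    start : point in [0, interval); drawn at random when omitted
    """
    upper = list(accumulate(sizes))   # cumulative measure of size
    interval = upper[-1] / n          # spacing between selection points
    if start is None:
        start = random.random() * interval
    selected = []
    unit = 0
    for j in range(n):
        point = start + j * interval
        # Advance to the unit whose cumulative range contains this point.
        while unit < len(upper) - 1 and upper[unit] <= point:
            unit += 1
        selected.append(unit + 1)
    return selected


enrolment = [116, 163, 232, 204, 274, 188, 210, 407, 298]
print(pps_systematic(enrolment, 3, start=100))  # prints [1, 5, 8]
```

A unit whose size exceeds the interval can be hit more than once, which is why very large units are usually handled separately in practice.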




Re: PPS sampling

Marks, Jim
In reply to this post by Cathal McCrory
Is this what you want?

** Every 4th case, starting on case 3, with the sample proportionate to
the school size.

** sample data (use your file of individual students).
data list free /school (f8.0).
BEGIN DATA
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
END DATA.
FREQ school.

SORT CASES BY school.
COMPUTE sel_order = mod($CASENUM,4).
COMPUTE sample = sel_order EQ 3.
FILTER BY sample.
FREQ school.

The steps, in your file of individual students: 1) sort on the
variable of interest, 2) calculate the modulus of the case number,
3) select one value of the modulus for your sample, 4) filter or delete
unselected cases.

Notice that the samples are not exact proportions. In my sample data, we have
        5/20 (25%) for school = 1
        7/30 (23.3%) for school = 2
        7/25 (28%) for school = 3
        19/75 (25.3%) for the complete sample.
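
The counts above can be checked with a few lines of Python that mimic the modulus selection on the same toy data (20, 30 and 25 pupils in schools 1 to 3). This is a cross-check written for illustration, not SPSS syntax; the helper name `every_kth` is made up.

```python
from collections import Counter

# Same toy file as the SPSS example: one record per pupil, sorted by school.
schools = [1] * 20 + [2] * 30 + [3] * 25


def every_kth(records, k, remainder):
    """Keep records whose 1-based case number leaves `remainder` modulo k."""
    return [rec for casenum, rec in enumerate(records, start=1)
            if casenum % k == remainder]


selected = Counter(every_kth(schools, 4, 3))  # mod($CASENUM,4) EQ 3
totals = Counter(schools)
for school in sorted(totals):
    share = selected[school] / totals[school]
    print(f"school {school}: {selected[school]}/{totals[school]} = {share:.1%}")
```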

If you want to randomize within schools, calculate a random value for
each student, and use that in the sort command:

COMPUTE rndize = uniform(1).
SORT CASES BY school rndize.

Continue with rest of the COMPUTE statements.
--jim
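
The randomized variant can be sketched in Python along the same lines. The random start is an addition prompted by the original question; the SPSS code above always starts on case 3. The function name `randomized_systematic` and the `seed` parameter are assumptions made for this example.

```python
import random

# Same toy file as above: one record per pupil.
schools = [1] * 20 + [2] * 30 + [3] * 25


def randomized_systematic(schools, k, seed=None):
    """Shuffle pupils within school, then keep every k-th from a random start.

    Returns a list of (school, pupil_id) pairs, pupil_id being the
    1-based position of the pupil in the input list.
    """
    rng = random.Random(seed)
    # Attach a random key to each pupil and sort by school, then key,
    # mirroring SORT CASES BY school rndize.
    keyed = sorted((school, rng.random(), pupil_id)
                   for pupil_id, school in enumerate(schools, start=1))
    start = rng.randint(1, k)  # random start within the first interval
    return [(school, pupil_id)
            for pos, (school, _, pupil_id) in enumerate(keyed, start=1)
            if pos % k == start % k]


sample = randomized_systematic(schools, 4, seed=1)
print(len(sample), sample[:5])
```

With 75 pupils and an interval of 4, the sample holds 18 or 19 pupils depending on the start.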
