I have the enrolment numbers for 3200 schools and I want to select 13,000 pupils from within these schools on a PPS basis. For example:

School   Pupil Enrolment   Cumulative Population
1        116               116
2        163               279
3        232               511
4        204               715
5        274               989
6        188               1177
7        210               1387
8        407               1794
9        298               2092

I want to generate a random start within SPSS and select every 4th pupil from within the cumulative population total (rather than a set number of cases) and have the program iterate until I have drawn my sample of 13,000. I was wondering whether anyone could give me any pointers (to existing routines) or offer any guidance on this.

Many thanks.
Cathal Mc Crory
You are describing a systematic sampling design. A PPS sampling design means a sample with probabilities proportional to size. They are entirely distinct sampling designs. Which of these sampling methods you choose depends on various factors. I would suspect that in your case you can use systematic sampling.

>>> Cathal McCrory <[hidden email]> 7/7/2006 11:05 AM >>>
> I have the enrolment numbers for 3200 schools and I want to select 13,000 pupils from within these schools on a PPS basis. [...]
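
The selection described in the question (a random start, then every 4th pupil along the cumulative enrolment total) can be sketched outside SPSS as follows. This is a minimal Python illustration, not SPSS syntax and not a routine posted in this thread; the function name `systematic_sample` and the seed value are assumptions made for the example, and only the nine schools quoted in the question are used.

```python
import random
from bisect import bisect_left
from itertools import accumulate

# Enrolment for the nine example schools quoted in the question.
enrolment = [116, 163, 232, 204, 274, 188, 210, 407, 298]
cumulative = list(accumulate(enrolment))  # running totals, ending at 2092


def systematic_sample(cumulative, interval, rng=random):
    """Return 1-based (school, pupil_within_school) pairs (hypothetical helper)."""
    start = rng.randint(1, interval)  # random start between 1 and interval
    picks = []
    for position in range(start, cumulative[-1] + 1, interval):
        # Index of the first school whose cumulative total reaches this position.
        school = bisect_left(cumulative, position)
        previous = cumulative[school - 1] if school else 0
        picks.append((school + 1, position - previous))
    return picks


# Fixed seed only so that the example is repeatable.
picks = systematic_sample(cumulative, interval=4, rng=random.Random(2006))
per_school = [sum(1 for s, _ in picks if s == k) for k in range(1, 10)]
print(len(picks), per_school)
```

With a 1-in-4 interval every pupil has the same chance of selection, so each school's share of the sample stays within one pupil of a quarter of its enrolment.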
In reply to this post by Cathal McCrory
You can use Complex Samples in SPSS to draw the sample for you, or alternatively, for something like this, you could draw it in Excel. As long as you take a 1 in N across all pupils, schools with more pupils will contribute proportionally more pupils to the sample.

If you did it in SPSS, it can also work out the standard errors of your point estimates, allowing for the complex nature of the sampling design, if that is of use to you.

Thanks
Jamie

________________________________
From: SPSSX(r) Discussion on behalf of Joseph Teitelman temp2
Sent: Fri 7/7/2006 4:41 PM
To: [hidden email]
Subject: Re: PPS sampling
In reply to this post by Cathal McCrory
Using the CSPLAN and CSSELECT procedures in the Complex Samples option is a fairly painless way to handle this in SPSS. If I correctly understand what you want to do, you will need a "measure of size" variable. For example:

Case/Pupil   School   SchoolSize
1            1        116
2            1        116
...
116          1        116
117          2        163
118          2        163
...

Then work through the Sampling Wizard. You should end up with something like:

    CSPLAN SAMPLE
      /PLAN FILE='mysample.csplan'
      /METHOD TYPE=PPS_SYSTEMATIC
      /MOS VARIABLE=SchoolSize
      /RATE VALUE=0.25.

    CSSELECT
      /PLAN FILE='mysample.csplan'.

The sampling plan is defined by CSPLAN (systematic sampling of 1/4 of pupils with probability proportional to school size) and the specifications are saved to the external file mysample.csplan. CSSELECT uses the information in this file to carry out the actual sampling. Analysis procedures in the Complex Samples option use the information in this file to ensure correct computation of statistics according to the complex sampling plan.

Cheers,
Alex
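
The one-row-per-pupil layout with a measure-of-size column can be built from a school-level enrolment list. The sketch below is plain Python rather than SPSS syntax, and it is only an illustration of the file layout: the helper name `expand_to_pupils` is hypothetical, and just the first three schools from the question are used.

```python
# Enrolment for the first three schools quoted in the question.
enrolment = {1: 116, 2: 163, 3: 232}


def expand_to_pupils(enrolment):
    """Return one row per pupil, carrying the school's size (hypothetical helper)."""
    rows = []
    case = 0
    for school, size in enrolment.items():
        for _ in range(size):
            case += 1  # running case/pupil number across all schools
            rows.append({"Case": case, "School": school, "SchoolSize": size})
    return rows


rows = expand_to_pupils(enrolment)
print(rows[115])  # last pupil of school 1
print(rows[116])  # first pupil of school 2
```

Each pupil row repeats its school's enrolment, which is the role the SchoolSize variable plays in the example table.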
In reply to this post by Cathal McCrory
Is this what you want?

    ** Every 4th case, starting on case 3, with the sample proportionate to the school size.
    ** sample data (use your sample of individual students).
    data list free /school (f8.0).
    BEGIN DATA
    1 1 1 1 1 1 1 1 1 1
    1 1 1 1 1 1 1 1 1 1
    2 2 2 2 2 2 2 2 2 2
    2 2 2 2 2 2 2 2 2 2
    2 2 2 2 2 2 2 2 2 2
    3 3 3 3 3 3 3 3 3 3
    3 3 3 3 3 3 3 3 3 3
    3 3 3 3 3
    END DATA.
    FREQ school.
    SORT CASES BY school.
    COMPUTE sel_order = mod($CASENUM,4).
    COMPUTE sample = sel_order EQ 3.
    FILTER BY sample.
    FREQ school.

The operations, in your file of individual students:
1) sort on the variable of interest,
2) calculate the modulus of the case number,
3) select one value of the modulus for your sample,
4) filter or delete unselected cases.

Notice that the samples are not exact proportions. In my sample data, we have
5/20 (25%) for school = 1
7/30 (23.3%) for school = 2
7/25 (28%) for school = 3
19/75 (25.3%) for the complete sample.

If you want to randomize within schools, calculate a random value for each student, and use that in the sort command:

    COMPUTE rndize = uniform(1).
    SORT CASES BY school rndize.

Continue with the rest of the COMPUTE statements.

--jim
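
The modulus-based selection can be checked against the counts quoted above with a short sketch. This is Python, not SPSS syntax, and the helper name `every_kth` is an assumption made for the illustration. It also shows one way to replace the fixed remainder of 3 with a randomly drawn one, which is what a random start amounts to.

```python
import random
from collections import Counter

# Same toy data as the SPSS example: 20, 30 and 25 pupils in schools 1 to 3,
# already sorted by school.
schools = [1] * 20 + [2] * 30 + [3] * 25


def every_kth(cases, k, remainder):
    """Keep cases whose 1-based position leaves `remainder` when divided by k."""
    return [c for i, c in enumerate(cases, start=1) if i % k == remainder]


# Fixed remainder of 3, as in the syntax above.
fixed = Counter(every_kth(schools, 4, 3))
print(fixed)

# Random start: draw the remainder (0 to 3) once, then select as before.
random_start = random.randrange(4)
randomised = every_kth(schools, 4, random_start)
print(random_start, len(randomised))
```

Depending on the remainder drawn, the sample from these 75 cases contains 18 or 19 pupils.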