I need to create a random sample for our clinical audits. I have a database of participant id, and health counselor. I need to create a 25% random sample by counselor. Is there a way to do this in SPSS? Thanks!
Healthy Regards,
R. Allen Frommelt
Director, Population Analysis & Reporting
Nurtur
20 Batterson Park Road
Farmington, CT 06032
860 676 3620
860 678 1600 fax
800 293 0056 toll-free
[hidden email] (please note new email address)
www.nurturhealth.com

====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command.
To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
At 02:57 PM 5/20/2008, Allen Frommelt wrote:
>I have a database of participant id, and health counselor. I need
>to create a 25% random sample by counselor. Is there a way to do
>this in SPSS? Thanks!

All the participants for 25% of the counselors, or 25% of the participants for all counselors?

If you want all participants for 25% of the counselors, you get a list of the counselors, sample 25% of them, and merge back with the original file.

As for the sampling, use any method you please, though the SAMPLE command requires you to hard-code the exact number of counselors if you want a sample as near as possible to an exact 25%.

The following code is tested, but this is simply the code, not a listing. (The LIST commands should be removed for production use.) It uses dataset logic (SPSS 14 and later), and assumes that the data is in an active dataset named PartiList.

DATASET DECLARE CounsList.
AGGREGATE OUTFILE=CounsList
   /BREAK=Counselor
   /CaseLoad 'No. of clients for counselor' = NU.

DATASET ACTIVATE CounsList WINDOW=FRONT.

*  Sample 25% of the counselors by the 'K/N' method:      .

COMPUTE NOBREAK = 1.
DATASET DECLARE CounsCount.
AGGREGATE OUTFILE=CounsCount
   /BREAK=NOBREAK
   /N 'Number of counselors' = NU.

DATASET ACTIVATE CounsCount WINDOW=FRONT.
NUMERIC K (F3).
VAR LABEL K 'Counselors to sample'.
COMPUTE K = RND(0.25*N).
FORMATS N K (F3).

DATASET ACTIVATE CounsList WINDOW=FRONT.
MATCH FILES
   /FILE =*
   /TABLE=CounsCount
   /BY NOBREAK.

DO IF $CASENUM EQ 1.
.  COMPUTE #K = K.
.  COMPUTE #N = N.
END IF.

NUMERIC InSample (F2).
VAR LABELS InSample 'Indicator: Counselor is in sample'.

COMPUTE InSample = RV.BERNOULLI(#K/#N).
COMPUTE #K = #K - InSample.
COMPUTE #N = #N - 1.

LIST Counselor InSample.

DATASET ACTIVATE PartiList WINDOW=FRONT.

*  Attach 'Sampled' flag to participant records           .

MATCH FILES
   /FILE =PartiList
   /TABLE=CounsList
   /BY Counselor
   /DROP = NOBREAK K N CaseLoad.

. /**/ LIST /*-*/.

SELECT IF InSample.
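For the other reading of the question (25% of the participants for each counselor), the same 'K/N' logic can be restarted within each counselor. What follows is only a sketch and, unlike the code above, it has not been run. It assumes a fresh copy of the active dataset PartiList with the variables Counselor and Participant, and it reuses the names CaseLoad and InSample purely for illustration.

```
*  UNTESTED sketch: 25% of the participants within each counselor .
*  Assumes dataset PartiList with Counselor and Participant       .

DATASET ACTIVATE PartiList WINDOW=FRONT.
SORT CASES BY Counselor.

*  Add each counselor's caseload to every participant record      .
AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=Counselor
   /CaseLoad 'No. of clients for counselor' = NU.

*  Restart the K/N counters at the first record of each counselor .
DO IF $CASENUM EQ 1 OR Counselor NE LAG(Counselor).
.  COMPUTE #K = RND(0.25*CaseLoad).
.  COMPUTE #N = CaseLoad.
END IF.

NUMERIC InSample (F2).
VAR LABELS InSample 'Indicator: Participant is in sample'.

COMPUTE InSample = RV.BERNOULLI(#K/#N).
COMPUTE #K = #K - InSample.
COMPUTE #N = #N - 1.

SELECT IF InSample.
```

Note that under this sketch a counselor with a single participant gets RND(0.25) = 0 sampled records. If every counselor must appear in the audit, one possible adjustment is COMPUTE #K = MAX(1, RND(0.25*CaseLoad)).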
============================
APPENDIX: Test data and code
============================
*  ................................................................. .
*  .................   Test data               ..................... .
SET RNG = MT       /* 'Mersenne twister' random number generator */ .
SET MTINDEX = 9518 /*  A phone number in Maryland                */ .

INPUT PROGRAM.
.  NUMERIC Counselor (N3) Participant(F5).
.  LEAVE   Counselor.
.  LOOP #I_Couns = 1 TO 12.
.     COMPUTE Counselor = TRUNC(RV.UNIFORM(100,1000)).
.     COMPUTE #N_Client = RV.POISSON(5).
.     LOOP #I_Client = 1 TO #N_Client.
.        COMPUTE Participant = TRUNC(RV.UNIFORM(1E4,1E5)).
.        END CASE.
.     END LOOP.
.  END LOOP.
END FILE.
END INPUT PROGRAM.

SORT CASES BY Counselor Participant.
DATASET NAME PartiList WINDOW=FRONT.
LIST.

*  .................   Post after this point   ..................... .
*  ................................................................. .

DATASET DECLARE CounsList.
AGGREGATE OUTFILE=CounsList
   /BREAK=Counselor
   /CaseLoad 'No. of clients for counselor' = NU.

DATASET ACTIVATE CounsList WINDOW=FRONT.

*  .................   Post after this point   ..................... .
*  Sample 25% of the counselors by the 'K/N' method:      .

COMPUTE NOBREAK = 1.
DATASET DECLARE CounsCount.
AGGREGATE OUTFILE=CounsCount
   /BREAK=NOBREAK
   /N 'Number of counselors' = NU.

DATASET ACTIVATE CounsCount WINDOW=FRONT.
NUMERIC K (F3).
VAR LABEL K 'Counselors to sample'.
COMPUTE K = RND(0.25*N).
FORMATS N K (F3).

DATASET ACTIVATE CounsList WINDOW=FRONT.
MATCH FILES
   /FILE =*
   /TABLE=CounsCount
   /BY NOBREAK.

DO IF $CASENUM EQ 1.
.  COMPUTE #K = K.
.  COMPUTE #N = N.
END IF.

NUMERIC InSample (F2).
VAR LABELS InSample 'Indicator: Counselor is in sample'.

COMPUTE InSample = RV.BERNOULLI(#K/#N).
COMPUTE #K = #K - InSample.
COMPUTE #N = #N - 1.

LIST Counselor InSample.

DATASET ACTIVATE PartiList WINDOW=FRONT.

*  Attach 'Sampled' flag to participant records           .

MATCH FILES
   /FILE =PartiList
   /TABLE=CounsList
   /BY Counselor
   /DROP = NOBREAK K N CaseLoad.

. /**/ LIST /*-*/.

SELECT IF InSample.
