Random Sample by Counselor

Allen Frommelt
I need to create a random sample for our clinical audits. I have a database of participant IDs and health counselors, and I need a 25% random sample by counselor. Is there a way to do this in SPSS? Thanks!

Healthy Regards,

R. Allen Frommelt
Director, Population Analysis & Reporting

Nurtur
20 Batterson Park Road
Farmington, CT  06032

860 676 3620
860 678 1600 fax
800 293 0056 toll-free

[hidden email]  (please note new email address)
www.nurturhealth.com


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Random Sample by Counselor

Richard Ristow
At 02:57 PM 5/20/2008, Allen Frommelt wrote:

>I have a database of participant id, and health counselor.  I need
>to create a 25% random sample by counselor.  Is there a way to do
>this in SPSS?  Thanks!

Do you want all the participants for 25% of the counselors, or 25% of
the participants for every counselor?

If you want all participants for 25% of the counselors, build a list
of the counselors, sample 25% of them, and merge the result back with
the original file. Any sampling method will do. Note, though, that to
get a sample as near as possible to an exact 25%, the SAMPLE command
requires you to hard-code the exact number of counselors.
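
For readers who want to see the list, sample, and merge-back steps outside SPSS, here is a minimal Python sketch. It is an illustration only, not the SPSS solution from this thread: the function name `sample_counselors`, the `(participant_id, counselor)` tuple layout, and the `seed` parameter are all assumptions made for the example.

```python
import math
import random


def sample_counselors(records, fraction=0.25, seed=None):
    """Keep every record that belongs to a randomly chosen share of counselors.

    `records` is a list of (participant_id, counselor) tuples; this layout
    is a hypothetical stand-in for the participant file.
    """
    rng = random.Random(seed)

    # Step 1: build the list of distinct counselors.
    counselors = sorted({counselor for _, counselor in records})

    # Step 2: decide how many to sample. floor(x + 0.5) rounds halves up,
    # which differs from Python's built-in round() (halves go to even).
    k = math.floor(fraction * len(counselors) + 0.5)
    chosen = set(rng.sample(counselors, k))

    # Step 3: "merge back" by keeping all participants of chosen counselors.
    return [rec for rec in records if rec[1] in chosen]
```

With 12 counselors and a fraction of 0.25, this keeps all participants of exactly 3 counselors; which 3 depends on the seed.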

The following code is tested, but this is only the syntax, not an
output listing. (The LIST commands should be removed for production
use.) It uses dataset logic (SPSS 14 and later), and assumes that the
data is in an active dataset named PartiList, sorted by Counselor (as
MATCH FILES with BY requires).


DATASET DECLARE  CounsList.

AGGREGATE OUTFILE=CounsList
    /BREAK=Counselor
    /CaseLoad 'No. of clients for counselor' = NU.

DATASET ACTIVATE CounsList WINDOW=FRONT.

*  Sample 25% of the counselors by the 'K/N' method:                 .

COMPUTE NOBREAK = 1.
DATASET DECLARE  CounsCount.
AGGREGATE OUTFILE=CounsCount
    /BREAK=NOBREAK
    /N 'Number of counselors' = NU.
DATASET ACTIVATE CounsCount WINDOW=FRONT.

NUMERIC   K (F3).
VAR LABEL K 'Counselors to sample'.
COMPUTE K = RND(0.25*N).
FORMATS N K (F3).

DATASET ACTIVATE CounsList WINDOW=FRONT.
MATCH FILES
        /FILE =*
        /TABLE=CounsCount
        /BY   NOBREAK.

DO IF $CASENUM EQ 1.
.  COMPUTE #K = K.
.  COMPUTE #N = N.
END IF.

NUMERIC    InSample (F2).
VAR LABELS InSample 'Indicator: Counselor is in sample'.

COMPUTE    InSample = RV.BERNOULLI(#K/#N).
COMPUTE    #K       = #K - InSample.
COMPUTE    #N       = #N - 1.

LIST      Counselor InSample.

DATASET ACTIVATE PartiList WINDOW=FRONT.

*  Attach 'Sampled' flag to participant records                      .
MATCH FILES
    /FILE =PartiList
    /TABLE=CounsList
    /BY   Counselor
    /DROP = NOBREAK K N CaseLoad.


.  /**/ LIST /*-*/.
SELECT IF InSample.
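
The 'K/N' step in the syntax above (a Bernoulli draw with probability #K/#N, then decrementing both counters) can be hard to follow at first reading. The Python sketch below mirrors that logic under the assumption that it is read correctly from the syntax; the function name `kn_select` and its parameters are hypothetical, chosen only for this illustration.

```python
import random


def kn_select(n_items, k, rng=None):
    """Return n_items flags (0 or 1) containing exactly k ones.

    Requires 0 <= k <= n_items. Each item is picked with probability
    (picks still needed) / (items still to visit).
    """
    rng = rng or random.Random()
    remaining_k, remaining_n = k, n_items
    flags = []
    for _ in range(n_items):
        # rng.random() lies in [0, 1), so a probability of 1 always picks
        # and a probability of 0 never picks.
        pick = 1 if rng.random() < remaining_k / remaining_n else 0
        flags.append(pick)
        remaining_k -= pick   # one fewer pick needed if this item was taken
        remaining_n -= 1      # one fewer item left to visit either way
    return flags
```

Because the probability reaches 1 once the remaining items equal the remaining picks, and 0 once no picks remain, the result always contains exactly k selected items, unlike independent 25% coin flips, which only average 25%.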
============================
APPENDIX: Test data and code
============================
*  ................................................................. .
*  .................   Test data               ..................... .
SET RNG = MT       /* 'Mersenne twister' random number generator  */ .
SET MTINDEX = 9518 /*  A phone number in Maryland                 */ .

INPUT PROGRAM.
.  NUMERIC Counselor  (N3)
            Participant(F5).
.  LEAVE   Counselor.
.  LOOP    #I_Couns = 1 TO 12.
.     COMPUTE Counselor = TRUNC(RV.UNIFORM(100,1000)).
.     COMPUTE #N_Client = RV.POISSON(5).
.     LOOP    #I_Client = 1 TO #N_Client.
.        COMPUTE Participant = TRUNC(RV.UNIFORM(1E4,1E5)).
.        END CASE.
.     END LOOP.
.  END LOOP.
END FILE.
END INPUT PROGRAM.
SORT CASES BY Counselor Participant.
DATASET NAME     PartiList WINDOW=FRONT.
LIST.


*  .................   Post after this point   ..................... .
*  ................................................................. .

DATASET DECLARE  CounsList.

AGGREGATE OUTFILE=CounsList
    /BREAK=Counselor
    /CaseLoad 'No. of clients for counselor' = NU.

DATASET ACTIVATE CounsList WINDOW=FRONT.

*  .................   Post after this point   ..................... .
*  Sample 25% of the counselors by the 'K/N' method:                 .

COMPUTE NOBREAK = 1.
DATASET DECLARE  CounsCount.
AGGREGATE OUTFILE=CounsCount
    /BREAK=NOBREAK
    /N 'Number of counselors' = NU.
DATASET ACTIVATE CounsCount WINDOW=FRONT.

NUMERIC   K (F3).
VAR LABEL K 'Counselors to sample'.
COMPUTE K = RND(0.25*N).
FORMATS N K (F3).

DATASET ACTIVATE CounsList WINDOW=FRONT.
MATCH FILES
        /FILE =*
        /TABLE=CounsCount
        /BY   NOBREAK.

DO IF $CASENUM EQ 1.
.  COMPUTE #K = K.
.  COMPUTE #N = N.
END IF.

NUMERIC    InSample (F2).
VAR LABELS InSample 'Indicator: Counselor is in sample'.

COMPUTE    InSample = RV.BERNOULLI(#K/#N).
COMPUTE    #K       = #K - InSample.
COMPUTE    #N       = #N - 1.

LIST      Counselor InSample.

DATASET ACTIVATE PartiList WINDOW=FRONT.

*  Attach 'Sampled' flag to participant records                      .
MATCH FILES
    /FILE =PartiList
    /TABLE=CounsList
    /BY   Counselor
    /DROP = NOBREAK K N CaseLoad.


.  /**/ LIST /*-*/.
SELECT IF InSample.
