Random Cuts


Jim Moffitt
I have an SPSS 14.0 data file comprising 31,086 cases. Each case
contains a 2-character state code in a string variable named ZBRegion. I
want to reduce the number of cases from a given state. For example, I
currently have 335 cases from AZ. I'd like to randomly select 127 of
those AZ cases, cut them from my original data file, and paste them into
a file entitled ReductionCuts.

 

Can anyone propose a macro that will accomplish my objective? I would
like to be able to modify the code to make cuts from additional states
and deposit those cases in the file entitled ReductionCuts.

 

Thanks for your help. It's greatly appreciated.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Random Cuts

A Seifert-2
Hi Jim,

Not sure how to do it in syntax, but this website gives a step-by-step approach to doing it the point-and-click way.

http://ssc.utexas.edu/consulting/answers/spss/spss17.html

Hope all is well!
April


Re: Random Cuts

Peck, Jon
Drawing a sample like this is one of the things the Complex Samples module makes easy.  It is focused on selecting cases rather than discarding them, but that is basically the same problem.

That option also accounts for the sampling scheme in analyzing the data, in case you are planning to do that.

HTH,
Jon Peck


Re: Random Cuts

King Douglas
In reply to this post by Jim Moffitt
Jim,

There are a number of ways to do this, and you don't need macros or the Complex Samples module or Python to do it (although Python could help, depending on what you want to do).

So you have cases representing 50 states and you want to be able to randomly reduce the number of cases, which will vary by state, by writing a random selection to another file?  When you say, "paste them into a file...," do you mean append them to an existing file?  Is the existing file an SPSS .sav file?

And some states would not have the number of cases reduced?

And this would vary from time to time?

King Douglas
American Airlines Customer Research






Re: Random Cuts

Richard Ristow
In reply to this post by Jim Moffitt
At 01:50 PM 3/10/2008, Jim Moffitt wrote:

>I have an SPSS 14 data file. Each case contains a 2-character state
>code in a string variable named ZBRegion. I want to reduce the
>number of cases from a given state. For example, I currently have
>335 cases from AZ. I'd like to randomly select 127 of those AZ
>cases, cut them from my original data file, and paste them into a
>file entitled ReductionCuts.

The question is, how many do you want to keep from each state? If you
want to keep a fixed number, it's easy. If you want to keep a fixed
percent, it's slightly more difficult. Anyway, here's logic that
keeps 127 from each state. It's not tested, but it's taken from
previously-posted code that was tested(*). It uses what's called the
'k/n' algorithm -- you'll see why, in the code:

DATASET COPY     ReductionCuts.
DATASET ACTIVATE ReductionCuts WINDOW=FRONT.

SORT CASES BY ZBRegion /* if necessary */.

*  Set random-number generator parameters, if desired   .
SET RNG = MT       /* 'Mersenne twister' random-no. generator */ .
SET MTINDEX = 7778 /* or other starting value - anything      */ .

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
    /BREAK=ZBRegion
    /NRecords 'Number of records for state'=NU.

NUMERIC   #K #N (F3).

DO IF   $CASENUM EQ 1
      OR ZBRegion NE LAG(ZBRegion) /* first case of a state */.
.  COMPUTE #N = NRecords  /* Total records,    per state */.
.  COMPUTE #K = MIN(127, NRecords) /* Sample size, capped at #N */.
END IF.

COMPUTE #Take_It = RV.BERNOULLI(#K/#N).
COMPUTE #K = #K - #Take_It.
COMPUTE #N = #N - 1.

SELECT IF #Take_It.
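
For anyone who wants to see why the k/n rule always yields exactly the requested number of cases, here is a minimal Python sketch of the same logic, outside SPSS. The function name kn_split, the (state, id) tuple layout, and the seed are assumptions made for illustration only; none of them come from the posts above. Unlike the syntax above, the sketch also hands back the cases that were not selected, and it tracks the counters per state in dictionaries, so the input does not have to be sorted.

```python
import random
from collections import Counter


def kn_split(cases, key, sample_size, rng):
    """Split cases into (selected, remaining) with the k/n rule, per group.

    Each case is taken with probability k/n, where k is the number still
    needed for its group and n is the number of that group's cases not yet
    examined (including the current one).
    """
    left = Counter(key(c) for c in cases)            # n per group
    need = {g: min(sample_size, t) for g, t in left.items()}  # k per group
    selected, remaining = [], []
    for case in cases:
        g = key(case)
        take = rng.random() < need[g] / left[g]      # Bernoulli(k/n)
        (selected if take else remaining).append(case)
        need[g] -= take                              # one fewer needed if taken
        left[g] -= 1                                 # one fewer left either way
    return selected, remaining


if __name__ == "__main__":
    # Hypothetical data: 335 AZ cases and 50 NV cases.
    data = [("AZ", i) for i in range(335)] + [("NV", i) for i in range(50)]
    cut, kept = kn_split(data, key=lambda c: c[0], sample_size=127,
                         rng=random.Random(7778))
    az_cut = sum(1 for c in cut if c[0] == "AZ")
    az_kept = sum(1 for c in kept if c[0] == "AZ")
    print(az_cut, az_kept)  # prints 127 208
```

The count is exact because once k equals n the probability is 1, so every remaining case is taken, and once k reaches 0 the probability is 0, so no more are taken. A state with fewer cases than the sample size (NV here) simply has all of its cases selected.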

........................
(*)Date:  Wed, 16 Jan 2008 03:34:19 -0500
From:     Richard Ristow <[hidden email]>
Subject:  Re: drawing samples for hundreds of workers
To:       [hidden email]