I have an SPSS 14.0 data file comprised of 31,086 cases. Each case contains a 2-character state code in a string variable named ZBRegion. I want to reduce the number of cases from a given state. For example, I currently have 335 cases from AZ. I'd like to randomly select 127 of those AZ cases, cut them from my original data file, and paste them into a file entitled ReductionCuts.
Can anyone propose a macro that will accomplish my objective? I would like to be able to modify the code to make cuts from additional states and deposit those cases in the file entitled ReductionCuts.
Thanks for your help. It's greatly appreciated.
====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
|
Hi Jim,
Not sure how to do it in syntax, but this website gives a step-by-step approach to doing it the point-and-click way.
http://ssc.utexas.edu/consulting/answers/spss/spss17.html
Hope all is well!
April

----- Original Message ----
From: Jim Moffitt <[hidden email]>
To: [hidden email]
Sent: Monday, March 10, 2008 10:50:51 AM
Subject: Random Cuts
|
Drawing a sample like this is one of the things the Complex Samples module makes easy. It is focused on selecting cases rather than discarding them, but that is basically the same problem.
That option also accounts for the sampling scheme in analyzing the data, in case you are planning to do that.
HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of A Seifert
Sent: Monday, March 10, 2008 1:06 PM
To: [hidden email]
Subject: Re: [SPSSX-L] Random Cuts
|
In reply to this post by Jim Moffitt
Jim,
There are a number of ways to do this, and you don't need macros, the Complex Samples module, or Python to do it (although Python could help, depending on what you want to do).
So you have cases representing 50 states, and you want to randomly reduce the number of cases by an amount that varies by state, writing the randomly selected cases to another file?
When you say, "paste them into a file...," do you mean append them to an existing file? Is the existing file an SPSS .sav file? Would some states not have their number of cases reduced? And would this vary from time to time?
King Douglas
American Airlines Customer Research
|
In reply to this post by Jim Moffitt
At 01:50 PM 3/10/2008, Jim Moffitt wrote:
>I have an SPSS 14 data file. Each case contains a 2-character state
>code in a string variable named ZBRegion. I want to reduce the
>number of cases from a given state. For example, I currently have
>335 cases from AZ. I'd like to randomly select 127 of those AZ
>cases, cut them from my original data file, and paste them into a
>file entitled ReductionCuts.

The question is, how many do you want to keep from each state? If you want to keep a fixed number, it's easy. If you want to keep a fixed percent, it's slightly more difficult.

Anyway, here's logic that keeps 127 from each state. It's not tested, but it's taken from previously-posted code that was tested(*). It uses what's called the 'k/n' algorithm -- you'll see why, in the code:

DATASET COPY ReductionCuts.
DATASET ACTIVATE ReductionCuts WINDOW=FRONT.
SORT CASES BY ZBRegion /* if necessary */.

* Set random-number generator parameters, if desired .
SET RNG = MT /* 'Mersenne twister' random-no. generator */ .
SET MTINDEX = 7778 /* or other starting value - anything */ .

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=ZBRegion
   /NRecords 'Number of records for state'=NU.

NUMERIC #K #N (F3).
DO IF $CASENUM EQ 1 OR ZBRegion NE LAG(ZBRegion).
.  COMPUTE #N = NRecords /* Total records, per state */.
.  COMPUTE #K = 127 /* Set sample size here */.
END IF.

COMPUTE #Take_It = RV.BERNOULLI(#K/#N).
COMPUTE #K = #K - #Take_It.
COMPUTE #N = #N - 1.
SELECT IF #Take_It.
........................
(*) Date: Wed, 16 Jan 2008 03:34:19 -0500
From: Richard Ristow <[hidden email]>
Subject: Re: drawing samples for hundreds of workers
To: [hidden email]
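For anyone who wants to see the 'k/n' rule outside SPSS, here is a minimal Python sketch of the same idea, applied to the original request (move exactly 127 AZ cases into a separate set of cuts and keep everything else). It is only an illustration, not anyone's posted solution: the function name `split_random_cuts`, the list-of-dicts layout, and the toy data with a second state "NM" are all assumptions made for the example, and Python's generator will not reproduce SPSS's random-number stream even with the same seed.

```python
import random


def split_random_cuts(cases, state, k, key="ZBRegion", rng=None):
    """Split cases into (kept, cuts), moving exactly k random cases of `state` to cuts.

    Hypothetical helper: each matching case is taken with probability
    (cuts still needed) / (matching cases not yet examined), i.e. k/n.
    """
    rng = rng or random.Random()
    n = sum(1 for case in cases if case[key] == state)  # matching cases remaining
    if k > n:
        raise ValueError(f"cannot cut {k} cases: only {n} cases for {state!r}")
    kept, cuts = [], []
    for case in cases:
        if case[key] != state:
            kept.append(case)        # other states are never touched
            continue
        if rng.random() < k / n:     # k == n forces a take, k == 0 forbids one
            cuts.append(case)
            k -= 1
        else:
            kept.append(case)
        n -= 1
    return kept, cuts


# Toy data (assumed): 335 AZ cases followed by 65 NM cases.
cases = [{"id": i, "ZBRegion": "AZ" if i < 335 else "NM"} for i in range(400)]
kept, cuts = split_random_cuts(cases, "AZ", 127, rng=random.Random(7778))
```

Because the probability is recomputed after every case, the count always comes out to exactly k, which a fixed-probability draw would not guarantee. Calling the function again on `kept` with another state and count, and appending the new cuts to the earlier ones, would extend the cuts to additional states.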
