How would one write the syntax to quickly create a 300-case dataset containing an ID variable where each case has a unique ID ranging from 1 to 300 and a second variable named VBX for which each case receives a random value ranging from 1 to 25?

I found some code written by Michael Roberts (see below) that will create the id variable, but how do I create the second variable (VBX)?

Here's Michael's code:

INPUT PROGRAM.
LOOP #I=1 TO 300.
COMPUTE id=#I.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
FORMATS id (F3.0).
EXECUTE.

Thanks.
You can add other commands after COMPUTE id, and the loop will cycle through them as well.

** tested.
set seed = 2007020511.
INPUT PROGRAM.
LOOP #I=1 TO 300 .
COMPUTE id=#I.
COMPUTE vbx =mod( rnd(uniform(1)*100),25)+1.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
FORMATS id (F3.0).
EXECUTE.

I drew from the uniform distribution, scaled it to 0-100 and rounded to a whole number, then took the modulus (25) of that and added 1. (The set seed allows you to reproduce the results.)

--jim
In reply to this post by Jim Moffitt
INPUT PROGRAM.
LOOP id=1 TO 300.
COMPUTE vbx = rv.uniform(1,25).
COMPUTE IVBX =RND(VBX).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
FORMATS id (F3.0) vbx (f7.4) IVBX(F2).
FREQUENCIES VARS= IVBX.

Is this what you need?

Art Kendall
Social Research Consultants
In reply to this post by Jim Moffitt
There are a variety of random number functions, for example:

compute vbx=rv.uniform(1,25).

will create continuous random uniform values between 1 and 25. If you want to restrict the values to integers:

compute vbx=rnd(rv.uniform(1,25)).

You can include this command in the input program (before the End Case command) or after the input program. For more information on random number functions, search for "random number functions" or "random variable functions" in the help.
In reply to this post by Marks, Jim
I think compute vbx=rnd(rv.uniform(1,25)) will produce the same result with slightly less code.
In reply to this post by Art Kendall-2
Thanks, Art. This will do the trick.
In reply to this post by Marks, Jim
Thanks, Jim. This works fine.
In reply to this post by Jim Moffitt
Colleagues

I am struggling to combine IF and COUNT. My objective is to create a separate new variable to record the frequency of a score as 0, 1 or 2 that may occur across 30 existing variables.

The syntax I tried fails at the first line and no rewrites this morning have overcome the problem.

IF ( A01_ROUND THRU A30_ROUND = 0) A_ROUND_RESPONSEO1 = COUNT A01_ROUND THRU A30_ROUND (0) .
IF ( A01_ROUND THRU A30_ROUND = 1) A_ROUND_RESPONSEO1 = COUNT A01_ROUND THRU A30_ROUND (1) .
IF ( A01_ROUND THRU A30_ROUND = 2) A_ROUND_RESPONSEO1 = COUNT A01_ROUND THRU A30_ROUND (2) .
EXECUTE .

Suggestions greatly appreciated.

Warm regards/gary
Gary,

I'd do (not tested):

Do IF ( A01_ROUND THRU A30_ROUND = 0).
count A_ROUND_RESPONSEO1 = A01_ROUND THRU A30_ROUND (0) .
end if.
Do IF ( A01_ROUND THRU A30_ROUND = 1).
count A_ROUND_RESPONSEO1 = A01_ROUND THRU A30_ROUND (1).
End if.
Do IF ( A01_ROUND THRU A30_ROUND = 2).
count A_ROUND_RESPONSEO1 = A01_ROUND THRU A30_ROUND (2).
End if.

HTH,
Judith
In reply to this post by Gary Oliver
At 08:01 PM 2/5/2007, Gary Oliver wrote:
>The syntax I tried [reformatted, below] fails at the first line and no
>rewrites this morning have overcome the problem.

>IF ( A01_ROUND THRU A30_ROUND = 0)
>   A_ROUND_RESPONSEO1 = COUNT A01_ROUND THRU A30_ROUND (0) .
>IF ( A01_ROUND THRU A30_ROUND = 1)
>   A_ROUND_RESPONSEO1 = COUNT A01_ROUND THRU A30_ROUND (1) .
>IF ( A01_ROUND THRU A30_ROUND = 2)
>   A_ROUND_RESPONSEO1 = COUNT A01_ROUND THRU A30_ROUND (2) .

Skipping the syntax for a moment, I'm not sure what you do want. You write,

>to create a separate new variable to record the frequency of a score
>as 0, 1 or 2 that may occur across 30 existing variables.

If you mean you want, separately, the number of 0's, 1's, and 2's in the list of variables, then (a) you need the results in three variables, not the one variable A_ROUND_RESPONSEO1; and (b) you don't need IF at all. Something like this (not tested); note each COUNT calculates a different variable:

COUNT A_ROUND_RESPONSEO0 = A01_ROUND TO A30_ROUND (0) .
COUNT A_ROUND_RESPONSEO1 = A01_ROUND TO A30_ROUND (1) .
COUNT A_ROUND_RESPONSEO2 = A01_ROUND TO A30_ROUND (2) .

Your syntax problem is, I think, the expressions like "A01_ROUND THRU A30_ROUND = 0". That isn't valid, but you've found that out. Note too that a list of adjacent variables is written with TO; THRU is for ranges of values.

But if the above (where you don't need IF at all) isn't what you wanted, I'm at a loss as to what you mean, so I can't say how to fix it. It looks like you're trying to say "all of the 30 variables are 0" or "any one of the 30 variables is 0". For both of these, see below. But I don't see your computations making sense in either case.

If you mean "all the variables are 0", etc., variable A_ROUND_RESPONSEO1 will be 30 if all 30 variables have the same one of the values 0, 1, or 2, and will be unchanged (probably SYSMIS) otherwise.

If you mean "any of the variables is 0", etc., variable A_ROUND_RESPONSEO1 will be the number of 0's, 1's, or 2's in your list of 30 variables; but it will be the count of the last (largest) of these values which occurs at all, with no indication whether it's 0's, 1's, or 2's that are counted.

..............................
APPENDIX (code not tested):

Test that *all* of A01_ROUND to A30_ROUND are 1:

IF (    MIN(A01_ROUND TO A30_ROUND) EQ 1
    AND MAX(A01_ROUND TO A30_ROUND) EQ 1)

(Thanks, Jan Spousta.)

Be careful: this will test 'true' if any of the 30 variables are missing, as long as all the non-missing ones have value 1. If you want the test to be 'false' if there are missing values, change to

IF (    MIN  (A01_ROUND TO A30_ROUND) EQ 1
    AND MAX  (A01_ROUND TO A30_ROUND) EQ 1
    AND NMISS(A01_ROUND TO A30_ROUND) EQ 0)

Test that *any* of A01_ROUND to A30_ROUND is 1:

IF ANY(1,A01_ROUND TO A30_ROUND)
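A minimal sketch of the COUNT approach and the all/any tests described in this post, using three hypothetical items (q1 to q3) in place of the 30 A.._ROUND variables. The data values and the names n0, n1, n2, all1 and any1 are made up for illustration and are not from the original poster's file:

```spss
* Hypothetical three-item example; the names q1 to q3 are illustrative only.
DATA LIST FREE / q1 q2 q3.
BEGIN DATA
0 1 2
1 1 1
2 0 0
END DATA.

* One COUNT command per target variable, so each count is kept separately.
COUNT n0 = q1 TO q3 (0).
COUNT n1 = q1 TO q3 (1).
COUNT n2 = q1 TO q3 (2).

* Flag cases where every item equals 1 and none is missing.
COMPUTE all1 = (MIN(q1 TO q3) EQ 1 AND MAX(q1 TO q3) EQ 1 AND NMISS(q1 TO q3) EQ 0).

* Flag cases where at least one item equals 1.
COMPUTE any1 = ANY(1, q1 TO q3).

FORMATS n0 n1 n2 all1 any1 (F1.0).
LIST.
```

For the second case (1 1 1), n1 should come out as 3 and all1 as 1; the first case has one of each value, so its three counts should all be 1 and all1 should be 0.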
COUNT freq0 = A01_ROUND TO A30_ROUND (0).
COUNT freq1 = A01_ROUND TO A30_ROUND (1).
COUNT freq2 = A01_ROUND TO A30_ROUND (2).
In reply to this post by Jim Moffitt
At 05:54 PM 2/5/2007, Jim Moffitt wrote:
>How would one create a 300-case dataset where each case has a unique
>ID Ranging from 1 to 300 and a second variable named VBX which
>receives a random value ranging from 1 to 25?

Responders solved this as a programming problem, adding the second variable in the INPUT PROGRAM.

The following computations were suggested by different posters for calculating the "random values ranging from 1 to 25." All posters assumed integers were wanted, which I accept as well.

"Random ... ranging from 1 to 25" in principle allows any distribution, but is commonly taken to mean equi-distribution, that all values are equally likely.

One suggested calculation doesn't meet this criterion:
.  compute vbx=rnd(RV.uniform(1,25)).
Values 1 and 25 are only half as likely to be selected as are the other 23 values.

Another looks as though it should have the same problem to a lesser degree, but I think it comes out even:
.  COMPUTE vbx =mod( rnd(uniform(1)*100),25)+1.
That's the above logic, but applied to the range 0-100 and then 'wrapped', using the MOD function, down to 1-25. Only two of the 101 rounded values (0 and 100) have low probability of being selected, and both wrap to the same result, 1; so the two half-weights add up to one full weight, and each value in the range 1-25 is equally likely.

I recommend, as being simple and giving the desired equi-distribution,
.  COMPUTE VBX = TRUNC(RV.UNIFORM(1,26)).

The following is SPSSX 14 draft output, with some hand editing:

NEW FILE.
INPUT PROGRAM.
.  COMPUTE #N_CASES = 10000.
.  NUMERIC ID (F5)
          /VBX_MOD VBX_RND VBX_TRNC (F3).
.  LOOP ID = 1 TO #N_CASES.
.     COMPUTE VBX_MOD  = mod( rnd(uniform(1)*100),25)+1.
.     COMPUTE VBX_RND  = rnd(RV.uniform(1,25)).
.     COMPUTE VBX_TRNC = TRUNC(RV.UNIFORM(1,26)).
.     END CASE.
.  END LOOP.
END FILE.
END INPUT PROGRAM.
FREQUENCIES VBX_MOD VBX_RND VBX_TRNC.

Frequencies
|--------------------------|------------------------|
|Output Created            |06-FEB-2007 01:49:34    |
|--------------------------|------------------------|
Statistics [suppressed - no missing data]

Frequency Table

VBX_MOD
|-----|-----|---------|-------|-------------|---------------|
|     |     |Frequency|Percent|Valid Percent|Cumulative     |
|     |     |         |       |             |Percent        |
|-----|-----|---------|-------|-------------|---------------|
|Valid|1    |385      |3.9    |3.9          |3.9            |
|     |2    |382      |3.8    |3.8          |7.7            |
|     |3    |416      |4.2    |4.2          |11.8           |
|     |4    |377      |3.8    |3.8          |15.6           |
|     |5    |401      |4.0    |4.0          |19.6           |
|     |-----|---------|-------|-------------|---------------|
|     |6    |393      |3.9    |3.9          |23.5           |
|     |7    |424      |4.2    |4.2          |27.8           |
|     |8    |404      |4.0    |4.0          |31.8           |
|     |9    |398      |4.0    |4.0          |35.8           |
|     |10   |420      |4.2    |4.2          |40.0           |
|     |-----|---------|-------|-------------|---------------|
|     |11   |375      |3.8    |3.8          |43.8           |
|     |12   |450      |4.5    |4.5          |48.3           |
|     |13   |428      |4.3    |4.3          |52.5           |
|     |14   |396      |4.0    |4.0          |56.5           |
|     |15   |364      |3.6    |3.6          |60.1           |
|     |-----|---------|-------|-------------|---------------|
|     |16   |412      |4.1    |4.1          |64.3           |
|     |17   |402      |4.0    |4.0          |68.3           |
|     |18   |381      |3.8    |3.8          |72.1           |
|     |19   |419      |4.2    |4.2          |76.3           |
|     |20   |376      |3.8    |3.8          |80.0           |
|     |-----|---------|-------|-------------|---------------|
|     |21   |397      |4.0    |4.0          |84.0           |
|     |22   |411      |4.1    |4.1          |88.1           |
|     |23   |420      |4.2    |4.2          |92.3           |
|     |24   |353      |3.5    |3.5          |95.8           |
|     |25   |416      |4.2    |4.2          |100.0          |
|     |-----|---------|-------|-------------|---------------|
|     |Total|10000    |100.0  |100.0        |               |
|-----|-----|---------|-------|-------------|---------------|

VBX_RND
|-----|-----|---------|-------|-------------|---------------|
|     |     |Frequency|Percent|Valid Percent|Cumulative     |
|     |     |         |       |             |Percent        |
|-----|-----|---------|-------|-------------|---------------|
|Valid|1    |207      |2.1    |2.1          |2.1            |
|     |2    |426      |4.3    |4.3          |6.3            |
|     |3    |432      |4.3    |4.3          |10.7           |
|     |4    |399      |4.0    |4.0          |14.6           |
|     |5    |418      |4.2    |4.2          |18.8           |
|     |-----|---------|-------|-------------|---------------|
|     |6    |422      |4.2    |4.2          |23.0           |
|     |7    |420      |4.2    |4.2          |27.2           |
|     |8    |415      |4.2    |4.2          |31.4           |
|     |9    |421      |4.2    |4.2          |35.6           |
|     |10   |424      |4.2    |4.2          |39.8           |
|     |-----|---------|-------|-------------|---------------|
|     |11   |403      |4.0    |4.0          |43.9           |
|     |12   |385      |3.9    |3.9          |47.7           |
|     |13   |417      |4.2    |4.2          |51.9           |
|     |14   |436      |4.4    |4.4          |56.3           |
|     |15   |397      |4.0    |4.0          |60.2           |
|     |-----|---------|-------|-------------|---------------|
|     |16   |413      |4.1    |4.1          |64.4           |
|     |17   |411      |4.1    |4.1          |68.5           |
|     |18   |393      |3.9    |3.9          |72.4           |
|     |19   |408      |4.1    |4.1          |76.5           |
|     |20   |419      |4.2    |4.2          |80.7           |
|     |-----|---------|-------|-------------|---------------|
|     |21   |422      |4.2    |4.2          |84.9           |
|     |22   |387      |3.9    |3.9          |88.8           |
|     |23   |434      |4.3    |4.3          |93.1           |
|     |24   |464      |4.6    |4.6          |97.7           |
|     |25   |227      |2.3    |2.3          |100.0          |
|     |-----|---------|-------|-------------|---------------|
|     |Total|10000    |100.0  |100.0        |               |
|-----|-----|---------|-------|-------------|---------------|

VBX_TRNC
|-----|-----|---------|-------|-------------|---------------|
|     |     |Frequency|Percent|Valid Percent|Cumulative     |
|     |     |         |       |             |Percent        |
|-----|-----|---------|-------|-------------|---------------|
|Valid|1    |391      |3.9    |3.9          |3.9            |
|     |2    |441      |4.4    |4.4          |8.3            |
|     |3    |407      |4.1    |4.1          |12.4           |
|     |4    |407      |4.1    |4.1          |16.5           |
|     |5    |399      |4.0    |4.0          |20.5           |
|     |-----|---------|-------|-------------|---------------|
|     |6    |369      |3.7    |3.7          |24.1           |
|     |7    |393      |3.9    |3.9          |28.1           |
|     |8    |404      |4.0    |4.0          |32.1           |
|     |9    |390      |3.9    |3.9          |36.0           |
|     |10   |416      |4.2    |4.2          |40.2           |
|     |-----|---------|-------|-------------|---------------|
|     |11   |398      |4.0    |4.0          |44.2           |
|     |12   |390      |3.9    |3.9          |48.1           |
|     |13   |408      |4.1    |4.1          |52.1           |
|     |14   |396      |4.0    |4.0          |56.1           |
|     |15   |392      |3.9    |3.9          |60.0           |
|     |-----|---------|-------|-------------|---------------|
|     |16   |425      |4.3    |4.3          |64.3           |
|     |17   |435      |4.4    |4.4          |68.6           |
|     |18   |379      |3.8    |3.8          |72.4           |
|     |19   |418      |4.2    |4.2          |76.6           |
|     |20   |401      |4.0    |4.0          |80.6           |
|     |-----|---------|-------|-------------|---------------|
|     |21   |413      |4.1    |4.1          |84.7           |
|     |22   |385      |3.9    |3.9          |88.6           |
|     |23   |355      |3.6    |3.6          |92.1           |
|     |24   |404      |4.0    |4.0          |96.2           |
|     |25   |384      |3.8    |3.8          |100.0          |
|     |-----|---------|-------|-------------|---------------|
|     |Total|10000    |100.0  |100.0        |               |
|-----|-----|---------|-------|-------------|---------------|
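
The pattern in these tables can also be worked out by hand. A sketch, assuming RV.UNIFORM(a,b) is a continuous uniform draw between a and b, UNIFORM(1) is a continuous uniform draw between 0 and 1, and RND rounds to the nearest integer; the symbols U, V, U' and W are introduced here for illustration only:

```latex
\begin{align*}
U &\sim \mathrm{Uniform}(1,25): &
P\bigl(\mathrm{RND}(U)=k\bigr) &=
  \begin{cases}
    \dfrac{0.5}{24} = \dfrac{1}{48} \approx 2.1\% & k = 1 \text{ or } k = 25,\\[1.5ex]
    \dfrac{1}{24} \approx 4.2\% & k = 2,\dots,24,
  \end{cases}\\[1ex]
V &\sim \mathrm{Uniform}(1,26): &
P\bigl(\mathrm{TRUNC}(V)=k\bigr) &= \frac{1}{25} = 4\%, \qquad k = 1,\dots,25,\\[1ex]
W &= \mathrm{RND}(100\,U'),\quad U' \sim \mathrm{Uniform}(0,1): &
P\bigl(\mathrm{MOD}(W,25)+1 = 1\bigr) &= \frac{1}{200} + \frac{3}{100} + \frac{1}{200} = \frac{1}{25}.
\end{align*}
```

In the last line, the result 1 comes from W = 0, 25, 50, 75 or 100, and the two endpoints 0 and 100 each carry half the weight of an interior value. Under these assumptions the figures agree with the simulated counts: roughly 2% each for 1 and 25 under VBX_RND, and roughly 4% for every value under VBX_TRNC and VBX_MOD.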
Richard is right. It should be
COMPUTE IVBX = TRUNC(RV.UNIFORM(1,26)).

Art
Just to show that there is more than one way to skin a cat, try:

INPUT PROGRAM.
LOOP id=1 TO 30000.
COMPUTE IVBX = rnd(rv.uniform(.5,25.5)).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
FORMATS id (F5.0) IVBX(F2).
FREQUENCIES VARS= IVBX.

Art Kendall
Social Research Consultants
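
Pulling the thread together for the original 300-case question, a minimal sketch that builds the dataset with both equi-distributed formulas discussed here. The seed value and the variable name vbx_alt are illustrative choices, not from any of the posts:

```spss
* Sketch only: the seed value and the name vbx_alt are illustrative choices.
SET SEED = 20070205.
INPUT PROGRAM.
LOOP id = 1 TO 300.
* TRUNC of a draw between 1 and 26 gives the integers 1 to 25.
COMPUTE vbx = TRUNC(RV.UNIFORM(1, 26)).
* Rounding a draw between 0.5 and 25.5 gives the same range.
COMPUTE vbx_alt = RND(RV.UNIFORM(0.5, 25.5)).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
FORMATS id (F3.0) vbx vbx_alt (F2.0).
FREQUENCIES VARIABLES = vbx vbx_alt.
```

With only 300 cases the frequency counts will be noisy, so raising the loop limit is the easier way to see that both variables spread evenly over 1 to 25.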