SPSSX Discussion

random sampling repeat 1000 times

lscfung

random sampling repeat 1000 times

I would like to randomly sample 25 numbers from a variable id that takes the values 1 to 100
(i.e. id = 1, 2, 3, 4, ..., 100).

For one sample, I can simply use this SPSS syntax:

sample 25 from 100.

save outfile='d:\random.sav'.

Then these 25 random numbers will be saved under the filename 'random.sav'.

But I need to do this 1000 times. Is there any way to use a LOOP command to perform the task and save the resulting 25 x 1000 random numbers to a file?

Your help is highly appreciated, thank you very much.

Linda

Art Kendall

Re: random sampling repeat 1000 times

Open a new instance of SPSS. Copy the syntax below into a syntax
window and run it.

Is this what you want to do?

set seed = 20090128.
input program.
loop myset = 1 to 1000.
leave myset.
loop draw = 1 to 25.
COMPUTE x =rnd(rv.uniform(.5,100.5)).
end case.
end loop.
end loop.
end file.
end input program.
format myset (f4) draw(f2) x(f3).
execute.
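
One thing worth noting about the syntax above: each value of x is drawn independently, so the same number can turn up more than once within a set (that is, sampling with replacement). A small follow-up sketch, assuming you want to check the range of x and then keep the 25 x 1000 values in a file; the file name d:\random1000.sav is only a placeholder, not something from the original post:

```spss
* Check that x stays within 1 to 100, then save the generated values.
* The file name below is a placeholder.
FREQUENCIES VARIABLES=x /FORMAT=NOTABLE /STATISTICS=MINIMUM MAXIMUM.
SAVE OUTFILE='d:\random1000.sav'.
```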

Art Kendall
Social Research Consultants


soultogi

Re: random sampling repeat 1000 times

I need to sample from an existing dataset, but I couldn't write the syntax (I'm not familiar with the code).

As far as I understand, you cannot use the "sample" command inside loop-end loop or do repeat-end repeat.

Is there any way to draw a sample 100 times, rather than generating numbers?

Please help...

Bruce Weaver

Re: random sampling repeat 1000 times

I assume you want to randomly sample 25 of 100 without replacement each time, is that right?

How about this?

* Generate a data file that has 1000 stacked copies of
* ID numbers 1-100.

INPUT PROGRAM .
LOOP SAMP=1 to 1000.
LOOP ID = 1 to 100.
END CASE.
LEAVE SAMP.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM .

* Compute a random number on which to order the cases
* within each sample. Then sort on SAMP and the random
* number.

compute x = rv.uniform(0,1).
sort cases by samp x.

* Now number the cases within each value of SAMP,
* then keep only the first 25 .

do if ($casenum EQ 1) or (samp NE lag(samp)).
- compute case = 1.
else.
- compute case = lag(case) + 1.
end if.
select if case LE 25.

* Verify that maximum value of CASE is 25,
* and that the number of samples = 1000.

descriptives case samp.

* CASE and X (the random variable) are no longer needed.

delete variables x case.

* Merge this dataset with one containing the same
* ID codes and other variables of interest.
* Both files must be sorted on ID, of course.

SORT CASES BY ID .
MATCH FILES / FILE * / TABLE 'Main File.SAV' / BY ID .

* After merging, sort by SAMP, and SPLIT FILE by SAMP, etc.
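
A rough sketch of that last step, assuming the merged file holds two analysis variables. The names var1 and var2 are hypothetical (they do not appear in the original question), and CORRELATIONS is just one example of a procedure you might run per sample:

```spss
* Hypothetical variable names var1 and var2. Substitute your own.
SORT CASES BY samp.
SPLIT FILE BY samp.
CORRELATIONS /VARIABLES=var1 var2.
SPLIT FILE OFF.
```

With SPLIT FILE in effect, the procedure is run once for each value of SAMP, so you get one set of results per sample.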

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

soultogi

Re: random sampling repeat 1000 times

Thanks for the prompt reply...

I didn't understand your syntax 100%, I'm still at about 70%, but let me explain myself...

I have two variables with a correlation of .6, and I need to draw samples as in this code: "sample(10 FROM 45)"... and I need to repeat this 100 times...

Then I need to show the sample distribution...

Bruce Weaver

Re: random sampling repeat 1000 times

When I first read your request, I thought you wanted to draw samples WITHOUT replacement. But now I'm beginning to think that you are trying to do bootstrapping--and if I'm not mistaken, that entails sampling WITH replacement.

Okay, here's what I *think* you want to do:

1. You want to draw a large number of bootstrap samples (i.e., sampling with replacement) from your data file.

2. For each of those bootstrap samples, you want to compute r.

3. Then you want to plot a histogram of the r-values from all your bootstrap samples (this will be your empirical sampling distribution).

You might also want to work out the values that cut off 2.5% of the area in each tail of that sampling distribution.

How am I doing? If this is correct, see the following, which gives more or less the same solution provided by Art, I think. It goes a bit further showing that after you generate your bootstrap samples, you then have to merge that file of samples with the original data set. Then you could go on to use SPLIT FILE to get the desired statistic (Pearson r in your case) for each bootstrap sample. And if you use OMS, you can write them all to another dataset, which you can then use to plot the sampling distribution.

http://groups.google.com/group/sci.stat.consult/msg/710ea4ab83ddf24a?dmode=source
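
In case the link goes stale, here is a minimal sketch of the steps described above, under several assumptions that are mine rather than from your post: the original file is called 'Main File.SAV', it has 45 cases with ID = 1 to 45 and is sorted by ID, the two variables are named var1 and var2, and the results go to a placeholder file 'boot_r.sav'. Treat it as an outline to adapt, not a tested solution for your data.

```spss
* Draw 100 samples of 10 IDs each, WITH replacement.
* TRUNC of a uniform on 1 to 46 gives each ID from 1 to 45 equal probability.
SET SEED = 20090128.
INPUT PROGRAM.
LOOP samp = 1 TO 100.
LEAVE samp.
LOOP draw = 1 TO 10.
COMPUTE id = TRUNC(RV.UNIFORM(1,46)).
END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.

* Attach the data values to each sampled ID.
* The table file must be sorted by ID.
SORT CASES BY id.
MATCH FILES /FILE=* /TABLE='Main File.SAV' /BY id.

* Compute r within each sample and capture the tables with OMS.
SORT CASES BY samp.
SPLIT FILE BY samp.
OMS /SELECT TABLES /IF COMMANDS=['Correlations'] SUBTYPES=['Correlations']
  /DESTINATION FORMAT=SAV OUTFILE='boot_r.sav'.
CORRELATIONS /VARIABLES=var1 var2.
OMSEND.
SPLIT FILE OFF.
```

The layout of 'boot_r.sav' follows the Correlations pivot table, so you would probably still need to select the rows that hold the Pearson correlation before plotting. Once you have one r per sample in a single variable, FREQUENCIES with /HISTOGRAM and /PERCENTILES=2.5 97.5 is one way to get the histogram and the tail cut-offs.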

HTH.

soultogi

Re: random sampling repeat 1000 times

Thanks very much, that helped a lot...