Random Sample selection for multiple times from the same data file

Random Sample selection for multiple times from the same data file

ajay atluri
Hi Team,

I am working on a data file with 10,000 observations of consumers. We
would like to select a 5% sample from the list and calculate its mean.
We can do this from the Data menu by choosing Select Cases, which has an
option for a random sample.

But I would like to repeat the process 15 to 20 times, drawing from all
the observations each time. Our aim in repeating the process is to compare
the means of the 15 to 20 iterations with the total mean and find the
behavior pattern of consumers.

If anyone on the list knows how to proceed, please help me with this.

I would be grateful if you could share the syntax.


Ajay Atluri

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Random Sample selection for multiple times from the same data file

Mark Palmberg
Can you accomplish this by setting your 5% sample figure and then changing
the seed for each selection?

On Mon, Jul 14, 2008 at 8:09 AM, ajay atluri <[hidden email]> wrote:

> Hi Team,
>
> I am working on a data file with 10,000 observations of consumers. We
> would like to select a 5% sample from the list and calculate its mean.
> We can do this from the Data menu by choosing Select Cases, which has an
> option for a random sample.
>
> But I would like to repeat the process 15 to 20 times, drawing from all
> the observations each time. Our aim in repeating the process is to compare
> the means of the 15 to 20 iterations with the total mean and find the
> behavior pattern of consumers.
>
> If anyone on the list knows how to proceed, please help me with this.
>
> I would be grateful if you could share the syntax.
>
>
> Ajay Atluri
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>

Re: Random Sample selection for multiple times

Richard Ristow
At 09:13 AM 7/14/2008, Mark Palmberg wrote:

>Can you accomplish this by setting your 5% sample figure and then
>changing the seed for each selection?

Whatever you do, I don't recommend that. Random-number generators are
designed to produce a stream of numbers that closely resemble
independent identically-distributed numbers. There's no guarantee
that a succession of streams, created by re-initializing, will meet
this criterion.
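
To illustrate the point about keeping to one stream, here is a minimal
Python sketch (not SPSS syntax) that seeds the generator once and then
draws all 15 samples from the same stream. The data, the seed value and
the names (values, sample_means) are made up for the example; only the
10,000 cases, 5% and 15 repetitions come from the thread.

```python
import random
from statistics import mean

# Seed ONCE, up front. The seed value itself is arbitrary.
rng = random.Random(20080714)

# Hypothetical stand-in for the 10,000 consumer observations.
values = [rng.gauss(100, 15) for _ in range(10_000)]

n_samples = 15
sample_size = len(values) * 5 // 100  # 5% of 10,000 cases = 500

sample_means = []
for _ in range(n_samples):
    # Each draw continues the same random stream; there is no
    # re-seeding between samples.
    sample = rng.sample(values, sample_size)  # without replacement
    sample_means.append(mean(sample))

total_mean = mean(values)
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i:2d}: mean = {m:.2f}  (total mean = {total_mean:.2f})")
```

Each sample is drawn without replacement from the full file, so the 15
samples are independent of one another and may share cases.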

To get 15, say, independent 5% samples from the same dataset, you
could write logic to select one sample, then run that logic
repeatedly within a macro or Python loop. But I'd do it by creating
15 variables, each a flag for whether the case is in that one of the
15 samples; set those flag variables randomly, using the k/n method
or some such; and (probably) VARSTOCASES to create a separate set of
records for each of the 15 samples.
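
The flag-variable idea can be sketched roughly as follows, in Python
rather than SPSS syntax. This assumes "the k/n method" means the usual
sequential scheme: walk through the cases once, and pick each case with
probability (cases still needed) / (cases still remaining). The function
name kn_flags, the seed and the data are hypothetical, and the final
list of (sample, value) records is only an analogue of what VARSTOCASES
would give you.

```python
import random
from statistics import mean

def kn_flags(n_cases, k, rng):
    """Return n_cases 0/1 flags with exactly k ones (sequential k/n selection)."""
    flags = []
    needed, remaining = k, n_cases
    for _ in range(n_cases):
        # Pick this case with probability needed/remaining.
        pick = rng.random() < needed / remaining
        flags.append(1 if pick else 0)
        if pick:
            needed -= 1
        remaining -= 1
    return flags

rng = random.Random(12345)  # one generator, seeded once (arbitrary seed)

# Hypothetical stand-in for the 10,000 consumer observations.
values = [rng.gauss(100, 15) for _ in range(10_000)]

n_samples, k = 15, 500  # 15 samples of 5% each

# One flag "variable" per sample, each with exactly k cases flagged.
flag_columns = [kn_flags(len(values), k, rng) for _ in range(n_samples)]

# Long format: one (sample number, value) record per flagged case.
long_records = [
    (s, v)
    for s, flags in enumerate(flag_columns, start=1)
    for v, f in zip(values, flags)
    if f
]

# Mean of each sample, computed from the long-format records.
means_by_sample = {
    s: mean(v for smp, v in long_records if smp == s)
    for s in range(1, n_samples + 1)
}
print(means_by_sample)
```

The sketch fills one flag column at a time; in SPSS you would more
likely set all 15 flags for a case in a single pass over the data, but
the selection logic per flag is the same.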

(That's for *independent* samples, which commonly overlap. For
*disjoint* samples, like 15 randomly-selected non-overlapping
batches, you use different techniques.)
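
For the disjoint case, one common technique (not necessarily the one
meant above) is to shuffle the case IDs once and cut the shuffled list
into consecutive batches. A minimal Python sketch, with hypothetical
case IDs 1 to 10,000 and an arbitrary seed:

```python
import random

rng = random.Random(2008)  # arbitrary seed, set once

# Hypothetical case IDs for the 10,000 observations.
case_ids = list(range(1, 10_001))
rng.shuffle(case_ids)  # one random permutation of all cases

n_batches, batch_size = 15, 500

# Consecutive slices of the shuffled list cannot overlap.
batches = [
    case_ids[i * batch_size:(i + 1) * batch_size]
    for i in range(n_batches)
]

print([len(b) for b in batches])
```

With 15 batches of 500 this uses 7,500 of the 10,000 cases; the rest
belong to no batch.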

Finally, ajay atluri wrote,

>>[We will] compare the 15 to 20 iterations mean with the total mean
>>and find the behavior pattern of consumers.

I don't at all see that this (which is a kind of bootstrap method)
will do what you want, but good luck to you.

-With best wishes,
  Richard
