How to draw a random sample for 30 cities, 50 retailers?

Sanjay Seth, MBA, Ph.D.
Hello Everybody,

I have a file with 50,000 customer IDs. The customers belong to 30 different
cities and buy products from 50 different retailers. We want to draw a
sample of only 5,000 customers (10%) for a survey, but the sample should
reflect the same proportions of cities and retailers as the population.

Suppose city A has a 5% share of the population (50,000); then the random
sample (5,000) should also have 5% of its customers from city A. Also, within
city A, the customers of all retailers should be included in the same
proportions as in the population. It is possible that some retailers do not
operate in some cities.

I learnt that "Complex Samples" could be used. I would appreciate it if
somebody could tell me how this simple sample could be drawn using the
"Complex Samples" module.

ID      Cities  Retailers

1       1       1
2       1       1
3       1       2
4       1       2
5       1       3
6       1       3
7       1       3
8       2       1
9       2       1
10      2       2
...
399998  29      1
399999  30      49
400000  30      50


Any pointer would be highly appreciated.

I have reviewed the tutorial but could not find any useful example.

It appears that the "cities" variable could be used as "strata" and
"retailers" as "clusters", but how do I specify that 10% (only 5,000) be
selected from the total population (50,000) while keeping the population
proportions for cities and retailers (within cities)?

I look forward to hearing from the experts soon.


Best regards,
Sasa

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: How to draw a random sample for 30 cities, 50 retailers?

Maguin, Eugene
Sanjay,

>>I have a file with 50,000 customer IDs. The customers belong to 30 different
>>cities and buy products from 50 different retailers. We want to draw a
>>sample of only 5,000 customers (10%) for a survey, but the sample should
>>reflect the same proportions of cities and retailers as the population.
>>
>>Suppose city A has a 5% share of the population (50,000); then the random
>>sample (5,000) should also have 5% of its customers from city A. Also, within
>>city A, the customers of all retailers should be included in the same
>>proportions as in the population. It is possible that some retailers do not
>>operate in some cities.

I'll assume that cities are numbered 1-30 and retailers are numbered 1-50,
and that the variables are named city and retailer.

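* Build one key per city-by-retailer cell (e.g. city 12, retailer 7 gives 1207).
Compute citystore=city*100+retailer.

Sort cases by citystore.

* Add the population size of each cell to every case (N is the case count).
Aggregate outfile=* mode=addvariables/break=citystore/count=n.

* Put the cases of each cell in random order.
Compute ranvar=uniform(1).

Sort cases by citystore ranvar.

* Number the cases within each cell, then keep the first 10 percent.
Compute seq=1.
If (citystore eq lag(citystore)) seq=lag(seq)+1.
If (seq gt .10*count) seq=0.
Select if (seq ne 0).
Execute.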

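* This should give you a sample of close to 5,000, with close to 10% of
every city and store combination. Each combination contributes 10% of its
count rounded down, so combinations with fewer than 10 customers contribute
no cases. See if this is what you need.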
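
One way to see how close the result is to 10% per combination is to compare the sampled count with the population count in each cell. The following is only a sketch: it assumes it is run right after the selection above, that citystore and count are still in the active file, and the names survsamp, cellcheck, popn, sampn and sampfrac are made up for illustration.

```spss
* Hypothetical check, to be run after the selection above.
* survsamp, cellcheck, popn, sampn and sampfrac are illustrative names.
Dataset name survsamp.
Dataset declare cellcheck.

* One row per city-by-retailer cell that is still present in the sample.
Aggregate outfile=cellcheck
  /break=citystore
  /popn=first(count)
  /sampn=n.

Dataset activate cellcheck.
Compute sampfrac=sampn/popn.
Descriptives variables=sampfrac /statistics=mean min max.

* Return to the sample itself.
Dataset activate survsamp.
```

Cells that lost all of their cases in the selection do not appear in cellcheck, so comparing its number of rows with the number of city-by-retailer combinations in the full file shows how many small cells dropped out.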

Gene Maguin
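
For the Complex Samples route asked about in the original post, a rough equivalent might look like the sketch below. It is untested here and rests on several assumptions: the variables are named city and retailer, the plan file name survey10pct.csplan is made up, the seed value is arbitrary, and the Complex Samples option is licensed. The subcommands should be checked against the Command Syntax Reference for the installed version. Both variables are declared as strata rather than using retailers as clusters, because cluster sampling would select whole retailers, whereas the goal is to select customers within every city-by-retailer combination.

```spss
* Sketch only. The plan file name and the seed are hypothetical.
* Stratify by city and retailer and draw 10 percent without replacement.
CSPLAN SAMPLE
  /PLAN FILE='survey10pct.csplan'
  /DESIGN STRATA=city retailer
  /METHOD TYPE=SIMPLE_WOR
  /RATE VALUE=0.1.

* Draw the sample from the active file according to the plan.
CSSELECT
  /PLAN FILE='survey10pct.csplan'
  /CRITERIA SEED=20240101.
```

A fixed seed makes the draw repeatable, which can help if the survey sample has to be regenerated later.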
