SPSSX Discussion

Simulate snowball sample in network population

Alberto Vitalini

Simulate snowball sample in network population

I am a novice user of SPSS. I love your site. It is full of
useful syntax.
My general goal is to generate a population and then
repeatedly take samples (with replacement or without) from
that population. Doing so, the properties of some
estimators can be evaluated by averaging the estimates that
come from these many samples.
The problem: the sample is a random walk on a network
structure. To be more precise, I have a file with the data
of a network structure. The file is organized like this.
For each observation, there are two lines with information.
The first contains the observation ID and the number of
links.
For example:

21068 2
21017 21108
21108 3
21088 21017 21068
21017 6
21063 21034 21088 21071 21068 21108
21107 6
21016 21074 21115 21010 21088 21110
21010 3
21115 21070 21107
21088 8
21030 21021 21096 21110 21034 21017 21108 21107
21070 6
21080 21054 21016 21086 21115 21010
21110 6
21074 21021 21096 21075 21088 21107

The second line contains the ID values of the links. I
would like to simulate a sampling process such as a random walk
or snowball sampling on this data. First, a set of s seeds
is chosen to make up wave 0. Then we randomly sample
two or three IDs from the links of every seed to recruit
people for wave 1. This process continues until the desired
sample size is reached. The selection should be possible
both with and without replacement.
Could you give me some ideas or suggest some programs/macros
to do this in SPSS?
Thanks in advance
Alberto Vitalini

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Maguin, Eugene

Re: Simulate snowball sample in network population

Alberto,

I'm not sure that I understand what you want to do, but I think it is this:

Generate an x% random sample of observation ids
For each observation id selected, randomly select one (or two) links.

Do you agree with this statement of the problem?

Also, are you reading this data in from a text file or have you been given
it in an SPSS file? The reason I ask is that it would be much easier to work
with if it were one line per case, like this

21068 2 21017 21108
21108 3 21088 21017 21068
21017 6 21063 21034 21088 21071 21068 21108

instead of two lines per case. So can you re-read the data so that it is one
line per case?
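
If re-reading the file inside SPSS is awkward, the reshaping can also be done beforehand. The following is only a minimal Python sketch (not SPSS syntax), assuming the raw text strictly alternates an "ID count" line with a line of link IDs, as in the example data. The function names `parse_two_line_records` and `to_one_line_per_case` are hypothetical and chosen for illustration.

```python
def parse_two_line_records(text):
    """Turn alternating 'id count' / 'link link ...' lines into {id: [links]}."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2:
        raise ValueError("expected an even number of non-blank lines")
    network = {}
    for header, links in zip(lines[0::2], lines[1::2]):
        node_id, n_links = header[0], int(header[1])
        # The declared link count doubles as a consistency check on the file.
        if n_links != len(links):
            raise ValueError(f"{node_id}: expected {n_links} links, found {len(links)}")
        network[node_id] = links
    return network


def to_one_line_per_case(network):
    """Format each case as 'id nlink link1 ... linkn'."""
    return [" ".join([node_id, str(len(links)), *links])
            for node_id, links in network.items()]


# First three records from the example data in the original post.
sample = """21068 2
21017 21108
21108 3
21088 21017 21068
21017 6
21063 21034 21088 21071 21068 21108
"""

network = parse_two_line_records(sample)
for row in to_one_line_per_case(network):
    print(row)
```

For these three records the loop prints the same three one-line cases shown above. The rows could then be written to a text file and read into SPSS with one case per line.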

If that were done, then the problem is pretty simple.

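* let oip be the proportion of observation ids to be selected.
* let nlink be the number of links per id.
* let link1 to linkn be the names of the link vars.
* RECODE value lists take only literal values, so use a logical COMPUTE.
* Replace oip with the chosen proportion unless it is a variable in the file.
Compute pickoi=(uniform(1) le oip).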

* assume one link per selected observation id.
* look up the link only for selected ids so a missing pickl is never used as an index.
Vector link=link1 to linkn.
Do if (pickoi eq 1).
+ if (nlink gt 1) pickl=trunc(uniform(nlink))+1.
+ if (nlink eq 1) pickl=1.
+ if (nlink ge 1) selectedlink=link(pickl).
End if.
Execute.

Making two selections of links is somewhat harder and you'll need to specify
whether the sampling is with or without replacement.
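
To make the with/without replacement distinction concrete, here is a minimal Python sketch (again, not SPSS syntax) of picking k links per ID and repeating that over waves. It rests on assumptions not stated in the thread: "without replacement" is taken to mean that an ID already in the sample cannot be recruited again, and recruitment stops when the target size is reached or a wave adds nobody. The names `pick_links` and `snowball` are hypothetical.

```python
import random


def pick_links(links, k, replace, rng):
    """Pick up to k links from one ID's link list."""
    if not links:
        return []
    if replace:
        return rng.choices(links, k=k)            # duplicates possible
    return rng.sample(links, min(k, len(links)))  # all picks distinct


def snowball(network, seeds, k, target_size, replace, rng):
    """Recruit wave by wave until target_size is reached or a wave is empty."""
    sample = list(seeds)   # wave 0
    current = list(seeds)
    while current and len(sample) < target_size:
        next_wave = []
        for node in current:
            links = network.get(node, [])  # IDs with no record recruit nobody
            if not replace:
                # Assumption: already-sampled IDs cannot be recruited again.
                taken = set(sample) | set(next_wave)
                links = [x for x in links if x not in taken]
            next_wave.extend(pick_links(links, k, replace, rng))
        sample.extend(next_wave)
        current = next_wave
    return sample[:target_size]


# First three records from the example data, keyed by observation ID.
network = {
    "21068": ["21017", "21108"],
    "21108": ["21088", "21017", "21068"],
    "21017": ["21063", "21034", "21088", "21071", "21068", "21108"],
}
rng = random.Random(12345)  # fixed seed so a run can be repeated
print(snowball(network, ["21068"], k=2, target_size=5, replace=False, rng=rng))
```

Calling `snowball` many times with different seeds and averaging an estimator over the resulting samples would give the kind of repeated-sampling evaluation described in the original question.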

Gene Maguin
