|
I am a novice user of SPSS. I love your site; it is full of useful syntax.

My general goal is to generate a population and then repeatedly take samples (with or without replacement) from that population. Doing so, the properties of some estimators can be evaluated by averaging the estimates that come from these many samples.

The problem: the sample is a random walk on a network structure. To be more precise, I have a file with the data of a network structure. The file is organized like this: for each observation, there are two lines of information. The first line contains the observation ID and the number of links. The second line contains the ID values of the links. For example:

21068 2
21017 21108
21108 3
21088 21017 21068
21017 6
21063 21034 21088 21071 21068 21108
21107 6
21016 21074 21115 21010 21088 21110
21010 3
21115 21070 21107
21088 8
21030 21021 21096 21110 21034 21017 21108 21107
21070 6
21080 21054 21016 21086 21115 21010
21110 6
21074 21021 21096 21075 21088 21107

I would like to simulate a sampling process like a random walk or snowball sampling on this data. First, a set of s seeds is chosen to make up wave 0. Then we randomly sample two or three IDs from the links of every seed to recruit people for wave 1. This process continues until the desired sample size is reached. The selection should be possible both with and without replacement.

Could you give me some ideas, or suggest some programs or macros to do this in SPSS?

Thanks in advance,
Alberto Vitalini

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
|
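The wave-by-wave recruitment described in the question can be sketched outside SPSS to make the logic concrete. The following is a minimal Python illustration, not SPSS syntax and not a method proposed in the thread. The names `NETWORK`, `snowball_sample`, `n_seeds`, `per_seed`, `target_size` and `replace` are all hypothetical. It also assumes that "without replacement" means a person can be recruited only once, and that "with replacement" means the same person may be recruited repeatedly.

```python
import random

# The excerpt from the question: observation id -> ids of its links.
NETWORK = {
    21068: [21017, 21108],
    21108: [21088, 21017, 21068],
    21017: [21063, 21034, 21088, 21071, 21068, 21108],
    21107: [21016, 21074, 21115, 21010, 21088, 21110],
    21010: [21115, 21070, 21107],
    21088: [21030, 21021, 21096, 21110, 21034, 21017, 21108, 21107],
    21070: [21080, 21054, 21016, 21086, 21115, 21010],
    21110: [21074, 21021, 21096, 21075, 21088, 21107],
}


def snowball_sample(links, n_seeds, per_seed, target_size, replace=False, rng=None):
    """Recruit in waves until target_size is reached or no one is left to recruit."""
    rng = rng or random.Random()
    # Wave 0: seeds drawn without replacement from the known observation ids.
    wave = rng.sample(sorted(links), n_seeds)
    sample = list(wave)
    while wave and len(sample) < target_size:
        next_wave = []
        for person in wave:
            # Ids that appear only as links (no record of their own) recruit no one.
            candidates = links.get(person, [])
            if replace:
                # Assumption: with replacement, repeats of the same id are allowed.
                picks = [rng.choice(candidates) for _ in range(per_seed)] if candidates else []
            else:
                # Assumption: without replacement, skip anyone already recruited.
                taken = set(sample) | set(next_wave)
                candidates = [c for c in candidates if c not in taken]
                picks = rng.sample(candidates, min(per_seed, len(candidates)))
            next_wave.extend(picks)
        sample.extend(next_wave)
        wave = next_wave
    # The last wave may overshoot, so trim to the requested size.
    return sample[:target_size]


if __name__ == "__main__":
    print(snowball_sample(NETWORK, n_seeds=2, per_seed=2, target_size=6))
```

Repeating the call many times and averaging an estimator over the resulting samples would give the kind of simulation the question describes. Note that without replacement the walk can stall before reaching `target_size` if every link of the current wave has already been recruited.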
|
Alberto,
I'm not sure that I understand what you want to do, but I think it is this:

   Generate an x% random sample of observation ids.
   For each observation id selected, randomly select one (or two) links.

Do you agree with this statement of the problem?

Also, are you reading this data in from a text file, or have you been given it in an SPSS file? The reason I ask is that it would be much easier to work with if it were one line per case, like this

21068 2 21017 21108
21108 3 21088 21017 21068
21017 6 21063 21034 21088 21071 21068 21108

instead of two lines per case. So can you re-read the data so that it is one line per case? If that were done, then the problem is pretty simple.

* let oip be the proportion of observation ids to be selected.
* (substitute the actual value for oip in the Recode below).
* let nlink be the number of links per id.
* let link1 to linkn be the names of the link vars.
Compute pickoi=uniform(1).
Recode pickoi(0 thru oip=1)(else=0).
Vector link=link1 to linkn.
* assume one link per selected observation id.
* selectedlink is computed only for the selected ids, where pickl is defined.
Do if (pickoi eq 1).
+ if (nlink gt 1) pickl=trunc(uniform(nlink))+1.
+ if (nlink eq 1) pickl=1.
+ compute selectedlink=link(pickl).
End if.
Execute.

Making two selections of links is somewhat harder, and you'll need to specify whether the sampling is with or without replacement.

Gene Maguin
|
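As a rough illustration of the two points in this reply (reshaping the file to one line per case, and the with/without replacement distinction when picking more than one link), here is a small Python sketch. It is not SPSS syntax. The function names `to_one_line_per_case` and `pick_links` are hypothetical, and it assumes the file strictly alternates a header line (id and link count) with a line of link ids.

```python
import random


def to_one_line_per_case(text):
    """Join each pair of lines (id + link count, then link ids) into one record."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    records = []
    for header, link_ids in zip(lines[0::2], lines[1::2]):
        obs_id, nlink = header
        # Sanity check: the declared count should match the ids actually listed.
        if int(nlink) != len(link_ids):
            raise ValueError(f"id {obs_id}: expected {nlink} links, found {len(link_ids)}")
        records.append([obs_id, nlink, *link_ids])
    return records


def pick_links(link_ids, k, replace, rng):
    """Pick k links for one selected observation id."""
    if replace:
        # With replacement: the same link may be drawn more than once.
        return [rng.choice(link_ids) for _ in range(k)]
    # Without replacement: at most as many picks as there are links.
    return rng.sample(link_ids, min(k, len(link_ids)))


if __name__ == "__main__":
    raw = "21068 2\n21017 21108\n21108 3\n21088 21017 21068\n"
    rng = random.Random()
    for obs_id, nlink, *link_ids in to_one_line_per_case(raw):
        print(obs_id, pick_links(link_ids, 2, replace=False, rng=rng))
```

The reshaped records have the same layout as the one-line-per-case example above, so they could be written back to a text file and read into SPSS with one case per line.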
