All,
Here's what my data set looks like:

G1   G2   G3   G
0.5  0.3  0.2  1
0.7  0.2  0.1  1
0.3  0.5  0.2  2
0.3  0.3  0.4  3
.
.
.

Note that for each row, the sum of G1, G2, and G3 equals 1.0. I wrote the following code to create variable G.

IF (G1>G2 and G1>G3) G=3.
IF (G2>G1 and G2>G3) G=2.
IF (G1>G2 and G1>G3) G=1.
EXECUTE.

Obviously this code does not assign values to G when there are multiple groups with the highest proportion. For example,

G1=.40, G2=.40, G3=0.2
G1=.50, G2=.50, G3=0.0
etc.

In these situations, I would like to RANDOMLY assign the case to one of the equally high categories. In other words, for

G1=.40, G2=.40, G3=.20

I would like there to be a 50/50 chance that G would be assigned a value of 1 or 2.

Any help writing the code to accomplish this task would be most appreciated.

Thanks,

Ryan

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Here is a programmability solution using
the SPSSINC TRANS extension command available from the SPSS Community (www.ibm.com/developerworks/spssdevcentral).
It works for any number of variables.
The data:

data list free /g1 g2 g3.
begin data
.5 .3 .2
.5 .3 .5
.1 .1 .1
.5 .6 .7
end data.

The algorithm:

begin program.
import random

def maxvar(*args):
    """return 1-based index of maximum argument breaking ties randomly"""

    lis = sorted([(v,i) for i,v in enumerate(args)], reverse=True)
    ties = [item for item in lis if item[0] == lis[0][0]]
    return ties[random.randint(0, len(ties) - 1)][1] + 1
end program.

Usage example:

spssinc trans result=gindex
/formula "maxvar(g1,g2,g3)".

The algorithm sorts the values in reverse order keeping track of the indexes. Then it samples randomly from ties and returns the index. In the usage formula, any number of variable names can be listed.

HTH,
Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435

From: R B <[hidden email]>
To: [hidden email]
Date: 02/02/2011 09:30 AM
Subject: [SPSSX-L] Code for Random Assignment
Sent by: "SPSSX(r) Discussion" <[hidden email]>
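The tie-breaking step can also be tried outside SPSS. The following is a minimal standalone Python sketch, not the exact code from the post above: it assumes plain numeric arguments with no missing values, and the name `maxvar_choice` and the seed 2011 are hypothetical choices made for illustration. It collects the positions of the maximum and picks one of them with `random.choice`.

```python
import random
from collections import Counter


def maxvar_choice(*args, rng=random):
    """Return the 1-based index of the largest argument, breaking ties at random.

    Assumes every argument is a plain number (no missing values).
    """
    top = max(args)
    # 1-based positions of every argument equal to the maximum
    tied = [i for i, v in enumerate(args, start=1) if v == top]
    return rng.choice(tied)


if __name__ == "__main__":
    rng = random.Random(2011)  # arbitrary seed so the demo is repeatable
    counts = Counter(maxvar_choice(0.4, 0.4, 0.2, rng=rng) for _ in range(10000))
    print(counts)  # indexes 1 and 2 should each appear about half the time
```

Passing a separate `random.Random` instance keeps the demo repeatable without touching the global generator.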
In reply to this post by Ryan
Hello Ryan,
First of all, let me point out that the first and third lines of your G assignment code test the same condition but assign different outcomes. I suspect that the code should be:

IF (G3>G1 and G3>G2) G=3.
IF (G2>G1 and G2>G3) G=2.
IF (G1>G2 and G1>G3) G=1.
EXECUTE.

Having said that, when ties arise amongst the top 2 variables you could use something like:

COMPUTE X=RV.UNIFORM(0,1).
IF (G1=G2 and G1>G3 AND X<=0.5) G=1.
IF (G1=G2 and G1>G3 AND X>0.5) G=2.
IF (G1=G3 and G1>G2 AND X<=0.5) G=1.
IF (G1=G3 and G1>G2 AND X>0.5) G=3.
IF (G2=G3 and G2>G1 AND X<=0.5) G=2.
IF (G2=G3 and G2>G1 AND X>0.5) G=3.

Similar code could be used if ties arise amongst the bottom 2 variables, e.g.:

IF (G1>G2 and G2=G3) G=1.

Etc...

HTH,
Luca

Mr. Luca Meyer
www.lucameyer.com
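To see which cases these rules cover, here is a rough Python sketch of the same logic. It assumes that `x` is a single uniform draw per case, and the function name `assign_g` is hypothetical. It returns `None` for a three-way tie, which the rules above do not address.

```python
import random


def assign_g(g1, g2, g3, x):
    """Mirror the IF rules: unique maximum first, then two-way ties split on x."""
    # Unique maximum (this also covers a tie between the two smaller values)
    if g1 > g2 and g1 > g3:
        return 1
    if g2 > g1 and g2 > g3:
        return 2
    if g3 > g1 and g3 > g2:
        return 3
    # Two-way tie for the maximum: x <= 0.5 picks the lower-numbered group
    if g1 == g2 and g1 > g3:
        return 1 if x <= 0.5 else 2
    if g1 == g3 and g1 > g2:
        return 1 if x <= 0.5 else 3
    if g2 == g3 and g2 > g1:
        return 2 if x <= 0.5 else 3
    return None  # three-way tie: not covered by these rules


if __name__ == "__main__":
    rows = [(0.5, 0.3, 0.2), (0.4, 0.4, 0.2), (0.6, 0.2, 0.2), (1 / 3, 1 / 3, 1 / 3)]
    for row in rows:
        print(row, assign_g(*row, x=random.random()))
```

Note that a tie between the two smaller values needs no extra rule in this sketch, because the unique-maximum tests already decide such a row.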
Luca,
You are absolutely correct that the 3rd line of code should have been rewritten as you suggested. Regarding the original question, the code you provided works perfectly! For those interested, I decided to add the line "set seed <value>." [with an actual value] to ensure that I observe identical results each time I run the code.

Thanks again.

Ryan

On Wed, Feb 2, 2011 at 1:02 PM, lucameyer <[hidden email]> wrote:
> [...]
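The effect of fixing the seed can be illustrated outside SPSS as well. This Python sketch is only an analogy to `SET SEED`: it uses Python's own generator, not the SPSS one, and the seed value 12345 and the function name `draws` are arbitrary, hypothetical choices.

```python
import random


def draws(seed, n=5):
    """Return n uniform draws from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    first = draws(12345)
    second = draws(12345)
    print(first == second)  # same seed, so the two runs produce identical draws
```

Because the random tie-breaks depend only on the sequence of draws, restarting from the same seed reproduces the same assignments as long as the cases are processed in the same order.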
In reply to this post by Ryan
See if the syntax below does what you want.
Art Kendall
Social Research Consultants

* it appears you want to say which of g1 to g3 is the highest.
* with random assignment among variables when there is a tie for max.
data list list/g1 g2 g3 (3f4.1) want(f1).
begin data
0.5 0.3 0.2 1
0.7 0.2 0.1 1
0.3 0.5 0.2 2
0.3 0.3 0.4 3
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.4 0.4 0.2 4
0.5 0.5 0.0 4
0.5 0.5 0.0 4
0.5 0.5 0.0 4
0.5 0.5 0.0 4
0.5 0.5 0.0 4
end data.
IF (G3 gt G1 and G3 gt G2) G=3.
IF (G2 gt G1 and G2 gt G3) G=2.
IF (G1 gt G2 and G1 gt G3) G=1.
compute maxg = max(g1 to g3).
compute g1max = g1 eq maxg.
compute g2max = g2 eq maxg.
compute g3max = g3 eq maxg.
count numties = g1max to g3max(1).
do if numties eq 1.
if g1max newg=1.
if g2max newg=2.
if g3max newg=3.
else if numties eq 2.
do if g1max and g2max.
compute newg =2.
if rv.bernoulli(.5) newg=1.
else if g1max and g3max.
compute newg =3.
if rv.bernoulli(.5) newg=1.
else if g2max and g3max.
compute newg=3.
if rv.bernoulli(.5) newg= 2.
end if.
else if numties eq 3.
compute newg = rnd(rv.uniform(.5,3.5)).
end if.
formats G (f1) maxg (f5.1) g1max to newg(f1).
EXECUTE.

On 2/2/2011 11:20 AM, R B wrote:
> [...]
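The three-way tie line, `rnd(rv.uniform(.5,3.5))`, rounds a uniform draw from the interval 0.5 to 3.5: each of the values 1, 2, and 3 owns a sub-interval of width 1, so each comes up with probability 1/3. A small Python sketch of that idea follows. It assumes halves round up, written here as `floor(x + 0.5)` because Python's built-in `round` rounds halves to even, and the function name `three_way_pick` is hypothetical.

```python
import math
import random
from collections import Counter


def three_way_pick(rng=random):
    """Pick 1, 2, or 3 with equal probability by rounding a uniform draw."""
    x = rng.uniform(0.5, 3.5)
    # Round half up; the clamp guards the near-impossible draw of exactly 3.5
    return min(3, math.floor(x + 0.5))


if __name__ == "__main__":
    rng = random.Random(42)  # arbitrary seed for a repeatable demo
    print(Counter(three_way_pick(rng) for _ in range(30000)))
```

The same trick extends to k tied groups by drawing from 0.5 to k + 0.5.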
In reply to this post by Ryan
Here is a non python solution which can be easily generalized to ANY number of variables.
Luca's approach might work but I wouldn't attempt to generalize it to more than 3 variables
or break a 3 way tie;-(
-----------
data list free / g1 g2 g3 .
begin data
.3 .3 .4
.6 .2 .2
.4 .4 .2
.5 .0 .5
.333 .333 .333
end data.

COMPUTE #gx=max(g1 TO g3).
VECTOR #INMAX(3).
DO REPEAT #mcount= g1 TO G3
 /#=1 TO 3.
+ COMPUTE #INMAX(#)= (#mcount EQ #gx) * UNIFORM(1).
END REPEAT.
COMPUTE #GR=MAX(#INMAX1 TO #INMAX3).
LOOP #=1 TO 3 .
+ IF (#INMAX(#) EQ #GR) G=#.
END LOOP.
LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
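The idea behind the DO REPEAT / LOOP syntax above is to give every variable that equals the row maximum its own uniform random number (all others get 0) and then take the position holding the largest random number. A minimal Python sketch of the same technique, assuming no missing values; the function name `random_argmax` is hypothetical.

```python
import random


def random_argmax(values, rng=random):
    """Return the 1-based position of the maximum, ties broken uniformly."""
    top = max(values)
    # 1 - random() lies in (0, 1], so a position at the maximum always
    # outscores the zeros given to every other position
    scores = [1.0 - rng.random() if v == top else 0.0 for v in values]
    # The position with the largest random score wins
    return scores.index(max(scores)) + 1


if __name__ == "__main__":
    rows = [(0.3, 0.3, 0.4), (0.6, 0.2, 0.2), (0.4, 0.4, 0.2),
            (0.5, 0.0, 0.5), (0.333, 0.333, 0.333)]
    for row in rows:
        print(row, random_argmax(row))
```

Because the function takes a sequence of any length, nothing changes when more than three variables are compared, and a tie among k values gives each of them a 1/k chance.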
Yes, the dreaded 3-way tie... I don't believe I have any, but I'll be certain to double check. Thank you, and thanks to everyone else who has responded with various solutions. I'll be sure to explore each solution. I very much appreciate the support.

Best wishes,

Ryan

On Wed, Feb 2, 2011 at 2:13 PM, David Marso <[hidden email]> wrote:
> [...]