Code for Random Assignment

Code for Random Assignment

Ryan
All,

Here's what my data set looks like:

G1  G2  G3   G
0.5  0.3  0.2   1
0.7  0.2  0.1   1
0.3  0.5  0.2   2
0.3  0.3  0.4   3
.
.
.

Note that for each row, the sum of G1, G2, and G3 equals 1.0. I wrote
the following code to create variable G.

IF  (G1>G2 and G1>G3) G=3.
IF  (G2>G1 and G2>G3) G=2.
IF  (G1>G2 and G1>G3) G=1.
EXECUTE.

Obviously this code does not assign values to G when there are
multiple groups with the highest proportion. For example,

G1=.40, G2=.40, G3=0.2
G1=.50, G2=.50, G3=0.0
etc.

In these situations, I would like to RANDOMLY assign the case to one
of the equally high categories. In other words, for

G1=.40, G2=.40, G3=.20

I would like there to be a 50/50 chance that G would be assigned a
value of 1 or 2.

Any help writing the code to accomplish this task would be most appreciated.

Thanks,

Ryan

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: Code for Random Assignment

Jon K Peck
Here is a programmability solution using the SPSSINC TRANS extension command available from the SPSS Community (www.ibm.com/developerworks/spssdevcentral).  It works for any number of variables.


The data:

data list free /g1 g2 g3.
begin data
.5 .3 .2
.5 .3 .5
.1 .1 .1
.5 .6 .7
end data.


The algorithm:

begin program.
import random
def maxvar(*args):
  """return 1-based index of maximum argument breaking ties randomly"""
  lis = sorted([(v,i) for i,v in enumerate(args)], reverse=True)
  ties = [item for item in lis if item[0] == lis[0][0]]
  return ties[random.randint(0, len(ties) - 1)][1] + 1
end program.

Usage example:

spssinc trans result=gindex
/formula "maxvar(g1,g2,g3)".

The algorithm sorts the values in reverse order keeping track of the indexes.  Then it samples randomly from ties and returns the index.
In the usage formula, any number of variable names can be listed.
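
For anyone who wants to check the tie-breaking outside SPSS, here is a minimal stand-alone sketch of the same idea in plain Python. It is not the code that SPSSINC TRANS runs: it uses max() instead of the sort, which should pick from the same set of ties, and the optional rng parameter and the seed 2011 are my own additions so the check is repeatable.

```python
import random
from collections import Counter

def maxvar(*args, rng=random):
    """Return the 1-based index of the largest argument, breaking ties at random."""
    top = max(args)
    # Collect the 1-based positions of every argument equal to the maximum.
    ties = [i for i, v in enumerate(args, start=1) if v == top]
    return rng.choice(ties)

# Arbitrary seed (an assumption), used only to make this check repeatable.
rng = random.Random(2011)

# A two-way tie between the first two values should split roughly 50/50.
counts = Counter(maxvar(0.4, 0.4, 0.2, rng=rng) for _ in range(10_000))
print(counts)
```

With a tie between the first two values, only indexes 1 and 2 should ever appear, in roughly equal numbers.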

HTH,


Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435




From:        R B <[hidden email]>
To:        [hidden email]
Date:        02/02/2011 09:30 AM
Subject:        [SPSSX-L] Code for Random Assignment
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Re: Code for Random Assignment

lucameyer
In reply to this post by Ryan
Hello Ryan,

First of all, let me point out that the first and third lines of your G assignment code contain the same condition but assign different values. I suspect that the code should be:

IF  (G3>G1 and G3>G2) G=3.
IF  (G2>G1 and G2>G3) G=2.
IF  (G1>G2 and G1>G3) G=1.
EXECUTE.

Having said that, when ties arise amongst the top 2 variables you could use something like:

COMPUTE X=RV.UNIFORM(0,1).
IF (G1=G2 and G1>G3 AND X<=0.5) G=1.
IF (G1=G2 and G1>G3 AND X>0.5) G=2.
IF (G1=G3 and G1>G2 AND X<=0.5) G=1.
IF (G1=G3 and G1>G2 AND X>0.5) G=3.
IF (G2=G3 and G2>G1 AND X<=0.5) G=2.
IF (G2=G3 and G2>G1 AND X>0.5) G=3.

Similar code could be used if a tie arises between the bottom 2 variables, e.g.:

IF (G1>G2 and G2=G3) G=1.

Etc...
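
To see which cases these rules cover, here is a rough Python transcription of the IF statements above. It is only a sketch for checking the logic, not SPSS syntax: the function name luca_rules is made up, x stands in for the RV.UNIFORM(0,1) draw, and None plays the role of a system-missing G.

```python
def luca_rules(g1, g2, g3, x):
    """Apply the corrected IF rules plus the two-way tie rules; x is a uniform(0,1) draw."""
    g = None  # None stands in for system-missing
    # Strict winners (these also cover a tie between the two lowest values).
    if g3 > g1 and g3 > g2:
        g = 3
    if g2 > g1 and g2 > g3:
        g = 2
    if g1 > g2 and g1 > g3:
        g = 1
    # Two-way ties for the top value, split on the random draw.
    if g1 == g2 and g1 > g3:
        g = 1 if x <= 0.5 else 2
    if g1 == g3 and g1 > g2:
        g = 1 if x <= 0.5 else 3
    if g2 == g3 and g2 > g1:
        g = 2 if x <= 0.5 else 3
    return g

print(luca_rules(0.4, 0.4, 0.2, x=0.3))   # tie for the top, low draw
print(luca_rules(0.6, 0.2, 0.2, x=0.3))   # tie for the bottom only
print(luca_rules(0.2, 0.2, 0.2, x=0.3))   # three-way tie
```

In this transcription a tie for the bottom is already handled by the strict rules, while a three-way tie matches no rule and leaves G unassigned.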

HTH,
Luca

Mr. Luca Meyer
www.lucameyer.com

Re: Code for Random Assignment

Ryan
Luca,

You are absolutely correct that the 3rd line of code should have been
rewritten as you suggested. Regarding the original question, the code
you provided works perfectly!

For those interested, I decided to add the line "set seed <value>."
[with an actual value] to ensure that I observe identical results each
time I run the code.
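
The same principle can be sketched in Python, purely as an illustration: a generator seeded with the same value replays the same sequence of draws. Python's generator is not the one SPSS uses, so the numbers themselves will differ, and the seed values below are arbitrary.

```python
import random

def draws(seed, n=5):
    """Return n uniform(0,1) draws from a private generator seeded once."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

first = draws(12345)    # 12345 is an arbitrary, hypothetical seed
second = draws(12345)   # same seed, so the same sequence is replayed
other = draws(54321)    # a different seed gives a different sequence

print(first == second)
print(first == other)
```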

Thanks again.

Ryan

On Wed, Feb 2, 2011 at 1:02 PM, lucameyer <[hidden email]> wrote:

Re: Code for Random Assignment

Art Kendall
In reply to this post by Ryan
See if the syntax below does what you want.

Art Kendall
Social Research Consultants

* it appears you want to say which of g1 to g3 is the highest.
* with random assignment among variables when there is a tie for max.
data list list/g1 g2 g3 (3f4.1) want(f1).
begin data
0.5  0.3  0.2   1
0.7  0.2  0.1   1
0.3  0.5  0.2   2
0.3  0.3  0.4   3
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.4  0.4  0.2   4
0.5  0.5  0.0   4
0.5  0.5  0.0   4
0.5  0.5  0.0   4
0.5  0.5  0.0   4
0.5  0.5  0.0   4
end data.
IF  (G3 gt G1 and G3 gt G2) G=3.
IF  (G2 gt G1 and G2 gt G3) G=2.
IF  (G1 gt G2 and G1 gt G3) G=1.
compute maxg = max(g1 to g3).
compute g1max = g1 eq maxg.
compute g2max = g2 eq maxg.
compute g3max = g3 eq maxg.
count numties = g1max to g3max(1).
do if numties eq 1.
  if g1max newg=1.
  if g2max newg=2.
  if g3max newg=3.
else if numties eq 2.
     do if g1max and g2max.
       compute newg =2.
       if rv.bernoulli(.5) newg=1.
     else if g1max and g3max.
       compute newg =3.
       if rv.bernoulli(.5) newg=1.
     else if g2max and g3max.
       compute newg=3.
       if rv.bernoulli(.5) newg= 2.
     end if.
else if numties eq 3.
     compute newg = rnd(rv.uniform(.5,3.5)).
end if.
formats G (f1) maxg (f5.1) g1max to newg(f1).
EXECUTE.
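
A note on the three-way case, sketched in Python rather than SPSS: rounding a uniform draw on (.5, 3.5) gives each of 1, 2 and 3 an interval of width 1, so the three outcomes are equally likely. The sketch assumes RND rounds halves away from zero; Python's built-in round() rounds halves to even, so floor(u + 0.5) is used instead. The function name three_way is made up for this illustration.

```python
import math

def three_way(u):
    """Map u in [0.5, 3.5) to 1, 2 or 3 by rounding half up."""
    return math.floor(u + 0.5)

# Each outcome owns an interval of width 1:
#   [0.5, 1.5) -> 1,  [1.5, 2.5) -> 2,  [2.5, 3.5) -> 3
for u in (0.5, 1.49, 1.5, 2.49, 2.5, 3.49):
    print(u, three_way(u))
```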


On 2/2/2011 11:20 AM, R B wrote:

Re: Code for Random Assignment

David Marso
Administrator
In reply to this post by Ryan

Here is a non-Python solution that can easily be generalized to ANY number of variables.
Luca's approach might work, but I wouldn't attempt to generalize it to more than 3 variables
or to break a 3-way tie ;-(
-----------
data list free / g1 g2 g3 .
begin data
 .3 .3 .4
 .6 .2 .2
 .4 .4 .2
 .5 .0 .5
.333 .333 .333
end data.

COMPUTE #gx=max(g1 TO g3).
VECTOR #INMAX(3).
DO REPEAT #mcount= g1 TO G3
         /#=1 TO 3.
+  COMPUTE #INMAX(#)=  (#mcount EQ #gx) * UNIFORM(1).
END REPEAT.
COMPUTE #GR=MAX(#INMAX1 TO #INMAX3).
LOOP #=1 TO 3 .
+  IF (#INMAX(#) EQ #GR) G=#.
END LOOP.
LIST.
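
For readers who don't read scratch-variable syntax, the idea above can be sketched in Python: every variable equal to the maximum gets a random key, everything else gets 0, and the position of the largest key is the result. This is an illustration of the technique under my own naming (random_max_index and rng are not in the syntax above), not a translation that SPSS can run.

```python
import random

def random_max_index(values, rng=random):
    """Return the 1-based position of the maximum, breaking ties at random."""
    top = max(values)
    # Maximal entries get a random key in (0, 1]; all other entries get 0,
    # so only the entries tied for the maximum can win.
    keys = [(1.0 - rng.random()) if v == top else 0.0 for v in values]
    return keys.index(max(keys)) + 1

rng = random.Random(3)  # arbitrary seed, only so the demo is repeatable
for row in ([0.3, 0.3, 0.4], [0.6, 0.2, 0.2], [0.4, 0.4, 0.2],
            [0.5, 0.0, 0.5], [0.333, 0.333, 0.333]):
    print(row, random_max_index(row, rng=rng))
```

Because the keys are independent uniform draws, each tied variable is equally likely to hold the largest key, whatever the number of variables or the size of the tie.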

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Code for Random Assignment

Ryan
Yes, the dreaded 3-way tie... I don't believe I have any, but I'll be
certain to double-check. Thank you, and thanks to everyone else who has
responded with various solutions. I'll be sure to explore each
solution.

I very much appreciate the support.

Best wishes,

Ryan

On Wed, Feb 2, 2011 at 2:13 PM, David Marso <[hidden email]> wrote:
