|
I have a dataset that has 15 individual variables that are summed together to create an overall risk score.
I would like to create a series of risk scores that include a random selection of those 15 variables. For instance, a risk score that is made up of 10 of the 15 variables, where the variables are chosen at random for each case.
How can I go about doing this?
Thanks
Chris
|
|
Hi!
Something like this? This sums up 3 randomly chosen vars from a list of 5 vars. The example is nonsensical, but it just demonstrates one way to do it.

GET FILE = 'c:/program files/spss/employee data.sav'.
SET MPRINT = ON.
BEGIN PROGRAM.
import random, spss

def selvars (vars, size, outvar):
    # Draw 'size' variable names at random and submit one COMPUTE for them.
    if size <= len (vars):
        samp = ", ".join(random.sample(vars, size))
        spss.Submit("compute %s = sum(%s)." % (outvar, samp))
    else:
        print("--> Error: Sample > Population!")

# The output variable name must be passed as a string.
selvars (vars = ['id', 'jobcat', 'jobtime', 'educ', 'minority'], size = 3, outvar = 'score')
END PROGRAM.

--- On Thu, 4/16/09, Christopher Lowenkamp <[hidden email]> wrote:

> From: Christopher Lowenkamp <[hidden email]>
> Subject: Random selecting variables for a score
> To: [hidden email]
> Date: Thursday, April 16, 2009, 4:41 AM
>
> I have a dataset that has 15 individual variables that are
> summed together to create an overall risk score.
>
> I would like to create a series of risk scores that include
> a random selection of those 15 variables. For instance a risk
> score that is made up of 10 of the 15 variables where the
> variables are chosen at random for each case.
>
> How can I go about doing this?
>
> Thanks
> Chris

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
|
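To see what the Python approach above hands to SPSS, here is a minimal sketch that builds the same COMPUTE string without calling spss.Submit, so it runs outside SPSS. The helper name build_compute and the seed value are only illustrative assumptions. Note that the random draw happens once, when the program block runs, so the single COMPUTE it produces applies the same three variables to every case.

```python
import random

def build_compute(variables, size, outvar, rng=random):
    """Return one SPSS COMPUTE command that sums `size` randomly chosen variables.

    Hypothetical helper for illustration; it only builds the command text.
    """
    if size > len(variables):
        raise ValueError("sample size exceeds the number of variables")
    chosen = rng.sample(variables, size)  # one draw, without replacement
    return "compute %s = sum(%s)." % (outvar, ", ".join(chosen))

variables = ["id", "jobcat", "jobtime", "educ", "minority"]
rng = random.Random(6111)  # seeded only so a run can be repeated
syntax = build_compute(variables, 3, "score", rng)
print(syntax)
```

Because only one command is generated, this gives one random subset for the whole file rather than a different subset for each case.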
|
First, a question for anyone: How do the output and code in my postings
look? Now that the list takes HTML, I've been putting code and output in
a small fixed-pitch font, instead of unformatted. Is it readable?
At 10:41 PM 4/15/2009, Christopher Lowenkamp wrote:

> I have a dataset that has 15 variables that are summed together to
> create an overall risk score. I would like to create a series of risk
> scores that include a random selection of those 15 variables. For
> instance a risk score that is made up of 10 of the 15 variables where
> the variables are chosen at random for each case.

The following code creates a risk score as the sum of 3 of the 5 input variables, the 3 randomly selected independently for each case.

(Albert-Jan, I'm not sure I'm reading your Python code right. What SPSS does it expand into? Does it select a different set of components for each case?)

Test data:

|-----------------------------|---------------------------|
|Output Created               |16-APR-2009 22:58:08       |
|-----------------------------|---------------------------|
CaseID  IndV1  IndV2  IndV3  IndV4  IndV5

   001  -1.14   2.90    .22    .05   4.22
   002   1.57    .36   2.18    .44   2.80
   003   5.33  -1.01    .16   2.08   2.80
   004   2.80   5.08  -1.37   1.05   2.60
   005   1.83   3.29   3.60    .61   2.54
   006   3.00   1.22   -.25   3.91   1.91
   007   2.12   1.44   1.51   1.07   4.12
   008   5.87    .56   1.69    .00   1.67

Number of cases read:  8    Number of cases listed:  8

Code and output:

*  ..... The list of variables that may contribute to the ..... .
*  ..... risk score.                                      ..... .

VECTOR  IndV=IndV1 TO IndV5.

*  ..... Flag variables, indicating whether the corresponding..... .
*  ..... input variable contributes to the risk score.       ..... .
*  ..... Omit, if this is not needed.                        ..... .

VECTOR  Ctr(5,F2).

NUMERIC RiskScore (F6.2).
COMPUTE RiskScore = 0.

*  ..... Select 3 of the 5 independent variables, using      ..... .
*  ..... the 'k/n' sampling algorithm                        ..... .

COMPUTE #K = 3 /* Sample size     */.
COMPUTE #N = 5 /* Population size */.

LOOP #Idx = 1 TO 5.
*  ...   "UseIt" indicates whether or not this variable      ...... .
*  ...   contributes to the risk score.                      ...... .
*  ...   Randomly select #UseIt:                             ...... .
.  COMPUTE #UseIt = RV.BERNOULLI(#K/#N).
.  COMPUTE #K     = #K - #UseIt.
.  COMPUTE #N     = #N -1.

*  ...   If desired, set a permanent flag indicating whether ...... .
*  ...   or not this variable contributes to the score.      ...... .
.  COMPUTE Ctr(#Idx) = #UseIt.

*  ...   Update the risk score, adding the current           ....... .
*  ...   variable's value if it is to contribute to the      ....... .
*  ...   risk score.                                         ....... .
*  ...   SUM is used instead of simple "+", so missing       ....... .
*  ...   input values don't cause the result to be missing.  ....... .
.  IF #UseIt RiskScore = SUM(RiskScore,IndV(#Idx)).
END LOOP.

TEMPORARY.
STRING SPACE(A8).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |16-APR-2009 22:58:10       |
|-----------------------------|---------------------------|
CaseID  IndV1  IndV2  IndV3  IndV4  IndV5 Ctr1 Ctr2 Ctr3 Ctr4 Ctr5 RiskScore

   001  -1.14   2.90    .22    .05   4.22    0    1    0    1    1      7.18
   002   1.57    .36   2.18    .44   2.80    0    0    1    1    1      5.42
   003   5.33  -1.01    .16   2.08   2.80    1    0    0    1    1     10.21
   004   2.80   5.08  -1.37   1.05   2.60    1    0    0    1    1      6.45
   005   1.83   3.29   3.60    .61   2.54    1    0    1    1    0      6.04
   006   3.00   1.22   -.25   3.91   1.91    0    1    1    0    1      2.88
   007   2.12   1.44   1.51   1.07   4.12    1    1    1    0    0      5.07
   008   5.87    .56   1.69    .00   1.67    1    0    1    1    0      7.56

Number of cases read:  8    Number of cases listed:  8

=============================
APPENDIX: Test data, and code
=============================

*  C:\Documents and Settings\Richard\My Documents                  .
*    \Technical\spssx-l\Z-2009b                                    .
*    \2009-04-15 Lowenkamp - Random selecting variables for a score.SPS.

*     In response to posting                                       .
*     Date:    Wed, 15 Apr 2009 22:41:49 -0400                     .
*     From:    Christopher Lowenkamp <[hidden email]>              .
*     Subject: Random selecting variables for a score              .
*     To:      [hidden email]                                      .

*     "I have 15 variables that are summed to create an overall risk .
*     score. I would like to create a series of risk scores that   .
*     include a random selection of those 15 variables, 10 of the 15 .
*     variables where the variables are chosen at random for each  .
*     case."                                                       .

*  ................................................................. .
*  .................   Test data               ..................... .

SET RNG     = MT   /* 'Mersenne twister' random number generator */ .
SET MTINDEX = 6111 /*  Providence, RI telephone book             */ .

INPUT PROGRAM.
.  NUMERIC CaseID (N3).
.  VECTOR  IndV(5,F6.2).
.  LOOP CaseID = 1 TO 8.
.     LOOP #Idx = 1 TO 5.
.        COMPUTE IndV(#Idx) = RV.NORMAL(2,2).
.     END LOOP.
.     END CASE.
.  END LOOP.
END FILE.
END INPUT PROGRAM.
LIST.

*  ................................................................. .
*  .................   Logic                   ..................... .

*  ..... The list of variables that may contribute to the ..... .
*  ..... risk score.                                      ..... .

VECTOR  IndV=IndV1 TO IndV5.

*  ..... Flag variables, indicating whether the corresponding..... .
*  ..... input variable contributes to the risk score.       ..... .
*  ..... Omit, if this is not needed.                        ..... .

VECTOR  Ctr(5,F2).

NUMERIC RiskScore (F6.2).
COMPUTE RiskScore = 0.

*  ..... Select 3 of the 5 independent variables, using      ..... .
*  ..... the 'k/n' sampling algorithm                        ..... .

COMPUTE #K = 3 /* Sample size     */.
COMPUTE #N = 5 /* Population size */.

LOOP #Idx = 1 TO 5.
*  ...   "UseIt" indicates whether or not this variable      ...... .
*  ...   contributes to the risk score.                      ...... .
*  ...   Randomly select #UseIt:                             ...... .
.  COMPUTE #UseIt = RV.BERNOULLI(#K/#N).
.  COMPUTE #K     = #K - #UseIt.
.  COMPUTE #N     = #N -1.

*  ...   If desired, set a permanent flag indicating whether ...... .
*  ...   or not this variable contributes to the score.      ...... .
.  COMPUTE Ctr(#Idx) = #UseIt.

*  ...   Update the risk score, adding the current           ....... .
*  ...   variable's value if it is to contribute to the      ....... .
*  ...   risk score.                                         ....... .
*  ...   SUM is used instead of simple "+", so missing       ....... .
*  ...   input values don't cause the result to be missing.  ....... .
.  IF #UseIt RiskScore = SUM(RiskScore,IndV(#Idx)).
END LOOP.

TEMPORARY.
STRING SPACE(A8).
LIST.
|
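For readers who want to check the per-case 'k/n' selection logic outside SPSS, here is a minimal Python sketch of the same idea: each variable is taken with probability (number still needed) / (number still available), so every case ends up with exactly k contributing variables. The function name random_subset_score, the use of None for a missing value, and the seed are assumptions made only for this illustration.

```python
import random

def random_subset_score(values, k, rng=random):
    """Pick exactly k of the values at random; return (flags, score) for one case.

    Hypothetical helper mirroring the #K / #N loop; None stands in for a
    missing value and is skipped, as SUM() does in the SPSS code.
    """
    if not 0 <= k <= len(values):
        raise ValueError("k must be between 0 and the number of values")
    needed = k                  # plays the role of #K
    available = len(values)     # plays the role of #N
    flags = []
    score = 0.0
    for value in values:
        # Probability needed/available: 1 forces selection, 0 forbids it.
        use_it = rng.random() < needed / available
        flags.append(int(use_it))
        if use_it:
            needed -= 1
            if value is not None:
                score += value
        available -= 1
    return flags, score

# Values of case 001 from the test data; a fresh draw is made for every call.
case_001 = [-1.14, 2.90, 0.22, 0.05, 4.22]
rng = random.Random(6111)  # seeded only so a run can be repeated
flags, score = random_subset_score(case_001, 3, rng)
print(flags, round(score, 2))
```

Calling the function once per case gives each case its own random subset, which is what the LOOP over #Idx does for each record.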
