I would like to create a random number based on a list of ID numbers, in this case Social Security Numbers. The file lists the same ID number for each of an individual's different service periods. I have tried to compute a unique ID using

COMPUTE uniqueID=RV.UNIFORM(100000,400000).
EXECUTE.

but I end up with a different number for each line. What I would like to get is something like this, so that I can link an SSN to one number:

SSN         random number
123456789   random number 1
123456789   random number 1
123456789   random number 1
123456789   random number 1
987654321   random number 2
456789123   random number 3
456789123   random number 3
456789123   random number 3
456789123   random number 3
999999999   random number 4

Thank you so much in advance. I really appreciate being able to come to this group and get help.

Vicki L. Stirkey
OMHSAS | Bureau of Quality Management and Data Review
112 East Azalea Drive | Hbg PA 17110
Phone: 717.705.8198 | Fax: 717.772.6737
Hello,
Perhaps with a table lookup: first create a file that pairs each id with an encrypted id, then match that file to your data.
set rng=mt mtindex=43210.
dataset name target.
begin program.
import spss
spss.SetMacroValue("!ncases", spss.GetCaseCount())
end program.
input program.
+numeric id uniqueID (n9).
+loop id = 1 to !ncases.
+  compute uniqueID=trunc(rv.uniform(0,10**7)).
+  end case.
+end loop.
+end file.
end input program.
execute.
dataset name source.
* aggregate outfile = * /break = uniqueID /n = n. /* check for duplicates.
match files /table = source /file = target /by id.
execute.

Regards,
Albert-Jan
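The table-lookup idea can also be sketched outside SPSS syntax. What follows is a minimal Python sketch, not Albert-Jan's code: the function name `build_lookup`, the code range, and the seed value are illustrative assumptions. It draws the codes without replacement, so two IDs can never receive the same code.

```python
import random

def build_lookup(ids, low=100000, high=400000, seed=43210):
    """Map each distinct ID to one random code; codes never repeat.

    low, high and seed are illustrative assumptions, not values
    required by SPSS or by the original post.
    """
    distinct = sorted(set(ids))
    rng = random.Random(seed)  # fixed seed so the table can be rebuilt
    # random.sample draws without replacement, so every code is unique.
    codes = rng.sample(range(low, high), len(distinct))
    return dict(zip(distinct, codes))

# One row per service period, as in the question.
ssns = [123456789] * 4 + [987654321] + [456789123] * 4 + [999999999]
lookup = build_lookup(ssns)

# Apply the lookup: every row with the same SSN gets the same code.
coded = [lookup[s] for s in ssns]
for ssn, code in zip(ssns, coded):
    print(ssn, code)
```

Keeping the lookup as a separate table mirrors the match-files step: the codes are generated once per distinct ID and then joined back to the multi-row file.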
In reply to this post by vstirkey
Vicki, you can use lags the exact same way as you requested the other day for this. Example below.
*******************************************.
data list free / SSN.
begin data
123456789
123456789
123456789
123456789
987654321
456789123
456789123
456789123
456789123
999999999
end data.
DO IF $casenum = 1 or SSN <> lag(SSN).
  compute uniqueID=TRUNC(RV.UNIFORM(100000,400000)).
ELSE IF SSN = lag(SSN).
  compute uniqueID=lag(uniqueID).
END IF.
exe.
*******************************************.

If you are confused by this, feel free to ask; it is better to learn what is going on than to ask the list for small variants repeatedly.

Also note that this does not guarantee the id will be unique anyway! Best to check afterwards that this is indeed the case. (Also note that I truncated the value; I would prefer not to have unique ids as full-precision numeric values, in case of some truncation later on down the line.)
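To put the uniqueness warning in numbers: a truncated draw from 100000 to 400000 can take about 300,000 distinct integer values, so independent draws for many IDs will sometimes coincide. The Python sketch below is only an illustration under that assumption; the function names `collision_probability` and `shared_codes` are made up for this example.

```python
from collections import defaultdict

def collision_probability(n_ids, n_codes=300000):
    """Chance that at least two of n_ids independent draws coincide.

    n_codes=300000 assumes integer codes from 100000 through 399999.
    """
    p_all_distinct = 1.0
    for i in range(n_ids):
        p_all_distinct *= (n_codes - i) / n_codes
    return 1.0 - p_all_distinct

def shared_codes(pairs):
    """Return the codes that ended up attached to more than one ID."""
    owners = defaultdict(set)
    for ident, code in pairs:
        owners[code].add(ident)
    return {code: ids for code, ids in owners.items() if len(ids) > 1}

# Hypothetical (ID, code) rows: IDs 1 and 2 accidentally share code 5.
rows = [(1, 5), (1, 5), (2, 5), (3, 6)]
print(shared_codes(rows))
print(collision_probability(1000))
```

With 1,000 distinct IDs the chance of at least one clash is already around 0.8, which is why the after-the-fact check matters.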
In reply to this post by vstirkey
That is the nature of random number generation.
If your goal is to scramble SSNs but preserve the structure, you
can use the SPSSINC ANON extension command. You need the Python Essentials
and this extension command, available from the SPSS Community website
(www.ibm.com/developerworks/spssdevcentral).
Here is an example:

SPSSINC ANON VARIABLES = ssn
  /OPTIONS ONETOONE = ssn METHOD=RANDOM.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: "Stirkey, Vicki" <[hidden email]>
To: [hidden email]
Date: 06/05/2013 08:41 AM
Subject: [SPSSX-L] Create random number
Sent by: "SPSSX(r) Discussion" <[hidden email]>
In reply to this post by Andy W
First:
sort cases by ssn.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

From: Andy W <[hidden email]>
To: [hidden email]
Date: 06/05/2013 10:12 AM
Subject: Re: Create random number
Sent by: "SPSSX(r) Discussion" <[hidden email]>
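Why the sort matters: the lag approach hands out a new number whenever the SSN differs from the previous row, so an SSN whose rows are not adjacent would get more than one number. The Python sketch below imitates that run-by-run logic; `codes_by_run`, `ids_with_several_codes`, and the `draw` callable are hypothetical names used only for illustration.

```python
import random
from itertools import groupby

def codes_by_run(ids, draw):
    """Give every run of consecutive equal IDs one freshly drawn code.

    draw is any zero-argument callable that returns a code.
    """
    out = []
    for ident, run in groupby(ids):
        code = draw()  # one draw per run, like the DO IF branch
        out.extend((ident, code) for _ in run)
    return out

def ids_with_several_codes(pairs):
    """Return the IDs that were given more than one code."""
    seen = {}
    for ident, code in pairs:
        seen.setdefault(ident, set()).add(code)
    return {ident for ident, codes in seen.items() if len(codes) > 1}

rng = random.Random(43210)  # seed value is an arbitrary assumption
draw = lambda: rng.randrange(100000, 400000)

unsorted_ssns = [456789123, 123456789, 456789123]
# Sorting first makes equal SSNs adjacent, so each gets exactly one code.
pairs = codes_by_run(sorted(unsorted_ssns), draw)
print(ids_with_several_codes(pairs))
```

Running `codes_by_run` on the unsorted list instead would draw twice for 456789123, which is the situation the sort prevents.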
In reply to this post by Andy W
The SPSSINC ANON extension command guarantees
uniqueness for the RANDOM method when used with the ONETOONE keyword.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621