seeking help compute a new variable - combine 2 values

thara vardhan-2
Dear List Members

I would be grateful if any member could help me combine two
variable values in one column.

For example:

I have two variables:

ERefnum  Personcni
1234     78910

How do I create a new variable with the result as 123478910?

many thanks
Thara Vardhan
Senior Statistician
Performance Improvement & Planning

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

The information contained in this email is intended for the named recipient(s)
only. It may contain private, confidential, copyright or legally privileged
information.  If you are not the intended recipient or you have received this
email by mistake, please reply to the author and delete this email immediately.
You must not copy, print, forward or distribute this email, nor place reliance
on its contents. This email and any attachment have been virus scanned. However,
you are requested to conduct a virus scan as well.  No liability is accepted
for any loss or damage resulting from a computer virus, or resulting from a delay
or defect in transmission of this email or any attached file. This email does not
constitute a representation by the NSW Police Force unless the author is legally
entitled to do so.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: seeking help compute a new variable - combine 2 values

Albert-Jan Roskam
If they're both string variables, use CONCAT:
string combi (a50).
compute combi = concat(ERefnum, Personcni).

If not, you have to make two scratch string equivalents of the two numeric variables and concatenate those.
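
The conversion step can also be illustrated outside SPSS. The following is a minimal Python sketch (not SPSS syntax) of the same idea: turn each numeric value into its string form, then join the strings. The function name `combine_ids` is hypothetical; the values are the ones from the question.

```python
def combine_ids(erefnum, personcni):
    """Join two whole numbers digit-wise and return the result as a string."""
    # Convert each number to text first; adding the numbers would sum them instead.
    return str(erefnum) + str(personcni)


combined = combine_ids(1234, 78910)
print(combined)  # 123478910
```

Keeping the result as a string preserves any leading zeros in the second value, which a numeric result would lose.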

Cheers!!
Albert-Jan


--- On Thu, 12/11/08, Thara Vardhan <[hidden email]> wrote:

> From: Thara Vardhan <[hidden email]>
> Subject: seeking help compute a new variable - combine 2 values
> To: [hidden email]
> Date: Thursday, December 11, 2008, 6:19 AM

Creating unique random numbers in Python

Albert-Jan Roskam
In reply to this post by thara vardhan-2
Hi,

Mainly for FUN, I am making a small script that produces two variables: Social Security Number (SSN) and a unique random number (RND). The SSN conforms to the modulus 11 rule. See the script below.

My question:
How do I make the solution below more scalable? Above a certain dataset size, the rnd/ssn lists get too big to be held even in virtual memory. For example, is there a Python function that generates *unique* random numbers (i.e., without replacement)? That way I could already get rid of one list. I am now using a 'set' to make the list of random numbers unique. More generally, I would like to apply generator functions, as I hope to learn more about how to use them.

Cheers!!
Albert-Jan

"""
Create a list of SSNs (or BSNs, as they are called here)
which obey the Modulus-11 Rule, and also create a list
of unique random integers.
"""

import random

size   = 10**4  # desired dataset size
offset = 10**8  # offset for bsn number
myfile = "d:/temp/random.txt"

# A set keeps only distinct values, so it may end up slightly smaller than size.
rndL   = set(random.randint(0, 10**9) for _ in range(size))
bsnL   = []
for n in range(1, 10**6):  # not a very elegant solution.
    bsn = str(n + offset).zfill(9)
    # add to valid BSN list if Modulus 11 rule is True.
    if sum(int(bsn[i]) * (9 - i) for i in range(8)) % 11 == int(bsn[8]):
        bsnL.append(bsn)
# zip() stops at the shorter input, pairing each BSN with one random number.
tbl = dict(zip(bsnL, rndL))

# Mode "w" truncates any existing file; newline="\r\n" produces Windows line endings.
with open(myfile, "w", newline="\r\n") as f_out:
    f_out.write("bsn      \trnd\n")
    for cnt, (bsn, rnd) in enumerate(tbl.items(), start=1):
        if cnt % 1000 == 0:
            print("--> Writing line %s: %s, %s" % (cnt, bsn, rnd))
        f_out.write(bsn + "\t" + str(rnd).zfill(9) + "\n")
print("--> Done!")
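
As for generator functions, the BSN list is a natural candidate: a generator can yield one valid number at a time instead of building the whole list first. The following is a minimal sketch under that assumption; the names `is_valid_bsn` and `valid_bsns` are hypothetical, and the check is the same modulus-11 test used in the script.

```python
from itertools import islice


def is_valid_bsn(bsn):
    """Return True if a 9-digit string passes the modulus-11 check from the script."""
    weighted = sum(int(bsn[i]) * (9 - i) for i in range(8))
    return weighted % 11 == int(bsn[8])


def valid_bsns(start=10**8 + 1):
    """Yield valid 9-digit BSN strings one at a time, starting from `start`."""
    for n in range(start, 10**9):
        bsn = str(n)
        if is_valid_bsn(bsn):
            yield bsn


# Take only the first five values; nothing beyond them is ever computed.
first_five = list(islice(valid_bsns(), 5))
print(first_five)
```

Because the generator is consumed lazily, a writing loop can pull values from `valid_bsns()` and write each line immediately, so memory use does not grow with the dataset size.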

Re: Creating unique random numbers in Python

Peck, Jon
The usual random number generators will not repeat a value until they exhaust their period, after which the entire sequence repeats.  The period varies according to the type of generator, but they all will generate far more nonrepeating values than you will ever have the patience to wait for.  The Python random module contains a large set of generators and includes a seed function if you want to start at a deterministic point.
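
One caveat worth noting: the period describes the generator's underlying sequence, whereas integers mapped into a bounded range (for instance with random.randint) can still repeat. For drawing without replacement, the standard library offers random.sample. A minimal sketch, assuming a population of 0 to 10**9 - 1 and a sample size of 10**4 (both borrowed from the script; the seed value is arbitrary):

```python
import random

random.seed(12345)  # arbitrary seed, only to make the run reproducible

size = 10**4
# range() is lazy, so the 10**9 candidate values are never materialised;
# only the `size` chosen values are held in memory.
unique_rnd = random.sample(range(10**9), size)

print(len(unique_rnd), len(set(unique_rnd)))  # 10000 10000
```

This replaces the set-based deduplication and always returns exactly `size` distinct values.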

HTH,
Jon Peck
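
If even a sampled list is too large to hold, one alternative (not proposed in this thread) is to stream unique values with constant memory. For a multiplier coprime to the modulus, the map i -> (multiplier * i + shift) % modulus visits every value below the modulus exactly once. The sketch below assumes that scrambled-looking but statistically weak values are acceptable; the function name `unique_stream` and its constants are hypothetical.

```python
from math import gcd


def unique_stream(count, modulus=10**9, multiplier=387_420_489, shift=123_456_789):
    """Yield `count` distinct integers in [0, modulus) using O(1) memory."""
    if gcd(multiplier, modulus) != 1:
        raise ValueError("multiplier must be coprime to modulus")
    if count > modulus:
        raise ValueError("cannot draw more unique values than the modulus")
    for i in range(count):
        # An affine map with a coprime multiplier is a bijection modulo `modulus`,
        # so distinct i always give distinct outputs.
        yield (multiplier * i + shift) % modulus


values = list(unique_stream(10**4))
print(len(values) == len(set(values)))  # True
```

The values are only collected into a list here to check uniqueness; in a writing loop they would be consumed one at a time.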

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Albert-jan Roskam
Sent: Thursday, December 11, 2008 1:54 AM
To: [hidden email]
Subject: [SPSSX-L] Creating unique random numbers in Python
