converting zipcodes as strings to numeric

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

converting zipcodes as strings to numeric

Michael Tunik
Dear SPSSX-L listserve group

I have a large data set N=20,000.

Each individual record has a field "zip code" which contains either a 5 character string (ex. 11733, or a 5 plus 4 character string separated by a dash (ex 11733-4567).

The zip code field is not a numeric but a string variable.

Is there a way to recode the string zip code into a new field zip code numeric, that retains the same order of numbers, but converts them into a numeric variable?

Thank you so much for your suggestions.

Michael Tunik, MD
Associate Professor of Emergency Medicine and Pediatrics
NYU Grossman School of Medicine

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Maguin, Eugene
Yes, but you have two cases: zip and zip+4. You haven't told us how you want to treat the zip+4 values. I'll say you want to discard the +4 component. The variable, I'll call it zipcode, should a10 or longer.
Compute zipnumeric=number(char.substr(zipcode,1,5),f5.0).

Gene Maguin







-----Original Message-----
From: SPSSX(r) Discussion <[hidden email]> On Behalf Of Michael Tunik
Sent: Saturday, October 10, 2020 2:54 PM
To: [hidden email]
Subject: converting zipcodes as strings to numeric

Dear SPSSX-L listserve group

I have a large data set N=20,000.

Each individual record has a field "zip code" which contains either a 5 character string (ex. 11733, or a 5 plus 4 character string separated by a dash (ex 11733-4567).

The zip code field is not a numeric but a string variable.

Is there a way to recode the string zip code into a new field zip code numeric, that retains the same order of numbers, but converts them into a numeric variable?

Thank you so much for your suggestions.

Michael Tunik, MD
Associate Professor of Emergency Medicine and Pediatrics NYU Grossman School of Medicine

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Jon Peck
In reply to this post by Michael Tunik
How about this? It shifts the 5 digit zips over by 4 places to be consistent with the 9-digit zips, but you can just remove the multiplication in that compute for 5-digit zips if you want to leave those values as is.
I assigned N format to the result, which eliminates comma separators and decimal points.

data list list /zip(a10).
begin data.
11733
11733-4567
end data.
compute #dash = char.index(zip, "-").
do if #dash gt 0.
compute nzip = number(char.substr(zip,1,5), F5.0) * 10000 +number(char.substr(zip, 7), F4.0).
else.
compute nzip = number(zip, F5.0) * 10000.
end if.
format nzip(N9).
exec.

On Sat, Oct 10, 2020 at 12:53 PM Michael Tunik <[hidden email]> wrote:
Dear SPSSX-L listserve group

I have a large data set N=20,000.

Each individual record has a field "zip code" which contains either a 5 character string (ex. 11733, or a 5 plus 4 character string separated by a dash (ex 11733-4567).

The zip code field is not a numeric but a string variable.

Is there a way to recode the string zip code into a new field zip code numeric, that retains the same order of numbers, but converts them into a numeric variable?

Thank you so much for your suggestions.

Michael Tunik, MD
Associate Professor of Emergency Medicine and Pediatrics
NYU Grossman School of Medicine

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


--
Jon K Peck
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Art Kendall
Just curious.
Others have pointed out ways to do this.  I have never seen Zipcodes used
arithmetically.

Why is it a problem if it is a string field?





-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Art Kendall
Social Research Consultants
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Michael Tunik
In reply to this post by Jon Peck
Thank you for your advice.
I will try this approach and get back to you as soon as possible with results.
Best
Mike Tunik

On Sat, Oct 10, 2020, 3:43 PM Jon Peck <[hidden email]> wrote:
How about this? It shifts the 5 digit zips over by 4 places to be consistent with the 9-digit zips, but you can just remove the multiplication in that compute for 5-digit zips if you want to leave those values as is.
I assigned N format to the result, which eliminates comma separators and decimal points.

data list list /zip(a10).
begin data.
11733
11733-4567
end data.
compute #dash = char.index(zip, "-").
do if #dash gt 0.
compute nzip = number(char.substr(zip,1,5), F5.0) * 10000 +number(char.substr(zip, 7), F4.0).
else.
compute nzip = number(zip, F5.0) * 10000.
end if.
format nzip(N9).
exec.

On Sat, Oct 10, 2020 at 12:53 PM Michael Tunik <[hidden email]> wrote:
Dear SPSSX-L listserve group

I have a large data set N=20,000.

Each individual record has a field "zip code" which contains either a 5 character string (ex. 11733, or a 5 plus 4 character string separated by a dash (ex 11733-4567).

The zip code field is not a numeric but a string variable.

Is there a way to recode the string zip code into a new field zip code numeric, that retains the same order of numbers, but converts them into a numeric variable?

Thank you so much for your suggestions.

Michael Tunik, MD
Associate Professor of Emergency Medicine and Pediatrics
NYU Grossman School of Medicine

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


--
Jon K Peck
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Michael Tunik
In reply to this post by Art Kendall
Thanks for your question.
We will be matching our data base zip codes with others in another local data base. These will be numeric and I did not think they could be matched if the zip code fields were of different types.
Best
Mike Tunik

On Sat, Oct 10, 2020, 5:16 PM Art Kendall <[hidden email]> wrote:
Just curious.
Others have pointed out ways to do this.  I have never seen Zipcodes used
arithmetically.

Why is it a problem if it is a string field?





-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Art Kendall
If you are doing the matching in SPSS, you could change either source.

Do both sources have a mix of complete and incomplete ZIPs?  


There could be a mismatch if one source has a complete zip+4 and the other
source has an incomplete code. Creating 2 new temporary variables for the
matching process might make help thinking it out.  Two variables would help
when useing human eyeballs for some quality assurance checking.





-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Art Kendall
Social Research Consultants
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Michael Tunik
In reply to this post by Maguin, Eugene
Gene, thank you for the advice.
I am not sure why this did not work.  I ran this syntax and it converted the 5 and 10 character zip codes to system missing fields.
Best
Mike Tunik

On Sat, Oct 10, 2020 at 3:30 PM Maguin, Eugene <[hidden email]> wrote:
Yes, but you have two cases: zip and zip+4. You haven't told us how you want to treat the zip+4 values. I'll say you want to discard the +4 component. The variable, I'll call it zipcode, should a10 or longer.
Compute zipnumeric=number(char.substr(zipcode,1,5),f5.0).

Gene Maguin







-----Original Message-----
From: SPSSX(r) Discussion <[hidden email]> On Behalf Of Michael Tunik
Sent: Saturday, October 10, 2020 2:54 PM
To: [hidden email]
Subject: converting zipcodes as strings to numeric

Dear SPSSX-L listserve group

I have a large data set N=20,000.

Each individual record has a field "zip code" which contains either a 5 character string (ex. 11733, or a 5 plus 4 character string separated by a dash (ex 11733-4567).

The zip code field is not a numeric but a string variable.

Is there a way to recode the string zip code into a new field zip code numeric, that retains the same order of numbers, but converts them into a numeric variable?

Thank you so much for your suggestions.

Michael Tunik, MD
Associate Professor of Emergency Medicine and Pediatrics NYU Grossman School of Medicine

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


--
Michael G. Tunik
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Michael Tunik
In reply to this post by Jon Peck
Hi Jon
Thank you for the advice and the SPSS syntax.
I ran the syntax and it did convert the 5 and 10 character zip codes to a 5 character zip code with 2 decimal places.
Very helpful!
Best
Mike Tunik 

On Sat, Oct 10, 2020 at 3:43 PM Jon Peck <[hidden email]> wrote:
How about this? It shifts the 5 digit zips over by 4 places to be consistent with the 9-digit zips, but you can just remove the multiplication in that compute for 5-digit zips if you want to leave those values as is.
I assigned N format to the result, which eliminates comma separators and decimal points.

data list list /zip(a10).
begin data.
11733
11733-4567
end data.
compute #dash = char.index(zip, "-").
do if #dash gt 0.
compute nzip = number(char.substr(zip,1,5), F5.0) * 10000 +number(char.substr(zip, 7), F4.0).
else.
compute nzip = number(zip, F5.0) * 10000.
end if.
format nzip(N9).
exec.

On Sat, Oct 10, 2020 at 12:53 PM Michael Tunik <[hidden email]> wrote:
Dear SPSSX-L listserve group

I have a large data set N=20,000.

Each individual record has a field "zip code" which contains either a 5 character string (ex. 11733, or a 5 plus 4 character string separated by a dash (ex 11733-4567).

The zip code field is not a numeric but a string variable.

Is there a way to recode the string zip code into a new field zip code numeric, that retains the same order of numbers, but converts them into a numeric variable?

Thank you so much for your suggestions.

Michael Tunik, MD
Associate Professor of Emergency Medicine and Pediatrics
NYU Grossman School of Medicine

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


--
Jon K Peck
[hidden email]



--
Michael G. Tunik
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Michael Tunik
In reply to this post by Art Kendall
Hi Art
Thank you very much for the advice.
I do not believe the zip codes will need to be the 4 digit extension,just the 5 digit codes.
Once I have the zip codes I need to match with the ones in the patient data set, I will have a better idea of what is needed for the next step.
Best
Mike Tunik



On Mon, Oct 12, 2020 at 11:02 AM Art Kendall <[hidden email]> wrote:
If you are doing the matching in SPSS, you could change either source.

Do both sources have a mix of complete and incomplete ZIPs? 


There could be a mismatch if one source has a complete zip+4 and the other
source has an incomplete code. Creating 2 new temporary variables for the
matching process might make help thinking it out.  Two variables would help
when useing human eyeballs for some quality assurance checking.





-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


--
Michael G. Tunik
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Art Kendall
*If you only need the short ZIP this should work.
data list list /zip(a10).
begin data.
11733
11733-4567
01234
end data.
Numeric nzip (n5).
compute nzip = number(char.substr(zip,1,5), N5).
LIST.


Consider using FIND DUPLICATE CASES a few time to see how well you are
locating you duplicates.

Or if you are using MATCH Files be sure to check on match failures and on
false matches.



-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Art Kendall
Social Research Consultants
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Michael Tunik
Hi Art
I tried this, and the result is SPSS creates a separate data base, with only 3 zip codes 11733, 11733-4567, 01234.
These show up as 2 variables, the first variable is a string variable, and a second variable which is a restricted numeric both have the same 3 zip codes. 
I am not sure why this syntax is not converting the zip string variable in the primary data base to an nzip numeric.
Would appreciate any thoughts.
Thank you
Best
Mike Tunik


On Mon, Oct 19, 2020 at 9:45 AM Art Kendall <[hidden email]> wrote:
*If you only need the short ZIP this should work.
data list list /zip(a10).
begin data.
11733
11733-4567
01234
end data.
Numeric nzip (n5).
compute nzip = number(char.substr(zip,1,5), N5).
LIST.


Consider using FIND DUPLICATE CASES a few time to see how well you are
locating you duplicates.

Or if you are using MATCH Files be sure to check on match failures and on
false matches.



-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


--
Michael G. Tunik
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

David Marso-2
In reply to this post by Michael Tunik
How can that possibly be?
Run the transformation code on your own data rather than the sample mockup.
Face palm time.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: converting zipcodes as strings to numeric

Art Kendall
In reply to this post by Michael Tunik
* this part is only to have some example data to see whether the concept
works.
* it only touches the data it creates and not your data.
* it only create 1 string variable
*If you only need the short ZIP this should work.
data list list /zip(a10).
begin data.
11733
11733-4567
01234
end data.

* this part is suggested syntax to (a) create a numeric variable and (b)
COMPUTE a new variable.
* when the LIST command passes the data.

Numeric nzip (n5).
compute nzip = number(char.substr(zip,1,5), N5).

* the LIST command simple lists the 2 variables.  The string variable read
in from the syntax and the
* numeric variable that is COMPUTEd.
LIST.



-----
Art Kendall
Social Research Consultants
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Art Kendall
Social Research Consultants