zip code distance

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

zip code distance

paul wilson-7
I have a database of customers that includes their address and zip code.
I'd like to calculate each customer's distance from the store so each customer in the database
has the distance in either miles or km next to his/her name.
I'm guessing it's possible to somehow calculate distance between zip codes.
Can this be done in SPSS?
If so I'd greatly appreciate if someone can explain how and provide an example code.

Thanks a lot!

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: zip code distance

Nagaraju Thogiti
Hi Paul,

If you can populate the longitude and latitude of the zip code in the
dataset, then the distance calculation becomes a trivial trigonometric
formula.

Here is small script (SQL) that calculates the distance between two
zipcodes:

-- sets average radius of earth in miles
@radius = 3958.75

@a = sqrt(square(sin((@latitude2-@latitude1)/2.0e)) +
        (cos(@latitude1) * cos(@latitude2) * square(sin((@longitude2-
@longitude1)/2.0e))) )

@distance = @radius * ( 2.0e *asin(case when 1.0e < @a then 1.0e else @a
end ))

If you don't have longitude and latitude data, you can always download
from Google and map it back to your dataset.

If you can't find it, please send me an email, I'll send an Excel file
with this data.

Thank you,
Nagu

Nagaraju Thogiti | Data Mining Analyst
[hidden email]
Tel: (310) 354-5600 x 4885
Fax: (310)  630-3142
AF Services, LLC.
www.pcmall.com


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
paul wilson
Sent: Thursday, October 16, 2008 5:10 PM
To: [hidden email]
Subject: zip code distance

I have a database of customers that includes their address and zip code.
I'd like to calculate each customer's distance from the store so each
customer in the database
has the distance in either miles or km next to his/her name.
I'm guessing it's possible to somehow calculate distance between zip
codes.
Can this be done in SPSS?
If so I'd greatly appreciate if someone can explain how and provide an
example code.

Thanks a lot!

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: zip code distance

Richard Ristow
In reply to this post by paul wilson-7
At 08:09 PM 10/16/2008, paul wilson wrote:

>I have a database of customers that includes
>their address and zip code. I'd like to
>calculate each customer's distance from the
>store so each customer in the database has the
>distance in either miles or km next to his/her
>name. I'm guessing it's possible to somehow
>calculate distance between zip codes.

There are two parts to this: First, finding
'where' a ZIP code is, i.e. the latitude and
longitude of its centroid; then, calculating
distance between two points, with their latitude and longitude given.

For a database of ZIP centroids, I've Googled,
but you can follow up on this as well as I can.
There are various commercial databases.

Looking for free ones, I found (but did not explore),

http://geocoder.ibegin.com/downloads.php offering for download,

>Zip Codes
>
>File of all US ZIP codes. Includes zip, city,
>state, latitude, longitude, and county (Last Updated: January 18, 2008):

http://spatialnews.geocomm.com/newsletter/2000/jan/zipcodes.html

A list of sources, with a cover article that states,

>"Unfortunately, free postal data in a useable
>format is very tough to come by. We've put
>together this collection of resources to help
>you out. Commercial data is available, however,
>don't expect to get it for free - lots of work
>has gone into developing this data and keeping it up to date."


And, see what you can find at http://www.stumbleupon.com/tag/zip-code-database/

....................
Nagaraju Thogiti posted a SQL calculation to find
the distance between two points, with latitude
and longitude given. This can also be done in
SPSS transformation language(1), (2); the code
below has not been tested since editing. There's
also a more accurate Python solution on Developer Central(4).

*  Initialize constants  ............  .
DO IF $CASENUM EQ 1.
*  #EarthRad is the Earth's radius in whatever units you please; .
*  the calculated distance will be in those units. Radius (3):   .
*              6,372.7976 km,
*              3,959.873  statute miles,
*              3,441.035  nautical miles.
.  COMPUTE
    #EarthRad = 6,372.7976 /* km */.

*  #AngleCvt is the number of your angle units (degrees) in      .
*  one of SPSS's angle units (radians). It uses that ARCTAN(1)  .
*  is PI/4 radians.                                              .
    #AngleCvt =  360  /* Number of input units in a full circle */.
               /(8*ARCTAN(1)).

END IF.


*  Compute distance between points with coordinates              .
*  (lat1,lon1) and (lat2,lon2)                                   .

compute distance = #EarthRad*
                     arcos(sin(lat1/#AngleCvt)*sin(lat2/#AngleCvt)
                         + cos(lat1/#AngleCvt)*cos(lat2/#AngleCvt
                           *cos(lon2/#AngleCvt-lon1/#AngleCvt)).


--------------------
(1)Date: Tue, 12 Aug 2008 15:56:48 -0400
From:    Richard Ristow <[hidden email]>
Subject: Re: Calculating Distance using Latitude and Longitude
To:      [hidden email]

(2)Date: Thu, 31 May 2007 11:35:53 -0400
From:    Richard Ristow <[hidden email]>
Subject: Re: Function for arc cosine
To:      [hidden email]

(3) From http://en.wikipedia.org/wiki/Earth_radius:
Quadratic mean radius: Qr

"The ellipsoidal quadratic mean radius provides
the best approximation of Earth's average
transverse meridional radius and radius of
curvature ... This is the equivalent spherical
"great-circle" radius of the ellipsoid).
For Earth, Qr equals 6,372.7976 km (˜3,959.873 mi; ˜3,441.035 nmi)."

See also http://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html)

(4) Date: Mon, 11 Aug 2008 13:36:19 -0500
From:     "Peck, Jon" <[hidden email]>
Subject:  Re: SPSS Syntax Calculating Distance
using Latitude and               Longitude
To: [hidden email]

"In the extendedTransforms.py programmability
module on SPSS Developer Central
(www.spss.com/devcentral) there are two functions
for this: sphDist and ellipseDist. They take
latitude and longitude values in degrees or in
radians and calculate the distance.

sphDist uses a spherical approximation to the
earth and ellipseDist uses an ellipsoidal
approximation, which is a bit more accurate."

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: zip code distance

Peck, Jon
There are two sets of issues with using zip codes in this way: the acquisition of the data and the technical computation are addressed by Richard, but one might wonder whether this calculation is a good idea.

What is meant by the distance of a zip code from a store?  Zip codes are not necessarily nice, compact shapes or even geometrically closed:  the Sears Tower has a nine-digit zip for each floor.  Unless the areas are acceptably far from the store (mail order?), the distance may not be well defined.  And, of course, geographical distance may not be a good indicator of travel time, if that is the latent idea.

So this enterprise really depends on what is to be made of these distances.

Regards,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Richard Ristow
Sent: Wednesday, October 22, 2008 9:34 PM
To: [hidden email]
Subject: Re: [SPSSX-L] zip code distance

At 08:09 PM 10/16/2008, paul wilson wrote:

>I have a database of customers that includes
>their address and zip code. I'd like to
>calculate each customer's distance from the
>store so each customer in the database has the
>distance in either miles or km next to his/her
>name. I'm guessing it's possible to somehow
>calculate distance between zip codes.

There are two parts to this: First, finding
'where' a ZIP code is, i.e. the latitude and
longitude of its centroid; then, calculating
distance between two points, with their latitude and longitude given.

For a database of ZIP centroids, I've Googled,
but you can follow up on this as well as I can.
There are various commercial databases.

Looking for free ones, I found (but did not explore),

http://geocoder.ibegin.com/downloads.php offering for download,

>Zip Codes
>
>File of all US ZIP codes. Includes zip, city,
>state, latitude, longitude, and county (Last Updated: January 18, 2008):

http://spatialnews.geocomm.com/newsletter/2000/jan/zipcodes.html

A list of sources, with a cover article that states,

>"Unfortunately, free postal data in a useable
>format is very tough to come by. We've put
>together this collection of resources to help
>you out. Commercial data is available, however,
>don't expect to get it for free - lots of work
>has gone into developing this data and keeping it up to date."


And, see what you can find at http://www.stumbleupon.com/tag/zip-code-database/

....................
Nagaraju Thogiti posted a SQL calculation to find
the distance between two points, with latitude
and longitude given. This can also be done in
SPSS transformation language(1), (2); the code
below has not been tested since editing. There's
also a more accurate Python solution on Developer Central(4).

*  Initialize constants  ............  .
DO IF $CASENUM EQ 1.
*  #EarthRad is the Earth's radius in whatever units you please; .
*  the calculated distance will be in those units. Radius (3):   .
*              6,372.7976 km,
*              3,959.873  statute miles,
*              3,441.035  nautical miles.
.  COMPUTE
    #EarthRad = 6,372.7976 /* km */.

*  #AngleCvt is the number of your angle units (degrees) in      .
*  one of SPSS's angle units (radians). It uses that ARCTAN(1)  .
*  is PI/4 radians.                                              .
    #AngleCvt =  360  /* Number of input units in a full circle */.
               /(8*ARCTAN(1)).

END IF.


*  Compute distance between points with coordinates              .
*  (lat1,lon1) and (lat2,lon2)                                   .

compute distance = #EarthRad*
                     arcos(sin(lat1/#AngleCvt)*sin(lat2/#AngleCvt)
                         + cos(lat1/#AngleCvt)*cos(lat2/#AngleCvt
                           *cos(lon2/#AngleCvt-lon1/#AngleCvt)).


--------------------
(1)Date: Tue, 12 Aug 2008 15:56:48 -0400
From:    Richard Ristow <[hidden email]>
Subject: Re: Calculating Distance using Latitude and Longitude
To:      [hidden email]

(2)Date: Thu, 31 May 2007 11:35:53 -0400
From:    Richard Ristow <[hidden email]>
Subject: Re: Function for arc cosine
To:      [hidden email]

(3) From http://en.wikipedia.org/wiki/Earth_radius:
Quadratic mean radius: Qr

"The ellipsoidal quadratic mean radius provides
the best approximation of Earth's average
transverse meridional radius and radius of
curvature ... This is the equivalent spherical
"great-circle" radius of the ellipsoid).
For Earth, Qr equals 6,372.7976 km (~3,959.873 mi; ~3,441.035 nmi)."

See also http://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html)

(4) Date: Mon, 11 Aug 2008 13:36:19 -0500
From:     "Peck, Jon" <[hidden email]>
Subject:  Re: SPSS Syntax Calculating Distance
using Latitude and               Longitude
To: [hidden email]

"In the extendedTransforms.py programmability
module on SPSS Developer Central
(www.spss.com/devcentral) there are two functions
for this: sphDist and ellipseDist. They take
latitude and longitude values in degrees or in
radians and calculate the distance.

sphDist uses a spherical approximation to the
earth and ellipseDist uses an ellipsoidal
approximation, which is a bit more accurate."

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: zip code distance

Richard Ristow
In reply to this post by paul wilson-7
At 03:47 AM 10/24/2008, Downs, Clive asked, off-list:

>Thank you very much for your syntax for calculating distance between
>two points defined by latitude and longitude (posted on SPSS-X list Aug 12).
>
>I have tried this out, and there were a couple of amendments I made ...

This'll teach me to post complicated code without thorough testing.

>One was to insert a COMPUTE keyword before the computation of #AngleCvt

Sigh. An elementary syntax error.

>There still seems to be a problem: I couldn't seem to find an SPSS
>function ARCOS.

Reasonably, since there isn't one. There is, however, ARSIN.
ArcCosine can be computed using the identity

ArcCos(x) = PI/2-ArcSin(X).
or
ArcCos(x) = ArcTan(1)-ArcSin(X).

The following is tested.  The 'given' values which are compared with
the calculation are,

. Providence to Chicago distance, from Jon Peck's posting "Re:
Function for arc cosine", Thu, 7 Jun 2007 09:53:06 -0500

. Others, arbitrary test points with distance calculated at site
http://www.movable-type.co.uk/scripts/latlong.html.

I'm not sure why the tiny differences remain, but the code seems to
be essentially reliable. It could be useful to wrap it in a macro; or
in two macros, one for the initializations and one for the computation.

Earth-radius values are from
http://en.wikipedia.org/wiki/Earth_radius; see also
http://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html.

List
|-----------------------------|---------------------------|
|Output Created               |28-OCT-2008 20:10:58       |
|-----------------------------|---------------------------|
[TestData]

City1    lat1    lon1 City2    lat2    lon2    Given

Pvd     41.90   87.65 Chi     41.73   71.43   836.27
A1      42.00   80.00 A2      39.00   70.00   564.33
B1      44.00   70.00 B2      49.00   85.00   791.00

Number of cases read:  3    Number of cases listed:  3

DATASET ACTIVATE TestData   WINDOW=FRONT.
DATASET COPY     Production WINDOW=FRONT.
DATASET ACTIVATE Production WINDOW=FRONT.
*  ...............   Initialize constants      ................. .

*  The following code >REQUIRES< that angles in the base system  .
*  (SPSS) be in radians, so that the trigonometric distance is   .
*  in radians, and can be multiplied directly by the Earth's     .
*  radius.                                                       .

DO IF $CASENUM EQ 1.
*  These initializations >MUST< be performed:                    .
*  #EarthRad is the Earth's radius in whatever units you please; .
*  the calculated distance will be in those units:               .
*              6,372.7976    km,                                 .
*              3,959.873     statute miles,                      .
*              3,441.035     nautical miles.                     .
.  COMPUTE
    #EarthRad =  3959.873  /* statute miles */.

*  #AngleCvt is the number of your angle units (degrees, here)   .
*  in one of SPSS's angle units (radians). It uses that          .
*  ARCTAN(1)is PI/4 radians or (in any angle measure) 1/8 circle .
.
.  COMPUTE
    #AngleCvt =  360  /* Number of input units in a full circle */
               /(8*ARTAN(1)).

END IF.

*  ...............   Compute distance          ................  .
*  Compute distance between points with coordinates              .
*  (lat1,lon1) and (lat2,lon2)                                   .

compute distance = #EarthRad*
     (2*artan(1)-arsin(      sin(lat1/#AngleCvt)   /* (sin(lat1)  */
                            *sin(lat2/#AngleCvt)   /* .sin(lat2)  */
                          +  cos(lat1/#AngleCvt)   /* +cos(lat1)  */
                            *cos(lat2/#AngleCvt)   /* .cos(lat2)  */
                            *cos(lon2/#AngleCvt    /* .cos(long2  */
                                -lon1/#AngleCvt)   /*     -long1))*/
                       )).

FORMAT    Distance (F7.2).

COMPUTE   DeltaPct = 100*(Distance/Given-1).
FORMATS   DeltaPct (PCT7.2).

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-OCT-2008 20:11:32       |
|-----------------------------|---------------------------|
[Production]

City1    lat1    lon1 City2    lat2    lon2    Given distance DeltaPct

Pvd     41.90   87.65 Chi     41.73   71.43   836.27   834.33    -.23%
A1      42.00   80.00 A2      39.00   70.00   564.33   564.52     .03%
B1      44.00   70.00 B2      49.00   85.00   791.00   791.03     .00%

Number of cases read:  3    Number of cases listed:  3
=================================
APPENDIX: Test data, and code
(WRR: Code also in separate file)
=================================
*  ................................................................. .
*  .................   Test data               ..................... .
*  Test data (one case, only) from Jon Peck's posting                .
*  Date:    Thu, 7 Jun 2007 09:53:06 -0500                           .
*  From:    "Peck, Jon" <[hidden email]>                              .
*  Subject: Re: Function for arc cosine                              .
*  To:      [hidden email]                                 .
*                                                                    .
*  "In the extendedTransforms module on SPSS Developer Central       .
*  (www.spss.com/devcentral), there are two functions that           .
*  implement distance calculations on Earth latitude and longitude   .
*  coordinates."                                                     .
*                                                                    .
*  Here is a simple usage example for just a single distance pair."  .
*                                                                    .
*  fromloc = (41.90, 87.65)                                          .
*  toloc   = (41.73, 71.43)                                          .
*                                                                    .
*  "Using the coordinates above (Chicago, IL to Providence, RI),     .
*  the ellipsoidal and spherical results are                         .
*  836.2705 and 834.0903 miles."                                     .

*  Examples A and B calculated using                                 .
*  http://www.movable-type.co.uk/scripts/latlong.html                .
DATA LIST LIST
   /City1      lat1 lon1    City2    lat2 lon2 Given
    (A3,       2F6.2,       A3,      2F6.2,     F7.2) .
BEGIN DAT
    Pvd        41.90, 87.65 Chi      41.73, 71.43 836.2705
    A1         42     80    A2       39     70    564.33         908.2 km
    B1         44     70    B2       49     85    791.00         1273  km
END DATA.
DATASET NAME TestData.
LIST.

*  ............................................................. .
*  .................   Code                ..................... .


DATASET ACTIVATE TestData   WINDOW=FRONT.
DATASET COPY     Production WINDOW=FRONT.
DATASET ACTIVATE Production WINDOW=FRONT.
*  ...............   Initialize constants      ................. .


*  The following code >REQUIRES< that angles in the base system  .
*  (SPSS) be in radians, so that the trigonometric distance is   .
*  in radians, and can be multiplied directly by the Earth's     .
*  radius.                                                       .

DO IF $CASENUM EQ 1.
*  These initializations >MUST< be performed:                    .
*  #EarthRad is the Earth's radius in whatever units you please; .
*  the calculated distance will be in those units:               .
*              6,372.7976    km,                                 .
*              3,959.873     statute miles,                      .
*              3,441.035     nautical miles.                     .
.  COMPUTE
    #EarthRad =  3959.873  /* statute miles */.

*  #AngleCvt is the number of your angle units (degrees, here)   .
*  in one of SPSS's angle units (radians). It uses that          .
*  ARCTAN(1)is PI/4 radians or (in any angle measure) 1/8 circle .
.
.  COMPUTE
    #AngleCvt =  360  /* Number of input units in a full circle */
               /(8*ARTAN(1)).

END IF.

*  ...............   Compute distance          ................  .
*  Compute distance between points with coordinates              .
*  (lat1,lon1) and (lat2,lon2)                                   .

compute distance = #EarthRad*
     (2*artan(1)-arsin(      sin(lat1/#AngleCvt)   /* (sin(lat1)  */
                            *sin(lat2/#AngleCvt)   /* .sin(lat2)  */
                          +  cos(lat1/#AngleCvt)   /* +cos(lat1)  */
                            *cos(lat2/#AngleCvt)   /* .cos(lat2)  */
                            *cos(lon2/#AngleCvt    /* .cos(long2  */
                                -lon1/#AngleCvt)   /*     -long1))*/
                       )).

FORMAT    Distance (F7.2).

COMPUTE   DeltaPct = 100*(Distance/Given-1).
FORMATS   DeltaPct (PCT7.2).

LIST.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: zip code distance

Clive Downs
In reply to this post by paul wilson-7
Hi Richard,

Thank you very much for amending this syntax. This code will be very useful
to me. Also, I will try to improve my knowledge of trigonometry!

Regards,

Clive.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD