|
I have a database of customers that includes their address and zip code.
I'd like to calculate each customer's distance from the store so each customer in the database has the distance in either miles or km next to his/her name. I'm guessing it's possible to somehow calculate distance between zip codes. Can this be done in SPSS? If so I'd greatly appreciate if someone can explain how and provide an example code. Thanks a lot! __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Hi Paul,
If you can populate the longitude and latitude of the zip code in the dataset, then the distance calculation becomes a trivial trigonometric formula. Here is small script (SQL) that calculates the distance between two zipcodes: -- sets average radius of earth in miles @radius = 3958.75 @a = sqrt(square(sin((@latitude2-@latitude1)/2.0e)) + (cos(@latitude1) * cos(@latitude2) * square(sin((@longitude2- @longitude1)/2.0e))) ) @distance = @radius * ( 2.0e *asin(case when 1.0e < @a then 1.0e else @a end )) If you don't have longitude and latitude data, you can always download from Google and map it back to your dataset. If you can't find it, please send me an email, I'll send an Excel file with this data. Thank you, Nagu Nagaraju Thogiti | Data Mining Analyst [hidden email] Tel: (310) 354-5600 x 4885 Fax: (310) 630-3142 AF Services, LLC. www.pcmall.com -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of paul wilson Sent: Thursday, October 16, 2008 5:10 PM To: [hidden email] Subject: zip code distance I have a database of customers that includes their address and zip code. I'd like to calculate each customer's distance from the store so each customer in the database has the distance in either miles or km next to his/her name. I'm guessing it's possible to somehow calculate distance between zip codes. Can this be done in SPSS? If so I'd greatly appreciate if someone can explain how and provide an example code. Thanks a lot! __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by paul wilson-7
At 08:09 PM 10/16/2008, paul wilson wrote:
>I have a database of customers that includes >their address and zip code. I'd like to >calculate each customer's distance from the >store so each customer in the database has the >distance in either miles or km next to his/her >name. I'm guessing it's possible to somehow >calculate distance between zip codes. There are two parts to this: First, finding 'where' a ZIP code is, i.e. the latitude and longitude of its centroid; then, calculating distance between two points, with their latitude and longitude given. For a database of ZIP centroids, I've Googled, but you can follow up on this as well as I can. There are various commercial databases. Looking for free ones, I found (but did not explore), http://geocoder.ibegin.com/downloads.php offering for download, >Zip Codes > >File of all US ZIP codes. Includes zip, city, >state, latitude, longitude, and county (Last Updated: January 18, 2008): http://spatialnews.geocomm.com/newsletter/2000/jan/zipcodes.html A list of sources, with a cover article that states, >"Unfortunately, free postal data in a useable >format is very tough to come by. We've put >together this collection of resources to help >you out. Commercial data is available, however, >don't expect to get it for free - lots of work >has gone into developing this data and keeping it up to date." And, see what you can find at http://www.stumbleupon.com/tag/zip-code-database/ .................... Nagaraju Thogiti posted a SQL calculation to find the distance between two points, with latitude and longitude given. This can also be done in SPSS transformation language(1), (2); the code below has not been tested since editing. There's also a more accurate Python solution on Developer Central(4). * Initialize constants ............ . DO IF $CASENUM EQ 1. * #EarthRad is the Earth's radius in whatever units you please; . * the calculated distance will be in those units. Radius (3): . * 6,372.7976 km, * 3,959.873 statute miles, * 3,441.035 nautical miles. . COMPUTE #EarthRad = 6,372.7976 /* km */. * #AngleCvt is the number of your angle units (degrees) in . * one of SPSS's angle units (radians). It uses that ARCTAN(1) . * is PI/4 radians. . #AngleCvt = 360 /* Number of input units in a full circle */. /(8*ARCTAN(1)). END IF. * Compute distance between points with coordinates . * (lat1,lon1) and (lat2,lon2) . compute distance = #EarthRad* arcos(sin(lat1/#AngleCvt)*sin(lat2/#AngleCvt) + cos(lat1/#AngleCvt)*cos(lat2/#AngleCvt *cos(lon2/#AngleCvt-lon1/#AngleCvt)). -------------------- (1)Date: Tue, 12 Aug 2008 15:56:48 -0400 From: Richard Ristow <[hidden email]> Subject: Re: Calculating Distance using Latitude and Longitude To: [hidden email] (2)Date: Thu, 31 May 2007 11:35:53 -0400 From: Richard Ristow <[hidden email]> Subject: Re: Function for arc cosine To: [hidden email] (3) From http://en.wikipedia.org/wiki/Earth_radius: Quadratic mean radius: Qr "The ellipsoidal quadratic mean radius provides the best approximation of Earth's average transverse meridional radius and radius of curvature ... This is the equivalent spherical "great-circle" radius of the ellipsoid). For Earth, Qr equals 6,372.7976 km (3,959.873 mi; 3,441.035 nmi)." See also http://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html) (4) Date: Mon, 11 Aug 2008 13:36:19 -0500 From: "Peck, Jon" <[hidden email]> Subject: Re: SPSS Syntax Calculating Distance using Latitude and Longitude To: [hidden email] "In the extendedTransforms.py programmability module on SPSS Developer Central (www.spss.com/devcentral) there are two functions for this: sphDist and ellipseDist. They take latitude and longitude values in degrees or in radians and calculate the distance. sphDist uses a spherical approximation to the earth and ellipseDist uses an ellipsoidal approximation, which is a bit more accurate." ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
There are two sets of issues with using zip codes in this way: the acquisition of the data and the technical computation are addressed by Richard, but one might wonder whether this calculation is a good idea.
What is meant by the distance of a zip code from a store? Zip codes are not necessarily nice, compact shapes or even geometrically closed: the Sears Tower has a nine-digit zip for each floor. Unless the areas are acceptably far from the store (mail order?), the distance may not be well defined. And, of course, geographical distance may not be a good indicator of travel time, if that is the latent idea. So this enterprise really depends on what is to be made of these distances. Regards, Jon Peck -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Richard Ristow Sent: Wednesday, October 22, 2008 9:34 PM To: [hidden email] Subject: Re: [SPSSX-L] zip code distance At 08:09 PM 10/16/2008, paul wilson wrote: >I have a database of customers that includes >their address and zip code. I'd like to >calculate each customer's distance from the >store so each customer in the database has the >distance in either miles or km next to his/her >name. I'm guessing it's possible to somehow >calculate distance between zip codes. There are two parts to this: First, finding 'where' a ZIP code is, i.e. the latitude and longitude of its centroid; then, calculating distance between two points, with their latitude and longitude given. For a database of ZIP centroids, I've Googled, but you can follow up on this as well as I can. There are various commercial databases. Looking for free ones, I found (but did not explore), http://geocoder.ibegin.com/downloads.php offering for download, >Zip Codes > >File of all US ZIP codes. Includes zip, city, >state, latitude, longitude, and county (Last Updated: January 18, 2008): http://spatialnews.geocomm.com/newsletter/2000/jan/zipcodes.html A list of sources, with a cover article that states, >"Unfortunately, free postal data in a useable >format is very tough to come by. We've put >together this collection of resources to help >you out. Commercial data is available, however, >don't expect to get it for free - lots of work >has gone into developing this data and keeping it up to date." And, see what you can find at http://www.stumbleupon.com/tag/zip-code-database/ .................... Nagaraju Thogiti posted a SQL calculation to find the distance between two points, with latitude and longitude given. This can also be done in SPSS transformation language(1), (2); the code below has not been tested since editing. There's also a more accurate Python solution on Developer Central(4). * Initialize constants ............ . DO IF $CASENUM EQ 1. * #EarthRad is the Earth's radius in whatever units you please; . * the calculated distance will be in those units. Radius (3): . * 6,372.7976 km, * 3,959.873 statute miles, * 3,441.035 nautical miles. . COMPUTE #EarthRad = 6,372.7976 /* km */. * #AngleCvt is the number of your angle units (degrees) in . * one of SPSS's angle units (radians). It uses that ARCTAN(1) . * is PI/4 radians. . #AngleCvt = 360 /* Number of input units in a full circle */. /(8*ARCTAN(1)). END IF. * Compute distance between points with coordinates . * (lat1,lon1) and (lat2,lon2) . compute distance = #EarthRad* arcos(sin(lat1/#AngleCvt)*sin(lat2/#AngleCvt) + cos(lat1/#AngleCvt)*cos(lat2/#AngleCvt *cos(lon2/#AngleCvt-lon1/#AngleCvt)). -------------------- (1)Date: Tue, 12 Aug 2008 15:56:48 -0400 From: Richard Ristow <[hidden email]> Subject: Re: Calculating Distance using Latitude and Longitude To: [hidden email] (2)Date: Thu, 31 May 2007 11:35:53 -0400 From: Richard Ristow <[hidden email]> Subject: Re: Function for arc cosine To: [hidden email] (3) From http://en.wikipedia.org/wiki/Earth_radius: Quadratic mean radius: Qr "The ellipsoidal quadratic mean radius provides the best approximation of Earth's average transverse meridional radius and radius of curvature ... This is the equivalent spherical "great-circle" radius of the ellipsoid). For Earth, Qr equals 6,372.7976 km (~3,959.873 mi; ~3,441.035 nmi)." See also http://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html) (4) Date: Mon, 11 Aug 2008 13:36:19 -0500 From: "Peck, Jon" <[hidden email]> Subject: Re: SPSS Syntax Calculating Distance using Latitude and Longitude To: [hidden email] "In the extendedTransforms.py programmability module on SPSS Developer Central (www.spss.com/devcentral) there are two functions for this: sphDist and ellipseDist. They take latitude and longitude values in degrees or in radians and calculate the distance. sphDist uses a spherical approximation to the earth and ellipseDist uses an ellipsoidal approximation, which is a bit more accurate." ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by paul wilson-7
At 03:47 AM 10/24/2008, Downs, Clive asked, off-list:
>Thank you very much for your syntax for calculating distance between >two points defined by latitude and longitude (posted on SPSS-X list Aug 12). > >I have tried this out, and there were a couple of amendments I made ... This'll teach me to post complicated code without thorough testing. >One was to insert a COMPUTE keyword before the computation of #AngleCvt Sigh. An elementary syntax error. >There still seems to be a problem: I couldn't seem to find an SPSS >function ARCOS. Reasonably, since there isn't one. There is, however, ARSIN. ArcCosine can be computed using the identity ArcCos(x) = PI/2-ArcSin(X). or ArcCos(x) = ArcTan(1)-ArcSin(X). The following is tested. The 'given' values which are compared with the calculation are, . Providence to Chicago distance, from Jon Peck's posting "Re: Function for arc cosine", Thu, 7 Jun 2007 09:53:06 -0500 . Others, arbitrary test points with distance calculated at site http://www.movable-type.co.uk/scripts/latlong.html. I'm not sure why the tiny differences remain, but the code seems to be essentially reliable. It could be useful to wrap it in a macro; or in two macros, one for the initializations and one for the computation. Earth-radius values are from http://en.wikipedia.org/wiki/Earth_radius; see also http://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html. List |-----------------------------|---------------------------| |Output Created |28-OCT-2008 20:10:58 | |-----------------------------|---------------------------| [TestData] City1 lat1 lon1 City2 lat2 lon2 Given Pvd 41.90 87.65 Chi 41.73 71.43 836.27 A1 42.00 80.00 A2 39.00 70.00 564.33 B1 44.00 70.00 B2 49.00 85.00 791.00 Number of cases read: 3 Number of cases listed: 3 DATASET ACTIVATE TestData WINDOW=FRONT. DATASET COPY Production WINDOW=FRONT. DATASET ACTIVATE Production WINDOW=FRONT. * ............... Initialize constants ................. . * The following code >REQUIRES< that angles in the base system . * (SPSS) be in radians, so that the trigonometric distance is . * in radians, and can be multiplied directly by the Earth's . * radius. . DO IF $CASENUM EQ 1. * These initializations >MUST< be performed: . * #EarthRad is the Earth's radius in whatever units you please; . * the calculated distance will be in those units: . * 6,372.7976 km, . * 3,959.873 statute miles, . * 3,441.035 nautical miles. . . COMPUTE #EarthRad = 3959.873 /* statute miles */. * #AngleCvt is the number of your angle units (degrees, here) . * in one of SPSS's angle units (radians). It uses that . * ARCTAN(1)is PI/4 radians or (in any angle measure) 1/8 circle . . . COMPUTE #AngleCvt = 360 /* Number of input units in a full circle */ /(8*ARTAN(1)). END IF. * ............... Compute distance ................ . * Compute distance between points with coordinates . * (lat1,lon1) and (lat2,lon2) . compute distance = #EarthRad* (2*artan(1)-arsin( sin(lat1/#AngleCvt) /* (sin(lat1) */ *sin(lat2/#AngleCvt) /* .sin(lat2) */ + cos(lat1/#AngleCvt) /* +cos(lat1) */ *cos(lat2/#AngleCvt) /* .cos(lat2) */ *cos(lon2/#AngleCvt /* .cos(long2 */ -lon1/#AngleCvt) /* -long1))*/ )). FORMAT Distance (F7.2). COMPUTE DeltaPct = 100*(Distance/Given-1). FORMATS DeltaPct (PCT7.2). LIST. List |-----------------------------|---------------------------| |Output Created |28-OCT-2008 20:11:32 | |-----------------------------|---------------------------| [Production] City1 lat1 lon1 City2 lat2 lon2 Given distance DeltaPct Pvd 41.90 87.65 Chi 41.73 71.43 836.27 834.33 -.23% A1 42.00 80.00 A2 39.00 70.00 564.33 564.52 .03% B1 44.00 70.00 B2 49.00 85.00 791.00 791.03 .00% Number of cases read: 3 Number of cases listed: 3 ================================= APPENDIX: Test data, and code (WRR: Code also in separate file) ================================= * ................................................................. . * ................. Test data ..................... . * Test data (one case, only) from Jon Peck's posting . * Date: Thu, 7 Jun 2007 09:53:06 -0500 . * From: "Peck, Jon" <[hidden email]> . * Subject: Re: Function for arc cosine . * To: [hidden email] . * . * "In the extendedTransforms module on SPSS Developer Central . * (www.spss.com/devcentral), there are two functions that . * implement distance calculations on Earth latitude and longitude . * coordinates." . * . * Here is a simple usage example for just a single distance pair." . * . * fromloc = (41.90, 87.65) . * toloc = (41.73, 71.43) . * . * "Using the coordinates above (Chicago, IL to Providence, RI), . * the ellipsoidal and spherical results are . * 836.2705 and 834.0903 miles." . * Examples A and B calculated using . * http://www.movable-type.co.uk/scripts/latlong.html . DATA LIST LIST /City1 lat1 lon1 City2 lat2 lon2 Given (A3, 2F6.2, A3, 2F6.2, F7.2) . BEGIN DAT Pvd 41.90, 87.65 Chi 41.73, 71.43 836.2705 A1 42 80 A2 39 70 564.33 908.2 km B1 44 70 B2 49 85 791.00 1273 km END DATA. DATASET NAME TestData. LIST. * ............................................................. . * ................. Code ..................... . DATASET ACTIVATE TestData WINDOW=FRONT. DATASET COPY Production WINDOW=FRONT. DATASET ACTIVATE Production WINDOW=FRONT. * ............... Initialize constants ................. . * The following code >REQUIRES< that angles in the base system . * (SPSS) be in radians, so that the trigonometric distance is . * in radians, and can be multiplied directly by the Earth's . * radius. . DO IF $CASENUM EQ 1. * These initializations >MUST< be performed: . * #EarthRad is the Earth's radius in whatever units you please; . * the calculated distance will be in those units: . * 6,372.7976 km, . * 3,959.873 statute miles, . * 3,441.035 nautical miles. . . COMPUTE #EarthRad = 3959.873 /* statute miles */. * #AngleCvt is the number of your angle units (degrees, here) . * in one of SPSS's angle units (radians). It uses that . * ARCTAN(1)is PI/4 radians or (in any angle measure) 1/8 circle . . . COMPUTE #AngleCvt = 360 /* Number of input units in a full circle */ /(8*ARTAN(1)). END IF. * ............... Compute distance ................ . * Compute distance between points with coordinates . * (lat1,lon1) and (lat2,lon2) . compute distance = #EarthRad* (2*artan(1)-arsin( sin(lat1/#AngleCvt) /* (sin(lat1) */ *sin(lat2/#AngleCvt) /* .sin(lat2) */ + cos(lat1/#AngleCvt) /* +cos(lat1) */ *cos(lat2/#AngleCvt) /* .cos(lat2) */ *cos(lon2/#AngleCvt /* .cos(long2 */ -lon1/#AngleCvt) /* -long1))*/ )). FORMAT Distance (F7.2). COMPUTE DeltaPct = 100*(Distance/Given-1). FORMATS DeltaPct (PCT7.2). LIST. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by paul wilson-7
Hi Richard,
Thank you very much for amending this syntax. This code will be very useful to me. Also, I will try to improve my knowledge of trigonometry! Regards, Clive. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
| Free forum by Nabble | Edit this page |
