Zip Code Proximity Search (math problem relating to radius of earth)

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
Hi Guys,

I'm trying to integrate a zip code proximity search function into my website, where as users can enter their zip code and a declare an acceptable distance. ie 50 miles. I know this might seem like a programming issue, and it is, but this also involves some geographical math so I thought non programmers might have some insight as well. Mods please move to programming forum if I'm out of bounds here.

I downloaded the $5 team redline zip code database that is recommended on countless sites and that works fine. I was able to import the CSV text file into SQL Server without a problem and setup the datasource. I can do queries on the datasource now and it works fine.

Each zip code record includes postal code, city, state, latitude and longitude. I have a basic test page for this function, which includes a form field for the zip code and a drop down menu for the mileage radius.

I'm using code found from this tutorial.

http://tutorial153.easycfm.com

Here is the code that is in question

<cfquery datasource="zip_codes" name="getlocs">
SELECT zipcode, latitude, longitude, state, city,
ROUND((ACOS((SIN(#passedzip.latitude#/57.2958) * SIN(latitude/57.2958)) +
(COS(#passedzip.latitude#/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - #passedzip.longitude#/57.2958))))
* 3963, 3) AS distance
FROM zipcodes
WHERE (latitude >= #passedzip.latitude# - (#FORM.passedradius#/111))
And (latitude <= #passedzip.latitude# + (#FORM.passedradius#/111))
AND (longitude >= #passedzip.longitude# - (#FORM.passedradius#/111))
AND (longitude <= #passedzip.longitude# + (#FORM.passedradius#/111))
ORDER BY zipcode
</cfquery>

What is happening is that if I select a mileage that is below 111, I only get 1 or 2 results back. Even if I put in the central St. Louis zip code, 63101 and do a search for 50 mile radius, I only get 2 zip codes back. There are many more zip codes than that. If I put the search to 110 mile radius, I still only get 1 or 2 results.

However, if I search for 111 or more, it seems that the search becomes accurate and returns 100's of results.

Now, 111 seems to be the magic number. There is approx 111 miles in a longitude/latitude degree, which is why the mileage is divided by 111 in the SQL statement above.

(#FORM.passedradius#/111)

My guess is that if the radius value is less than 111 and is divided by 111, the value is less than 1, and that seems to break the function. Any ideas why? Perhaps once it gets to a sub 1 value like .80 it doesn't calculate the radius correctly?
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
From your description it sounds like a floating point vs integer issue.
 

Train

Lifer
Jun 22, 2000
13,579
75
91
www.bing.com
Your where clause totally loses me. You calculate the distance in miles, why noy just put a where clause on that?

WHere (Your Distance Calculation) < 50 (or whatever radius you want)
 

Legendary

Diamond Member
Jan 22, 2002
7,019
1
0
From your description it sounds like a floating point vs integer issue.

my first thought as well

Change
(#FORM.passedradius#/111))

to
((#FORM.passedradius#) * 1.0/111))

and the same for the rest of those /111 clauses


Where 0 < form.passedradium < 111, the first expression above will eval to 0, the second will eval to 0 < x < 1
 

Legendary

Diamond Member
Jan 22, 2002
7,019
1
0
Na, SQL server doesnt allow loose typing, if int/float math was the issue, he woud have gotten an error

Wouldn't throw an error, it would proceed normally. SELECT CAST(3 AS INT) / 1.5 = 2 in SQL server

But that's not the issue here...in this scenario the input variable #form.passedradius is an int and the 111 is treated as an int, producing integer division
 

Train

Lifer
Jun 22, 2000
13,579
75
91
www.bing.com
Now, 111 seems to be the magic number. There is approx 111 miles in a longitude/latitude degree, which is why the mileage is divided by 111 in the SQL statement above.

uhh, WTF did you pull that number from? Miles per degree is not a constant. At the poles, the distance between longitude lines is ZERO, at the equator its 69 miles.
 

Legendary

Diamond Member
Jan 22, 2002
7,019
1
0
Why not write it like this?

SELECT zipcode, latitude, longitude, state, city,
ROUND((ACOS((SIN(#passedzip.latitude#/57.2958) * SIN(latitude/57.2958)) +
(COS(#passedzip.latitude#/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - #passedzip.longitude#/57.2958))))
* 3963, 3) AS distance
FROM zipcodes
WHERE ROUND((ACOS((SIN(#passedzip.latitude#/57.2958) * SIN(latitude/57.2958)) +
(COS(#passedzip.latitude#/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - #passedzip.longitude#/57.2958))))
* 3963, 3) <= #form.passedradius
ORDER BY 6

?
 

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
uhh, WTF did you pull that number from? Miles per degree is not a constant. At the poles, the distance between longitude lines is ZERO, at the equator its 69 miles.

Then I misread where I got the 111 from. I didn't write this code. It was from a tutorial and it seems to work fine for others..
 

Train

Lifer
Jun 22, 2000
13,579
75
91
www.bing.com
I've got a WORKING sql query to do this. on a app I wrote like 8 years ago. I cant get to it from here though, If no one posts it I will log into it when i get home.

I'm pretty sure the only constant you need is the earths circumfrence at the equator in whichever unit you want to use (Miles, Kilometers)
 

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
my first thought as well

Change
(#FORM.passedradius#/111))

to
((#FORM.passedradius#) * 1.0/111))

and the same for the rest of those /111 clauses


Where 0 < form.passedradium < 111, the first expression above will eval to 0, the second will eval to 0 < x < 1

tried that.. same results
 

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
Why not write it like this?

SELECT zipcode, latitude, longitude, state, city,
ROUND((ACOS((SIN(#passedzip.latitude#/57.2958) * SIN(latitude/57.2958)) +
(COS(#passedzip.latitude#/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - #passedzip.longitude#/57.2958))))
* 3963, 3) AS distance
FROM zipcodes
WHERE ROUND((ACOS((SIN(#passedzip.latitude#/57.2958) * SIN(latitude/57.2958)) +
(COS(#passedzip.latitude#/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - #passedzip.longitude#/57.2958))))
* 3963, 3) <= #form.passedradius
ORDER BY 6

?

Used that to build

SELECT zipcode, latitude, longitude, state, city,
ROUND((ACOS((SIN(38.63/57.2958) * SIN(latitude/57.2958)) +
(COS(38.63/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - -90.19/57.2958))))
* 3963, 3) AS distance
FROM zipcodes
WHERE ROUND((ACOS((SIN(38.63/57.2958) * SIN(latitude/57.2958)) +
(COS(38.63/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - -90.19/57.2958))))
* 3963, 3) <= 10
ORDER BY ZipCode

and I get an error

Server: Msg 8115, Level 16, State 8, Line 1
Arithmetic overflow error converting numeric to data type numeric.
 

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
Ok, here are some results from Query Analyzer on MS-SQL. My first tests were using the application server (openbd) via my cfml code. I guess it's not a BD issue. It's definately how SQL Server is interpreting the sql.

I'm using the central St. Louis zip code, 63101, one that you can walk several blocks and be in a different zip. So there are different zips right around each other. So even a 5 mile radius should yield a few results.

using the zip 63101, and selecting 5 miles I received the following results (keep in mind, there is a 50 zip code difference between the only two records. 63102 is the closest zip, or should be. The lat and long are the same too. weird. Oh, and distance is 0

Zip-Lat-Long-state-city-distance
63101 38.63 -90.19 MISSOURI SAINT LOUIS 0.0
63150 38.63 -90.19 MISSOURI SAINT LOUIS 0.0

using the zip 63101, and selecting 110 miles I received the following results

63101 38.63 -90.19 MISSOURI SAINT LOUIS 0.0
63150 38.63 -90.19 MISSOURI SAINT LOUIS 0.0

using the zip 63101, and selecting 111 miles (just 1 extra mile vs the last test) I received 415 rows of results looking like

62001 38.88 -89.74 ILLINOIS ALHAMBRA 29.802
62002 38.90 -90.15 ILLINOIS ALTON 18.798999999999999
62006 39.10 -90.65 ILLINOIS BATCHTOWN 40.872
62009 39.09 -89.79 ILLINOIS BENLD 38.424999999999997
 

Train

Lifer
Jun 22, 2000
13,579
75
91
www.bing.com
Heres my sproc, cut and pasted, havent used it in years, so no warranties, expressed, or implied. Was originally written for SQL Server 2000

ALTER PROCEDURE [dbo].[spGetZipsByRadius]
@ZIP int,
@Radius int

AS
Declare @Latitude float
Declare @Longitude float
Select @Latitude = (Select Top 1 Latitude From Zips Order By abs(@Zip - Zip) asc)
Select @Longitude = (Select Top 1 Longitude From Zips Order By abs(@Zip - Zip) asc)
Select zip, state, name, (3985*ACOS(COS(RADIANS(90-@Latitude))*COS(RADIANS(90-Latitude))+
SIN(RADIANS(90-@Latitude))*SIN(RADIANS(90-Latitude))*COS(RADIANS(@Longitude-Longitude)))) as Distance
from zips where (3985*ACOS(COS(RADIANS(90-@Latitude))*COS(RADIANS(90-Latitude))+
SIN(RADIANS(90-@Latitude))*SIN(RADIANS(90-Latitude))*COS(RADIANS(@Longitude-Longitude)))) < @Radius
order by Distance asc

I think my constant for earth circumfrence might be wrong. But it depends who you ask because the earth isn't really a sphere. Also, the above code handles zip codes that dont exist, since this will happen a lot, pretty much any free zip database you will get it going to be out of date.
 
Last edited:

IronWing

No Lifer
Jul 20, 2001
71,270
30,779
136
Have you tried having the script print back the values of the various variables to see what they are?
 

Legendary

Diamond Member
Jan 22, 2002
7,019
1
0
Used that to build

SELECT zipcode, latitude, longitude, state, city,
ROUND((ACOS((SIN(38.63/57.2958) * SIN(latitude/57.2958)) +
(COS(38.63/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - -90.19/57.2958))))
* 3963, 3) AS distance
FROM zipcodes
WHERE ROUND((ACOS((SIN(38.63/57.2958) * SIN(latitude/57.2958)) +
(COS(38.63/57.2958) * COS(latitude/57.2958) *
COS(longitude/57.2958 - -90.19/57.2958))))
* 3963, 3) <= 10
ORDER BY ZipCode

and I get an error

Server: Msg 8115, Level 16, State 8, Line 1
Arithmetic overflow error converting numeric to data type numeric.


This exact code produces 68 results in my zipcode DB which is on SQL server 2008 so not sure what's going on

can you provide the schema of your zipcode table?
 

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
This exact code produces 68 results in my zipcode DB which is on SQL server 2008 so not sure what's going on

can you provide the schema of your zipcode table?

I'm on SQL 2000. I know I need to upgrade but it's all I got at the moment.

What happens if you change the range to just 2 miles?
 

Legendary

Diamond Member
Jan 22, 2002
7,019
1
0
I'm on SQL 2000. I know I need to upgrade but it's all I got at the moment.

What happens if you change the range to just 2 miles?

25 results

it's unlikely it's a version issue - more likely your table columns are being implicitly cast in a bad way
 

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
25 results

it's unlikely it's a version issue - more likely your table columns are being implicitly cast in a bad way

Can you send me an outline of how your zipcode table is structured? bit/varchar, etc.

Or, how should it be

ZipCode
Latitude
Longitude
City
State
ABBR
 

Legendary

Diamond Member
Jan 22, 2002
7,019
1
0
Can you send me an outline of how your zipcode table is structured? bit/varchar, etc.

Or, how should it be

ZipCode int or varchar (doesn't matter, no math operations on this column)
Latitude float
Longitude float
City varchar
State varchar
ABBR char

bolded