Disclaimer: I don’t actually know anything about SQL performance. The techniques described below have been gleaned from other sources, seem to work for me, and it makes sense why they would. If you know more about SQL, please correct me. 🙂
—
Talking to Darren around Tech.Ed today, he expressed a need for a way to do proximity searches in SQL. These are queries along the lines of “give me all the records that talk about locations within 50km of X”. Now, in SQL 2008 this is baked into the product, but for SQL 2005 and below we need to do the maths ourselves.
The general query is relatively simple: calculate the distance-from-X for each row, then filter by the distance. Performing 6 trigonometric operations for every row, on every query is a pretty sure way to kill your average database though.
Instead, we add some persisted calculated columns to our table like so:
CREATE TABLE [dbo].[Locations]
(
[LocationID] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT [DF_Locations_LocationID] DEFAULT (newid()),
[Title] [varchar](100) NOT NULL,
[Latitude] [decimal](9, 6) NOT NULL,
[Longitude] [decimal](9, 6) NOT NULL,
[ProximityX] AS (cos(radians([Latitude]))*cos(radians([Longitude]))) PERSISTED,
[ProximityY] AS (cos(radians([Latitude]))*sin(radians([Longitude]))) PERSISTED,
[ProximityZ] AS (sin(radians([Latitude]))) PERSISTED,
CONSTRAINT [PK_Locations] PRIMARY KEY CLUSTERED
(
[LocationID] ASC
) WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
Basically, we take the reusable parts of our calculation and store them between queries. Storing these precalculated components means that our comparisons can be done using relatively simple maths and only one trig function per row.
Because they are persisted calculated columns, they are only calculated when the row is created or updated. SQL manages this for us.
Finally, here’s a stored procedure to query that table:
CREATE PROCEDURE [dbo].[FindLocationsNearby]
@CenterLat float,
@CenterLon float,
@SearchRadius float
AS
BEGIN
–Store the radius of the earth (in km so that we can search by km)
declare @EarthRadius float
set @EarthRadius = 6378.14–Calculate the X, Y and Z axis values for the center point
declare @CenterX float
declare @CenterY float
declare @CenterZ float
set @CenterX = cos(radians(@CenterLat)) * cos(radians(@CenterLon))
set @CenterY = cos(radians(@CenterLat)) * sin(radians(@CenterLon))
set @CenterZ = sin(radians(@CenterLat))–Perform the database search
SELECT Title,
Distance = @EarthRadius * acos(ProximityX * @CenterX + ProximityY * @CenterY + ProximityZ * @CenterZ)
FROM Locations
WHERE @EarthRadius * acos(ProximityX * @CenterX + ProximityY * @CenterY + ProximityZ * @CenterZ) <= @SearchRadius
ORDER BY Distance ASC
END
GO
This technique was extracted from an Access/SQL 2000 article I found on MSDN.
This is nifty but I much prefer a table valued function to a stored procedure for this sort of thing. Otherwise very useful, thanks!