Working with MySQL Spatial Data

Monday, March 07, 2011 05:53 PM

When designing a schema to query data spatially (latitude / longitude), I started down the path of researching the MySQL spatial extensions. There is only one word that comes to mind to describe the time I spent doing this: BALLS. I even thought as far as ... maybe I could just store the spatial data in float columns and then go through the cumbersome task of re-implementing all of the geometry functions.

There's issues with this approach, depending on how you implement said functions you could prematurely round which would introduce inaccurate results. The most common pitfall in wanting to use these extensions is knowing which version of MySQL you have and what level of OpenGIS is supported.

Luckily I was running version 5.0.67.

When it comes to choosing an engine to go with there are some caveats:

  1. MyISAM supports spatial indexes.
  2. InnoDB does not support spatial indexes.

Schema (MyISAM)

CREATE TABLE `location_data` (
    `id` bigint(22) unsigned NOT NULL auto_increment,
    `location` POINT NOT NULL,
    `name` varchar(255) NOT NULL default '',
    `city` varchar(255) NOT NULL default '',
    `region_name` varchar(255) NOT NULL default '',
    `country` varchar(255) NOT NULL default '',
    PRIMARY KEY (`id`)
) ENGINE=MyISAM;

CREATE SPATIAL INDEX location_data_index ON location_data (location);

Schema (InnoDB)

CREATE TABLE `location_data` (
    `id` bigint(22) unsigned NOT NULL auto_increment,
    `location` POINT NOT NULL,
    `name` varchar(255) NOT NULL default '',
    `city` varchar(255) NOT NULL default '',
    `region_name` varchar(255) NOT NULL default '',
    `country` varchar(255) NOT NULL default '',
    PRIMARY KEY (`id`)
) ENGINE=InnoDB;

Inserting data

INSERT INTO
location_data 
(location, name, city, region_name, country) 
VALUES 
(GeomFromText( 'POINT(37.774929 -122.419415)' ), 'Philz', 'San Francisco', 'California', 'United States of America');

Latitude / Longitude query out of the Point column

SELECT 
X(location) as latitude, 
Y(location) as longitude
FROM 
location_data;

Records within a radius query

SELECT 
name, 
city, 
region_name, 
country, 
ROUND(GLength(LineStringFromWKB(LineString(AsBinary(location), AsBinary(GeomFromText('POINT(37.774929 -122.419415)')))))) AS distance 
FROM 
location_data 
HAVING distance < 3 ORDER BY distance ASC;

The big win here is relying upon MySQL's implementation of the OpenGIS functions. Unfortunately this was not tested with a large dataset (would love to see an implementation with hundreds of millions of rows). My live implementation uses the InnoDB engine and so far none of the radius based SELECT queries have taken greater than 500 ms.

w00t.