Friday 28 August 2009

Adventures with SpatiaLite

A while ago I came across SpatiaLite. It's particularly interesting to me as we (the company I work for) need to deploy vast amounts of mapping data on machines where it is not always possible to connect to a dedicated mapping server. For us, this means installing PostGIS onto every client and restoring 20+GB of spatial data onto each one.

I wanted to use SpatiaLite instead of PostGIS. That way, deploying the mapping data onto the client would become as simple as copying across a flat file. It sounded like the perfect tool for the job. However, as with all things, nothing is ever easy. Here are a couple of the problems I faced:

  1. We deal mainly with Mastermap data and use a proprietary database loader to upload it to PostGIS. This loader does not support SpatiaLite, so I had to figure out how I was going to get the data into the SpatiaLite format.


  2. We use the SharpMap library, which has extensions to support PostGIS, to render our mapping data. I would have to find or write a custom data provider if I wanted to use SpatiaLite instead.

Converting Ordnance Survey Mastermap data to SpatiaLite

First the easy (but less than satisfactory) way...

As we already had our data in PostGIS, I was able to use the latest version of FWTools to convert the existing database to SpatiaLite. This was surprisingly easy once I'd figured out all the arguments. It preserved our existing schema and was reasonably efficient. The command I used can be seen below:

ogr2ogr -f SQLite -dsco "SPATIALITE=yes" london.sqlite PG:"host=localhost user=postgres dbname=london password=postgres" mastermap

I had three main issues with this approach. Firstly, the latest version of FWTools (2.4.2) is not linked against libspatialite, meaning I was unable to create the spatial index from the command line and had to use the spatialite tools to do this as another manual step (a one-liner, shown below). Secondly, and more importantly, I had no way to apply change-only updates (COUs) to the SpatiaLite database. I would have to apply them to PostGIS and then re-convert the whole lot. Thirdly, it's just plain painful. Encoding/uploading the data to PostGIS can take a long time, and converting it to SpatiaLite as a secondary stage doubles the time involved. Wouldn't it be nice if you could go straight from the original Mastermap GML to the SpatiaLite database and handle COUs as well...
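For the record, that extra indexing step boils down to a single statement run through the spatialite command-line tool, something like this (assuming the table is called mastermap and its geometry column is called geometry):

spatialite london.sqlite "SELECT CreateSpatialIndex('mastermap', 'geometry');"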

A Better Approach...

I'd had my eye on an open source converter for a while. After several experiments I had managed to get it translating and uploading the Mastermap GML to a PostGIS database according to our schema (the tool uses a schema recommended by Ordnance Survey but, sadly, very different to the one our proprietary tool uses). Due to the foresight of the author, all this involved in the end was tweaking the xsd and xslt files (actually that's not quite true, I had to change the source a bit as well so that it created the PostGIS tables in a slightly different way...but that's by the by).

After playing around with the 'Ogr converter' option, I finally bit the bullet and decided to have a crack at extending the tool to output directly to SpatiaLite. A few hours later (and after much reading up) I had a working converter. It could handle bulk loads, COUs and spatial index creation. It was fast and, thanks to its use of xsd/xslt, very flexible. I must take my hat off to the author of this tool for making it so extensible. Great job.
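For anyone curious about what 'output directly to SpatiaLite' actually involves, stripped of the xsd/xslt machinery it comes down to fairly plain SQL. The sketch below is illustrative rather than the tool's actual schema (the table, columns and values are made up), but it shows the shape of a bulk load and of a COU, which is essentially a delete followed by a re-insert keyed on the feature's TOID:

-- illustrative schema: the real table/column names depend on the xsd/xslt setup
CREATE TABLE mastermap (PK_UID INTEGER PRIMARY KEY AUTOINCREMENT, toid TEXT, featurecode INTEGER);
SELECT AddGeometryColumn('mastermap', 'geometry', 27700, 'POLYGON', 2);

-- bulk load: one insert per feature, geometry supplied as WKT in British National Grid
INSERT INTO mastermap (toid, featurecode, geometry)
VALUES ('osgb1000000000000001', 10021, GeomFromText('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))', 27700));

-- change-only update: remove the departed version of a feature and insert the new one
DELETE FROM mastermap WHERE toid = 'osgb1000000000000001';
INSERT INTO mastermap (toid, featurecode, geometry)
VALUES ('osgb1000000000000001', 10021, GeomFromText('POLYGON((0 0, 0 12, 12 12, 12 0, 0 0))', 27700));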

Now, just how was I going to plug this into our rendering engine? Read on...

Rendering SpatiaLite data using SharpMap

Bill Dollins posted a SpatiaLite data provider on his blog a while back. Just the job. However, when I came to put it all together I was left twiddling my thumbs. It was taking an absolute age to render. The problem was that SpatiaLite does not automatically use the spatial index. Instead, you have to build a special query to make use of it...and the data provider I'd nabbed from Bill's site didn't do that.
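To give a feel for what that special query looks like: SpatiaLite names the index table after the indexed table and geometry column (so a mastermap table with a geometry column gets an idx_mastermap_geometry table), and you restrict the main query with a sub-select against it. The bounding box values below are just an example extent:

SELECT * FROM mastermap WHERE ROWID IN (
    SELECT pkid FROM idx_mastermap_geometry
    WHERE xmin < 530000 AND xmax > 529000 AND ymin < 181500 AND ymax > 180500);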

I ended up making a few changes to allow the data provider to be constructed with information about the spatial index table (yes, it's implemented as a table, not a true index) and then modified the spatial query to make use of it. The main change was to the GetBoxClause method, which can be seen below:

private string _SpatialIndex;

/// <summary>
/// Name of the spatial index table
/// </summary>
public string SpatialIndex
{
    get { return _SpatialIndex; }
    set { _SpatialIndex = value; }
}

/// <summary>
/// Builds the WHERE clause used to restrict features to the supplied bounding box.
/// If a spatial index table has been set, it is queried via a sub-select; otherwise
/// the original MBRIntersects test is used (no index).
/// </summary>
private string GetBoxClause(SharpMap.Geometries.BoundingBox bbox)
{
    if (!string.IsNullOrEmpty(SpatialIndex))
    {
        // Sub-select against the index table's xmin/xmax/ymin/ymax columns,
        // keeping only rows whose extents overlap the query box
        StringBuilder sql = new StringBuilder("ROWID IN ( ");
        sql.Append("SELECT pkid FROM ");
        sql.Append(SpatialIndex);
        sql.Append(" WHERE ");
        sql.AppendFormat(SharpMap.Map.numberFormat_EnUS,
            "xmin < {0} AND xmax > {1} AND ymin < {2} AND ymax > {3} )",
            bbox.Max.X, bbox.Min.X, bbox.Max.Y, bbox.Min.Y);

        return sql.ToString();
    }

    // Fall back to a plain bounding box intersection test against the geometry column
    string wkt = SharpMap.Converters.WellKnownText.GeometryToWKT.Write(LineFromBbox(bbox));
    return "MBRIntersects(GeomFromText('" + wkt + "')," + _GeometryColumn + ")=1";
}
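
Wiring this up is then just a case of telling the provider which spatial index table to use. The snippet below shows the general idea; the provider class name and constructor arguments are only illustrative and will depend on which version of the provider you start from:

// Illustrative usage only: the provider type and its constructor signature will
// vary depending on the version of the SpatiaLite provider you are working with.
SpatiaLiteProvider provider = new SpatiaLiteProvider(
    @"Data Source=C:\data\london.sqlite", "mastermap", "geometry", "PK_UID");
provider.SpatialIndex = "idx_mastermap_geometry";

// Standard SharpMap plumbing: put the provider behind a vector layer and render
SharpMap.Layers.VectorLayer layer = new SharpMap.Layers.VectorLayer("MasterMap");
layer.DataSource = provider;

SharpMap.Map map = new SharpMap.Map(new System.Drawing.Size(800, 600));
map.Layers.Add(layer);
map.ZoomToExtents();
System.Drawing.Image img = map.GetMap();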


After making the changes, everything performed blisteringly fast. Perhaps even slightly faster than it did with PostGIS. So there you have it. A zero-configuration GIS database for storing Mastermap that's as easy to deploy as copying a file. All thanks to SpatiaLite (and a bit of work here and there).

Thursday 27 August 2009

To blog or not to blog...

I've never really understood blogging. I mean, blogs are great for finding things out but actually writing one...now why would you want to do that?

Well, I guess maybe it's time to give something back. Or maybe it's just time to start recording some of the nerdy little gems that make life as a software engineer just that bit sweeter.

So here it is. A blog. Written by me. Who would have thought?

My intention is to start writing a series of posts dealing with Ordnance Survey mapping data and Open Source GIS (undoubtedly paying much attention to SharpMap along the way). Anyone who has to work with Ordnance Survey data may well understand the headache that can ensue from the vast amount of data processing and storage involved. It's often difficult to know where to start if you're not buying into an enterprise GIS solution. So, if you don't want to splash out on proprietary software for dealing with all that GML...well, you might just have come to the right place.