Taming the OGR

This evening I made a prototype of a smoother, simpler interface to the industrial-strength vector data functions in libgdal. The new entities in Refinery are workspaces, collections, and features. Workspace.collections is a dict of Collection objects, which correspond roughly to OGR layers. Collection.features is an iterator over GeoJSON-style objects:

>>> workspace = refinery.workspace('/var/gis/data/world')
>>> workspace.collections.keys()
['world_borders']
>>> borders = workspace.collections['world_borders']
>>> for feature in borders.features:
...     print feature.id, feature.properties['CNTRY_NAME']
world_borders/0 Aruba
(etc)
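
For readers who want to see the shape of that interface without installing anything, here is a minimal in-memory sketch. It is not Refinery's implementation (which reads from libgdal); the class bodies, the constructor arguments, and the single sample record are assumptions made for illustration, and only the attribute names (collections, features, id, properties) come from the session above.

```python
# Hypothetical stand-ins for illustration only; Refinery reads from libgdal.
class Feature(object):
    """A GeoJSON-style record: an id, a properties dict, and a geometry."""

    def __init__(self, id, properties, geometry=None):
        self.id = id
        self.properties = properties
        self.geometry = geometry


class Collection(object):
    """Roughly one OGR layer: a named, iterable set of features."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # assumed input: a list of property dicts

    @property
    def features(self):
        # Yield features lazily, with ids in the "name/index" form shown above.
        for index, properties in enumerate(self._records):
            yield Feature("%s/%d" % (self.name, index), properties)


class Workspace(object):
    """A data source whose collections are exposed as a plain dict."""

    def __init__(self, collections):
        self.collections = dict((c.name, c) for c in collections)


# One sample record, mirroring the first line of output in the session above.
borders = Collection("world_borders", [{"CNTRY_NAME": "Aruba"}])
workspace = Workspace([borders])
first = next(iter(workspace.collections["world_borders"].features))
```

Keeping collections a plain dict and features a generator means the usual Python idioms (keys(), for loops) work with no special API to learn.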

Use of the refinery package reduces the size of my canonical matplotlib script from 17 to 12 lines:

import pylab
from numpy import asarray
import refinery
from shapely.wkb import loads

workspace = refinery.workspace('/var/gis/data/world')
borders = workspace.collections['world_borders']

fig = pylab.figure(1, figsize=(4,2), dpi=300)

for feature in borders.features:
    shape = loads(feature.geometry)
    a = asarray(shape.exterior)
    pylab.plot(a[:,0], a[:,1])

pylab.show()

That's starting to feel civilized.

This started off as recreational programming, but I think it has some promise. Get the code [Refinery-0.0.tar.gz] if you're curious and make a trial workspace from your own data. Next: a look at the GeoDjango GDAL wrappers to see what Justin and I are doing differently.

Comments

Re: Taming the OGR

Author: Matt Perry

Very cool. Some nice syntactic sugar on top of the unpythonic (but useful) OGR API. Out of curiosity, why "collections" rather than the more standard terminology of "layers"?

Re: Taming the OGR

Author: Sean

Thanks, Matt. My other favorite feature is no dependence on SWIG. This uses the Python ctypes module, which is part of the standard library as of Python 2.5. The term 'collection' is inspired by WFS, AtomPub, and GeoJSON. Is 'layer' really standard outside the GDAL community?
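
For anyone who hasn't used ctypes, the general pattern looks roughly like the sketch below. It calls strlen from the C library rather than anything in libgdal, so it runs without GDAL installed; the library lookup is an assumption that holds on typical Linux and macOS systems, and none of this is taken from the Refinery source.

```python
import ctypes
import ctypes.util

# Locate and load the C library. If the lookup returns None, CDLL(None)
# falls back to the symbols already loaded into the process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments and results correctly:
# size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Call the C function directly: no generated wrapper code, no compile step.
length = libc.strlen(b"world_borders")
```

The same three steps (load the shared library, declare argument and return types, call) apply to any C API; a SWIG binding gets to the same place through generated and compiled wrapper code.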

Re: Taming the OGR

Author: Matt Perry

Ah! I assumed it was a thin layer on top of the OGR SWIG bindings. ctypes looks like the future of FFI and it's nice to see it being put to such good use! Layer is pretty much the universal word across all GIS camps for a group of vector features. This is the first I've ever heard of "collections" in reference to spatial data (apparently I don't delve into the web-based world too often these days ;-).

Re: Taming the OGR

Author: Paul Ramsey

I think "Collection" is a better term for a set of features. "FeatureCollection" is somewhat more explicit. A "Layer" is a styled "FeatureCollection": it's the thing you see in your map, not the data itself, and one "FeatureCollection" can be the basis for multiple "Layers".

Re: Taming the OGR

Author: Matt Perry

I like that. A workspace contains collections of features. Each collection can be paired with one or more styles to create layers. The only reason I'm harping on this is that anyone with a traditional GIS background will be flummoxed by the "collection" terminology at first. The last thing we need in the geospatial world is more confusion over vocabulary ... I'm still dealing with the whole "coverage" debacle!