Fiona

2011-09-21T19:25:21Z in python, programming, data

A while ago I created a project named WorldMill to learn about writing Python extension modules with Cython and to experiment with designing a slicker OGR API. Interest in the project is rising again, and after some discussion I've persuaded users that we should change the name because the *Mill project namespace is getting a little crowded. The new project: Fiona. Fiona is OGR's neater (or nicer/nimble/no-nonsense) API – elegant on the outside, unstoppable OGR(e) power on the inside.

What I'd like out of Fiona: a clear alternative to the complex layers and cursors and fussy geometry objects of OGR and ArcPy; Python generators serving as sources and sinks of GeoJSON-like objects; and above all, no reference counting duty dumped on users, no need to explicitly "del" anything. I think an API like this would be productive and make new types of Python data processing programs possible. For example, one might use the enhanced generator protocol of PEP 342 to create pipelines of coroutines that receive and send GeoJSON-like objects, bringing into being something like a WSGI for Python spatial data processing. See https://gist.github.com/1232852 for the pipelinedemo module code wherein the pipeline components below are declared. The demo tasks simply increment the value of a feature property named "count" (adding the property if it doesn't already exist) and send the feature down the pipe. The demo writer appends received features to a list and serializes them to JSON in a file.

>>> from pipelinedemo import pipeline, task1, task2, writer
>>> features = [{'id': "1"}, {'id': "2"}, {'id': "3"}]
>>> pipeline(
...     features,
...     task1(
...         task2(
...              writer(open("pipeline-demo.json", "w")))
...         )
...     )
>>> print(open("pipeline-demo.json").read())
{
  "features": [
    {
      "id": "1",
      "properties": {
        "count": 2
      }
    },
    {
      "id": "2",
      "properties": {
        "count": 2
      }
    },
    {
      "id": "3",
      "properties": {
        "count": 2
      }
    }
  ]
}
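
For readers who don't want to click through to the gist, here is a minimal sketch of how components like these could be written with PEP 342 coroutines. It is not the pipelinedemo code itself: the coroutine decorator, the increment helper, and the single generic task are my assumptions for illustration, and it writes to an in-memory buffer instead of a file. Two chained tasks each bump "count" once, which is why every feature ends up with a count of 2.

```python
import io
import json
from functools import wraps


def coroutine(func):
    """Prime a generator-based coroutine so it is ready for send()."""
    @wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first yield
        return gen
    return start


def increment(feature, key="count"):
    """Hypothetical helper: copy the feature with properties[key] + 1."""
    props = dict(feature.get("properties", {}))
    props[key] = props.get(key, 0) + 1
    return dict(feature, properties=props)


@coroutine
def task(target):
    """Receive a feature, bump its counter, push it down the pipe."""
    try:
        while True:
            feature = yield
            target.send(increment(feature))
    except GeneratorExit:
        target.close()  # propagate shutdown to the next stage


@coroutine
def writer(fileobj):
    """Collect received features; serialize them when the pipe closes."""
    features = []
    try:
        while True:
            features.append((yield))
    except GeneratorExit:
        json.dump({"features": features}, fileobj, indent=2)


def pipeline(source, target):
    """Push every feature from the source into the head of the pipe."""
    for feature in source:
        target.send(feature)
    target.close()


# Usage: two increment stages feeding an in-memory writer.
buf = io.StringIO()
pipeline([{"id": "1"}, {"id": "2"}], task(task(writer(buf))))
result = json.loads(buf.getvalue())
```

Closing the head of the pipe is what flushes the writer here; a real sink would probably want a more explicit way to finish.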

Fiona already provides feature source generators that leverage the OGR format drivers. Work on the feature sinks at the other end of the processing pipeline is clearly the next step. Follow or fork; your ideas and pull requests are welcome.

By the way, A Curious Course on Coroutines and Concurrency has a very readable introduction to this "push" style of pipeline for Python along with excellent advice in general on using enhanced generators.


Some rights reserved by Sean Gillies.