GeoJSON and the geo interface for Python

It's pretty clear that GeoJSON is good for the web. I think it's also good for Python programmers in two different ways.

Fiona employs Python mappings modeled on GeoJSON to represent GIS features and their geometries. By reusing these existing concepts – the GeoJSON data model and the Python mapping type – Fiona keeps its mental footprint small. There's less to memorize, less need to check documentation, fewer gotchas. There's also less code and less room for bugs. For example, given a Fiona feature, there is no need to look up the classes and methods for accessing the feature's non-geometry fields: it's just feature['properties'].items().

>>> import fiona
>>> with fiona.open(
...         "/Users/seang/data/ne_50m_admin_0_countries/"
...         "ne_50m_admin_0_countries.shp") as c:
...     first = next(c)
...
>>> items = sorted(first['properties'].items())
>>> import pprint
>>> pprint.pprint(items)
[('abbrev', 'Aruba'),
 ('abbrev_len', 5.0),
 ('adm0_a3', 'ABW'),
 ('adm0_a3_is', 'ABW'),
 ('adm0_a3_un', -99.0),
 ('adm0_a3_us', 'ABW'),
 ('adm0_a3_wb', -99.0),
 ('adm0_dif', 1.0),
 ('admin', 'Aruba'),
 ('brk_a3', 'ABW'),
 ('brk_diff', 0.0),
 ('brk_group', None),
 ('brk_name', 'Aruba'),
 ('continent', 'North America'),
 ('economy', '6. Developing region'),
 ('featurecla', 'Admin-0 country'),
 ('fips_10', None),
 ('formal_en', 'Aruba'),
 ('formal_fr', None),
 ('gdp_md_est', 2258.0),
 ('gdp_year', -99.0),
 ('geou_dif', 0.0),
 ('geounit', 'Aruba'),
 ('gu_a3', 'ABW'),
 ('homepart', -99.0),
 ('income_grp', '2. High income: nonOECD'),
 ('iso_a2', 'AW'),
 ('iso_a3', 'ABW'),
 ('iso_n3', '533'),
 ('labelrank', 5.0),
 ('lastcensus', 2010.0),
 ('level', 2.0),
 ('long_len', 5.0),
 ('mapcolor13', 9.0),
 ('mapcolor7', 4.0),
 ('mapcolor8', 2.0),
 ('mapcolor9', 2.0),
 ('name', 'Aruba'),
 ('name_alt', None),
 ('name_len', 5.0),
 ('name_long', 'Aruba'),
 ('name_sort', 'Aruba'),
 ('note_adm0', 'Neth.'),
 ('note_brk', None),
 ('pop_est', 103065.0),
 ('pop_year', -99.0),
 ('postal', 'AW'),
 ('region_un', 'Americas'),
 ('region_wb', 'Latin America & Caribbean'),
 ('scalerank', 3),
 ('sov_a3', 'NL1'),
 ('sovereignt', 'Netherlands'),
 ('su_a3', 'ABW'),
 ('su_dif', 0.0),
 ('subregion', 'Caribbean'),
 ('subunit', 'Aruba'),
 ('tiny', 4.0),
 ('type', 'Country'),
 ('un_a3', '533'),
 ('wb_a2', 'AW'),
 ('wb_a3', 'ABW'),
 ('wikipedia', -99.0),
 ('woe_id', -99.0)]

If we parse the GeoJSON version of that shapefile using Python's own standard json library, we get the same data.

>>> import requests
>>> r = requests.get(
...     "https://raw.github.com/sgillies/sgillies.github.com/master/"
...     "ne-sample/ne_50m_admin_0_countries.json")
>>>
>>> r.ok
True
>>> r.json()['type'] == 'FeatureCollection'
True
>>> first = r.json()['features'][0]
>>> sorted(first['properties'].items()) == items
True

The same data, accessed in the very same way. A familiar uniform interface for both Shapefile and JSON features.

For a while I've been suggesting that GeoJSON, or rather Python mappings modeled on GeoJSON, might be as useful a protocol for passing geospatial data between Python packages as it is for passing data around the web. A few Python packages have implemented the Python Geo Interface: ArcPy, descartes, Fiona, PySAL, Shapely. Descartes can, for example, help you draw ArcPy, Fiona, PySAL, or Shapely objects; or any other object that satisfies the protocol. It's dirt-simple Python GIS interoperability based on mappings with agreed-upon items.
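
To show how little the protocol asks of a package, here is a minimal sketch. The Waypoint class and the as_feature_collection function are hypothetical names invented for this example and are not part of any package listed above. The only assumption is that an object exposes a __geo_interface__ attribute that returns a GeoJSON-like mapping.

```python
import json


class Waypoint(object):
    """Hypothetical class, invented for this example."""

    def __init__(self, name, lon, lat):
        self.name = name
        self.lon = lon
        self.lat = lat

    @property
    def __geo_interface__(self):
        # A GeoJSON-like mapping built from plain dicts, strings, and numbers.
        return {
            'type': 'Feature',
            'properties': {'name': self.name},
            'geometry': {
                'type': 'Point',
                'coordinates': (self.lon, self.lat)}}


def as_feature_collection(objects):
    """Collect any objects that provide the protocol into one mapping."""
    return {
        'type': 'FeatureCollection',
        'features': [obj.__geo_interface__ for obj in objects]}


# Illustrative coordinates only.
collection = as_feature_collection([Waypoint('example', -105.09, 40.57)])
text = json.dumps(collection)
```

A consumer such as as_feature_collection never needs to know which package produced the objects, only that they provide the attribute.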

Yesterday, I saw Nathan Woodrow's post about monkey patching the protocol into QGIS classes. I'm happy to see this because adoption of the protocol by QGIS will be a big deal and also because it demonstrates a manner in which Esri Python users might fix ArcPy's slightly broken implementation of the protocol. In the same discussion, Calvin Metcalf confirmed for me that ArcPy feature objects also satisfy the protocol, which means that writing GeoJSON from ArcGIS using Python could, in theory, be very simple.

import json
json.dumps(
    dict(
        type='FeatureCollection',
        features=[row.__geo_interface__ for row in cursor]))

Esri's support for GeoJSON is (potentially) better than most people realize.

Comments

Re: GeoJSON and the geo interface for Python

Author: Nathan

I found an easier and better way to attach the __geo_interface__ to QgsFeature.

QgsFeature.__geo_interface__ = property(mapping_feature)

Much nicer than the MethodType stuff, and it fits in better with what you have with Fiona.

Re: GeoJSON and the geo interface for Python

Author: Martin

The geo_interface was also implemented in Pyshp by Christian Lederman (https://github.com/cleder/pyshp)

import shapefile

def records(filename):
    # generator
    reader = shapefile.Reader(filename)
    fields = reader.fields[1:]
    field_names = [field[0] for field in fields]
    for sr in reader.shapeRecords():
        geom = sr.shape.__geo_interface__
        atr = dict(zip(field_names, sr.record))
        yield dict(geometry=geom, properties=atr)

>>> a = records('point.shp')
>>> a.next()
{'geometry': {'type': 'Point', 'coordinates': (161821.09375, 79076.0703125)}, 'properties': {'DIP_DIR': 120, 'STRATI_TYP': 1, 'DIP': 30}}
>>> a.next()['geometry']['coordinates']
(161485.09375, 79272.34375)
>>> a.next()['properties']['DIP']
55

and you can use the same method to implement it in QGIS 2

def records(layer):
    fields = layer.pendingFields()
    field_names = [field.name() for field in fields]
    for elem in layer.getFeatures():
        geom = elem.geometry()
        atr = dict(zip(field_names, elem.attributes()))
        yield dict(geometry=geom.exportToGeoJSON(), properties=atr)

>>> layer = qgis.utils.iface.activeLayer()
>>> c = records(layer)
>>> c.next()
{'geometry': u'{ "type": "LineString", "coordinates": [ [164917.66073716, 115230.16694565], [170806.15565476, 116401.17445767] ] }', 'properties': {u'id': 1, u'test': u'ligne'}}

My problem is to implement it in writing as in Fiona

Fiona 0.16

Fiona now reads and writes multi-layer data. The user advocating for this feature has Esri file geodatabases in mind. I'm developing and testing with directories of shapefiles, but since I'm using OGR's datasource abstraction it (modulo the usual driver quirks) "should just work" with FileGDBs.

Here's how it works. By the way, I'm using Python 3.3.1 and a virtualenv in these examples. Fiona works equally well with Python 3.3 and 2.6-2.7. I'm not testing Python 3.0-3.2 right now, so your mileage there may vary.

(fiona)$ python
Python 3.3.1 (default, Apr 29 2013, 22:56:52)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

First, I will take a feature from the testing shapefile distributed with Fiona and copy the file's schema, driver, crs, and encoding (as a dict named 'meta').

>>> import fiona
>>> with fiona.open('docs/data/test_uk.shp') as c:
...     f = next(c)
...     meta = c.meta
...

Now, forget about that file. We're going to open a new, not previously existing collection from a multi-layer source with the same metadata as the test dataset and write that one feature to it.

>>> import tempfile
>>> path = tempfile.mkdtemp()
>>> with fiona.open(path, 'w', layer='foo', **meta) as c:
...     c.write(f)
...

Fiona collections are context managers, remember? They write to disk when the block they're defined on ends.
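
If the pattern is unfamiliar, here is a rough sketch of the general idea using only the standard library. This is not how Fiona is implemented: BufferedSink is a hypothetical class made up for this example, and the JSON file it writes stands in for whatever a real collection writes.

```python
import json
import os
import tempfile


class BufferedSink(object):
    """Hypothetical stand-in for a writable collection."""

    def __init__(self, path):
        self.path = path
        self.records = []

    def write(self, record):
        # Records are only held in memory until the block ends.
        self.records.append(record)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Write to disk when the with block ends without an error.
        if exc_type is None:
            with open(self.path, 'w') as f:
                json.dump(self.records, f)
        return False


path = os.path.join(tempfile.mkdtemp(), 'records.json')
with BufferedSink(path) as sink:
    sink.write({'id': '0'})
    exists_inside_block = os.path.exists(path)

with open(path) as f:
    written = json.load(f)
```

Nothing exists on disk while the block is still running; the file appears once the block exits.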

Here's a peek into the directory we created:

>>> import os
>>> os.listdir(path)
['foo.cpg', 'foo.dbf', 'foo.prj', 'foo.shp', 'foo.shx']

Now we'll open a collection from that source and read it to verify that the write went as expected. Fiona's listlayers() function tells us what layers are available.

>>> fiona.listlayers(path)
['foo']
>>> with fiona.open(path, layer='foo') as c:
...     print(len(c))
...     g = next(c)
...     print(f == g)
...
1
True

Its sole feature is identical to the one we passed to write().

The pattern for working with an Esri FileGDB is the same; just replace the directory path with the path to your .gdb file. If you try this (please do!) and have any problems, please let me know at https://github.com/Toblerity/Fiona/issues.

Downloads, etc. at https://pypi.python.org/pypi/Fiona/0.16. This is pretty much it for features before 1.0. I'm going to see what kind of bugs people find in 0.16, adjust the new features as I get feedback, and begin writing the final 1.0 manual.

Comments

Re: Fiona 0.16

Author: Waldemar

Beautiful,

Just tried reading FileGDB (on Windows) and it worked as advertised :-)

Re: Fiona 0.16

Author: Martin

worked also on Mac OS X (10.6) with a FileGDB created with GDAL/OGR or QGIS

Another good day for the little format that could

Last Thursday, Ben Balter wrote:

Not long ago, we began rendering 3D models on GitHub. Today we're excited to announce the latest addition to the visualization family - geographic data. Any .geojson file in a GitHub repository will now be automatically rendered as an interactive, browsable map, annotated with your geodata.

Wow! GitHub, MapBox, OpenStreetMap, Leaflet, and GeoJSON are five of my favorite things on the web. I can't wait to see what happens next.

Update (2013-06-25): GeoJSON previews for CKAN datasets

Update (2013-06-26): Marker clustering and TopoJSON support are what's new at GitHub

Comments

examples?

Author: Jonathan Hartley

Anyone got links to good examples of this in action?

Getting back into OpenStreetMap

The new OSM editor is a pleasure to use and I've been inspired to fix up my neighborhood a bit. It was cool to see these historic homes (designated Fort Collins landmarks, in fact) pop up in MapBox just minutes after I submitted them.

http://a.tiles.mapbox.com/v3/sgillies.map-jtvmlgox/-105.09175479412079,40.56621496147184,18/640x480.png

Google Maps, on the other hand, has no historic homes and a rather outdated and (sadly) never to be realized development proposal in the neighborhood. Fixing Google's map for free is not high on my list of priorities.

Next, I'd like to finally get some Pleiades-related data into the OpenHistoricalMap sandbox.

A good day for the little format that could

Today was a pretty good day for the GeoJSON format.

Josh Livni announced on Twitter that the Google Map Engine API had been published. The little format that could has a good role in the API.

Ed Summers blogged about wikigeo.js, a library that gives you a GeoJSON interface to Wikipedia API results. Ed is absolutely right about how usable, and how right for the web, mapping software has become since younger web developers and designers have started to displace older earth science programmers like myself.

A good example of which is Tom MacWright's edit geojson app: draw a shape and copy the GeoJSON representation, paste some GeoJSON and render the shape.

The business of geospatial standardization may have hit a rough patch recently, but things aren't all bad. Developers are still finding ways to agree, share, and do good work.

Dumpgj: JSON-LD and CRS

The dumpgj script installed by Fiona 0.14 is among other things a vehicle for changing the way people think about GeoJSON.

There's a growing sense in the GeoJSON community that the crs object is over-engineered. It's rarely used, and those who do use it report that they would be just as happy with OGC URN strings instead of the existing objects. The new version of dumpgj adds an experimental:

"_crs": "urn:ogc:def:crs:OGC:1.3:CRS84"

to a feature collection when the input data is WGS84 longitude and latitude, as with Natural Earth shapefiles, and something like:

"_crs": "urn:ogc:def:crs:EPSG::3857"

for systems in the EPSG database (3857 is Spherical Mercator).
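
As a small sketch of how such strings could be produced, assuming nothing beyond the two forms quoted above: the crs_urn function is a hypothetical helper written for this example, not dumpgj's actual code.

```python
def crs_urn(epsg_code=None):
    """Return an OGC URN string for a coordinate reference system.

    Hypothetical helper: with no EPSG code it assumes WGS84 longitude
    and latitude and returns the CRS84 URN.
    """
    if epsg_code is None:
        return "urn:ogc:def:crs:OGC:1.3:CRS84"
    return "urn:ogc:def:crs:EPSG::%d" % int(epsg_code)


collection = {'type': 'FeatureCollection', 'features': []}
collection['_crs'] = crs_urn(3857)
```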

Additionally, dumpgj users can, with --use-ld-context, add more information to their GeoJSON files. The context object added to feature collections states more specifically what is meant by "Feature", "properties", "geometries", etc. Fiona's GeoJSON context looks like this:

{ "@context": {
  "Feature": "http://geovocab.org/spatial#Feature",
  "FeatureCollection": "_:n1",
  "GeometryCollection": "http://geovocab.org/geometry#GeometryCollection",
  "LineString": "http://geovocab.org/geometry#LineString",
  "MultiLineString": "http://geovocab.org/geometry#MultiLineString",
  "MultiPoint": "http://geovocab.org/geometry#MultiPoint",
  "MultiPolygon": "http://geovocab.org/geometry#MultiPolygon",
  "Point": "http://geovocab.org/geometry#Point",
  "Polygon": "http://geovocab.org/geometry#Polygon",
  "_crs": {
    "@id": "_:n2",
    "@type": "@id"
  },
  "bbox": "http://geovocab.org/geometry#bbox",
  "coordinates": "_:n5",
  "features": "_:n3",
  "geometry": "http://geovocab.org/geometry#geometry",
  "id": "@id",
  "properties": "_:n4",
  "type": "@type"
} }

I'm currently mapping GeoJSON's "id" and "type" to JSON-LD keywords, some other items to the GeoVocab spatial and geometry vocabularies, and giving the rest blank node identifiers until the GeoJSON community defines a namespace and permanent identifiers for them.

GeoJSON feature properties are the things that I think stand to benefit the most from a JSON-LD context. If a feature has a "title" property, is that an honorific like "Mr." or "Duchess of York"? Is it "NBA Championship"? Or something else? The dumpgj script has a --add-ld-context-item option that can be used to nail a property down a little more tightly. For example,

$ dumpgj --add-ld-context-item "title=http://purl.org/dc/elements/1.1/title" ...

adds:

"title": "http://purl.org/dc/elements/1.1/title",

to the feature collection's JSON-LD context. This is what "title" usually means in my applications.
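
The effect of that option can be sketched in a few lines. The add_context_item function below is hypothetical and is not taken from dumpgj; only the "term=URI" syntax comes from the command line above.

```python
def add_context_item(context, item):
    """Add a 'term=URI' pair to a JSON-LD context mapping.

    Hypothetical helper, written for this example only.
    """
    # Split on the first '=' so that a URI containing '=' survives intact.
    term, sep, uri = item.partition('=')
    if not sep or not term or not uri:
        raise ValueError("expected 'term=URI', got %r" % item)
    context[term] = uri
    return context


collection = {
    '@context': {'id': '@id', 'type': '@type'},
    'type': 'FeatureCollection',
    'features': []}
add_context_item(
    collection['@context'],
    'title=http://purl.org/dc/elements/1.1/title')
```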

Lastly, dumpgj will build the entire GeoJSON object in memory by default, but has a more economical --record-buffered option that only builds feature records in memory and writes them immediately to the output stream.
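
The difference between the two modes can be sketched with the standard library alone. The function names below are hypothetical and this is not dumpgj's source; make_features is a stand-in for features read from a real data source.

```python
import io
import json


def write_collection(features, sink):
    """Build the whole collection in memory, then write it once."""
    collection = {'type': 'FeatureCollection', 'features': list(features)}
    sink.write(json.dumps(collection))


def write_record_buffered(features, sink):
    """Write one feature at a time; only one record is held in memory."""
    sink.write('{"type": "FeatureCollection", "features": [')
    for i, feature in enumerate(features):
        if i > 0:
            sink.write(', ')
        sink.write(json.dumps(feature))
    sink.write(']}')


def make_features(n):
    # Stand-in generator; real features would come from a data source.
    for i in range(n):
        yield {
            'type': 'Feature',
            'id': str(i),
            'properties': {},
            'geometry': {'type': 'Point', 'coordinates': [float(i), 0.0]}}


whole = io.StringIO()
write_collection(make_features(3), whole)

streamed = io.StringIO()
write_record_buffered(make_features(3), streamed)
```

Both sinks end up holding equivalent JSON. The record-buffered version trades one large json.dumps call for many small ones and never holds the full feature list.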

Fiona 0.13

I've made the jump to developing Fiona primarily for Python 3.3, and Fiona 0.13 is the first release to support Python 2.6, 2.7, and 3.3. Thanks to Cython, I'm able to offer Python 2/3 compatibility in a single source distribution.

The fiona.tool script I wrote about earlier is now called dumpgj.

$ dumpgj -h
usage: python -mfiona.tool [-h] [-d] [-n N] [--compact] [--encoding ENC]
                           [--record-buffered] [--ignore-errors]
                           infile [outfile]

Serialize a file's records or description to GeoJSON

positional arguments:
  infile             input file name
  outfile            output file name, defaults to stdout if omitted

optional arguments:
  -h, --help         show this help message and exit
  -d, --description  serialize file's data description (schema) only
  -n N, --indent N   indentation level in N number of chars
  --compact          use compact separators (',', ':')
  --encoding ENC     Specify encoding of the input file
  --record-buffered  buffer writes at record, not collection (default), level
  --ignore-errors    log errors but do not stop serialization

BTW, Comments are off for a while until these Chinese spammers move along.

Fiona makes reading and writing data boring

One of my goals for Fiona is to make it the most boring and predictable package for reading and writing geo data ever. I want it to refute the notion that "spatial is special" and have no gotchas, no surprises. As a benchmark for predictability and dullness consider fiona.tool, a minimal and limited replacement for ogrinfo and ogr2ogr inspired by Python's json.tool. You give it a data file name and write a description to a file or stdout

$ python -mfiona.tool -d docs/data/test_uk.shp

Description of source: <open Collection 'docs/data/test_uk.shp:test_uk', mode
'r' at 0x2038b10>

Coordinate reference system (source.crs):
{'datum': 'WGS84', 'no_defs': True, 'proj': 'longlat'}

Format driver (source.driver):
'ESRI Shapefile'

Data description (source.schema):
{'geometry': 'Polygon',
 'properties': {u'AREA': 'float',
                u'CAT': 'float',
                u'CNTRY_NAME': 'str',
                u'FIPS_CNTRY': 'str',
                u'POP_CNTRY': 'float'}}

or convert the file's data to GeoJSON.

$ python -mfiona.tool --indent 2 docs/data/test_uk.shp /tmp/test_uk.json
$ head /tmp/test_uk.json
{
  "type": "FeatureCollection",
  "features": [
    {
      "geometry": null,
      "id": "0",
      "properties": {
        "POP_CNTRY": 60270708.0,
        "CNTRY_NAME": "United Kingdom",
        "AREA": 244820.0,

Most of the source is devoted to parsing arguments and formatting the descriptions. The core of it is this:

# The Fiona data tool.

if __name__ == '__main__':

    # ...
    args = parser.parse_args()

    with sys.stdout as sink:

        with fiona.open(args.infile, 'r') as source:

            if args.description:
                meta = source.meta.copy()
                meta.update(name=args.infile)
                if args.json:
                    sink.write(json.dumps(meta, indent=args.indent))
                else:
                    sink.write("\nDescription of source: %r" % source)
                    print("\nCoordinate reference system (source.crs):")
                    pprint.pprint(meta['crs'], stream=sink)
                    # ...
            else:
                collection = {'type': 'FeatureCollection'}
                collection['features'] = list(source)
                sink.write(json.dumps(collection, indent=args.indent))

Does that underwhelm you? Are you asking "is that all there is?" Good. It underwhelms me, too. All there really is to Fiona is an open function that returns an iterable object that you can treat as a limited file or io stream and plain old Python dicts. No layers, no features, no shadow classes and reference counting. Just concepts that Python programmers already understand. Files, iterators, and dicts. Nothing special.

By the way, how about that use of sys.stdout as a context manager? I expected it wouldn't be a problem, and it isn't. The Python language keeps getting better and better. I'm developing Fiona for Python 2.7 right now and experimenting with a full switch to 3.3, with backported releases for 2.7 and 2.6.

Comments

stdout as a context manager

Author: Jonathan Hartley

Hey,

Lovely post as always, but I'm puzzled by the use of stdout as a context manager, and my google-fu apparently isn't up to it this early in the morning (lots of results for using context managers to redirect stdout)

I can see from here (http://docs.python.org/2/library/stdtypes.html#file.close) that File objects used as context managers get closed when exiting the block, so I expect that you're closing (and hence presumably flushing) stdout when you leave the outermost 'with'. But this is the end of the program, so this would happen anyway.

So why are you doing it? Thanks.

Re: Fiona makes reading and writing data boring

Author: Sean

What I didn't point out is that in the actual code, the sink (output stream) could be either stdout or a Fiona Collection object depending on the parsed script args. That they both work as context managers lets me write less code. The reasons I'm using ``with`` instead of letting the file close at garbage collection time are that it future-proofs my script a bit against code I may add after the features are read and written and that I'm planning to hang some OGR state on the Fiona Collection soon and want to use __enter__ and __exit__ to manage it reliably.

April Snowstorm

The snow finally stopped this afternoon. I didn't measure it, as I feel I should have, but I read that we received at least 50 cm of snow. That's unconsolidated snow depth. It's fairly dense and has now settled to a foot of firm snow. My youngest daughter lost a boot in the snow this morning and we were unable to find it. It'll show up in a week or so, I suppose.

http://farm9.staticflickr.com/8114/8659853276_73897c5e55_z_d.jpg

We desperately needed this water, but I'm still a bit bummed about our salad greens and peas and almost-ready-to-bloom tulips crushed flat under the snow. We're fortunate that trees hadn't begun to leaf out yet and didn't collect enough snow or ice to have their branches broken.

A neat surprise at the end of the afternoon was finding a large mixed flock of Robin and Mountain Bluebird a few houses down the street. I've only seen a couple bluebirds in Fort Collins before, never an entire flock. What a time they picked to head to the hills.

http://farm9.staticflickr.com/8120/8658753315_2ceec17de3_z_d.jpg

The light was awkward and I'm a rank amateur photographer, so I captured only a fraction of the otherworldly blueness of these beautiful birds.