2013 (old posts, page 2)

Fiona 1.0

At last, 1.0: https://pypi.python.org/pypi/Fiona/1.0.

Fiona is OGR's neat, nimble, no tears API. It is a Python library, not a GIS library, and is designed for Python programmers who appreciate:

  • Simplicity and less code.
  • Familiar Python types and protocols like files, dicts, and iterators instead of classes specific to GIS.
  • GeoJSON style feature records.
  • Reading and writing single and multi-layer files.
  • Reading zipped data, too.
  • A handy command line tool that upgrades "ogr2ogr -f GeoJSON".
  • Comprehensive tests.
  • 15 pages of narrative documentation.

I've had lots of help getting to this stage. Thanks, everybody!

The name? At first it was a Shrek reference, but now it's just a probably too cute but hopefully not too annoying recursive bacronym.

Share and enjoy this Fiona Deluxe Professional Home Enterprise Edition 1.0.

Box hugger no more

I've been using zc.buildout for years now to build the Pleiades infrastructure in a replicable way, and it's served the project well. The process of setting up a production environment has been this:

  1. ssh in to production
  2. git clone https://github.com/isawnyu/pleiades3-buildout $HOME
  3. cd $HOME
  4. virtualenv
  5. bin/python bootstrap.py
  6. bin/buildout install zope2
  7. bin/buildout
  8. Copy database to $HOME/var
  9. Symlink $HOME to pleiades-production
  10. supervisor restart all

The buildout script yields a database server, multiple app servers, an nginx load balancer, and a Varnish caching proxy. Subsequent minor site releases are just a matter of pulling buildout configuration files and re-running step 6. Buildout is repeatable and, with care, idempotent. But the other steps are less repeatable and more manual. That I use screen to keep that shell alive shows exactly how much manual intervention is required. Although I've automated half the process, I've been the kind of devops person Subbu Allamaraju calls a "box hugger". Subbu says:

Two steps to cure box hugging – first, internalize the idea that the box you’ve just finished setting up meticulously is going to burst into flames the very next minute, second treat operations the same way as you would treat software development.

I'm well on the second step: all the Pleiades buildout configuration is version controlled, if not fully tested. But the first step, not so much.

I'm using a new project to reform my ways and be much less of a box hugger. To internalize the ephemeral nature of servers, I'm teaching myself to provision and configure Vagrant VMs with Ansible. My goal is to be able to deploy this project's sites to their production server using only Ansible, never logging in at all. Having no database in this project (it's all based on XML, on GitHub) makes this goal easier to hit. And it turns out that setting up Solr isn't going to be too tough, either. Thanks to this ansible-multi-solr project I've learned to write my own very basic Solr and Tomcat playbook. With just two commands

$ vagrant up
$ ansible-playbook setup.yml -k -i setup_hosts

and a short wait, I get a running Solr instance at http://192.168.35.10:8983/solr/.

I'm late to the party, I know, but Vagrant is killer. I'm also using it to test packaging and installation of Shapely and Fiona. I believe it was Whit Morris who directed me to Vagrant. Thanks, Whit!

I've used neither Chef nor Puppet and chose Ansible because it's Python, uses familiar stuff like SSH and JSON, and because the playbook concept is a reasonable leap for me from Buildout. I'm enjoying it very much and hope to be able to contribute something to the project in time.

Thanks, Subbu, for providing the impetus I needed to make the leap from box hugging! I really would rather be developing than deploying and administering, and feel like I'm beginning to get a grip on the tools that will make that possible.

Comments

Re: Box hugger no more

Author: Michael Weisman

Throw jenkins and a few lines of bash in there and you can have production or dev servers auto deploy on git commits to specific branches using the same ansible scripts you use with vagrant for local dev!

Re: Box hugger no more

Author: Sean

Yes, sky's the limit! I didn't realize you were Ansible users at OpenGeo.

Re: Box hugger no more

Author: Kenshi

There is Salt Stack, which is also Python and gaining a lot of traction. http://saltstack.com/community.html

GeoJSON and the geo interface for Python

It's pretty clear that GeoJSON is good for the web. I think it's also good for Python programmers in two different ways.

Fiona employs Python mappings modeled on GeoJSON to represent GIS features and their geometries. By reusing these existing concepts – the GeoJSON data model and the Python mapping type – Fiona keeps its mental footprint small. There's less to memorize, less need to check documentation, fewer gotchas. There's also less code and less room for bugs. For example, given a Fiona feature, there is no need to look up the classes and methods for accessing the feature's non-geometry fields: it's just feature['properties'].items().

>>> import fiona
>>> with fiona.open(
...         "/Users/seang/data/ne_50m_admin_0_countries/"
...         "ne_50m_admin_0_countries.shp") as c:
...     first = next(c)
...
>>> items = sorted(first['properties'].items())
>>> import pprint
>>> pprint.pprint(items)
[('abbrev', 'Aruba'),
 ('abbrev_len', 5.0),
 ('adm0_a3', 'ABW'),
 ('adm0_a3_is', 'ABW'),
 ('adm0_a3_un', -99.0),
 ('adm0_a3_us', 'ABW'),
 ('adm0_a3_wb', -99.0),
 ('adm0_dif', 1.0),
 ('admin', 'Aruba'),
 ('brk_a3', 'ABW'),
 ('brk_diff', 0.0),
 ('brk_group', None),
 ('brk_name', 'Aruba'),
 ('continent', 'North America'),
 ('economy', '6. Developing region'),
 ('featurecla', 'Admin-0 country'),
 ('fips_10', None),
 ('formal_en', 'Aruba'),
 ('formal_fr', None),
 ('gdp_md_est', 2258.0),
 ('gdp_year', -99.0),
 ('geou_dif', 0.0),
 ('geounit', 'Aruba'),
 ('gu_a3', 'ABW'),
 ('homepart', -99.0),
 ('income_grp', '2. High income: nonOECD'),
 ('iso_a2', 'AW'),
 ('iso_a3', 'ABW'),
 ('iso_n3', '533'),
 ('labelrank', 5.0),
 ('lastcensus', 2010.0),
 ('level', 2.0),
 ('long_len', 5.0),
 ('mapcolor13', 9.0),
 ('mapcolor7', 4.0),
 ('mapcolor8', 2.0),
 ('mapcolor9', 2.0),
 ('name', 'Aruba'),
 ('name_alt', None),
 ('name_len', 5.0),
 ('name_long', 'Aruba'),
 ('name_sort', 'Aruba'),
 ('note_adm0', 'Neth.'),
 ('note_brk', None),
 ('pop_est', 103065.0),
 ('pop_year', -99.0),
 ('postal', 'AW'),
 ('region_un', 'Americas'),
 ('region_wb', 'Latin America & Caribbean'),
 ('scalerank', 3),
 ('sov_a3', 'NL1'),
 ('sovereignt', 'Netherlands'),
 ('su_a3', 'ABW'),
 ('su_dif', 0.0),
 ('subregion', 'Caribbean'),
 ('subunit', 'Aruba'),
 ('tiny', 4.0),
 ('type', 'Country'),
 ('un_a3', '533'),
 ('wb_a2', 'AW'),
 ('wb_a3', 'ABW'),
 ('wikipedia', -99.0),
 ('woe_id', -99.0)]

If we parse the GeoJSON version of that shapefile using Python's own standard json library, we get the same data.

>>> import requests
>>> r = requests.get(
...     "https://raw.github.com/sgillies/sgillies.github.com/master/"
...     "ne-sample/ne_50m_admin_0_countries.json")
>>>
>>> r.ok
True
>>> r.json()['type'] == 'FeatureCollection'
True
>>> first = r.json()['features'][0]
>>> sorted(first['properties'].items()) == items
True

The same data, accessed in the very same way. A familiar uniform interface for both Shapefile and JSON features.
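
Because both sources yield plain mappings, one function can serve either of them. Here is a minimal sketch, assuming only that each feature is a GeoJSON-like dict. The count_by helper is hypothetical (it is not part of Fiona) and the two sample features are made up for illustration.

```python
from collections import Counter


def count_by(features, key):
    """Tally the values of one property across GeoJSON-like features.

    `features` may be any iterable of feature mappings: an open Fiona
    collection, or the 'features' list of a parsed GeoJSON document.
    """
    return Counter(f['properties'].get(key) for f in features)


# Made-up sample features standing in for real data.
sample = [
    {'type': 'Feature', 'geometry': None,
     'properties': {'name': 'Aruba', 'continent': 'North America'}},
    {'type': 'Feature', 'geometry': None,
     'properties': {'name': 'Andorra', 'continent': 'Europe'}},
]
counts = count_by(sample, 'continent')
```

The function never asks where the features came from; it relies only on the 'properties' item that every feature mapping carries.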

For a while I've been suggesting that GeoJSON, or rather Python mappings modeled on GeoJSON, might be as useful a protocol for passing geospatial data between Python packages as it is for passing data around the web. A few Python packages have implemented the Python Geo Interface: ArcPy, descartes, Fiona, PySAL, Shapely. Descartes can, for example, help you draw ArcPy, Fiona, PySAL, or Shapely objects; or any other object that satisfies the protocol. It's dirt-simple Python GIS interoperability based on mappings with agreed-upon items.
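
The protocol itself is small enough to sketch. In the illustration below, Marker and geometry_mapping are hypothetical names invented for this example, not taken from any of the packages above. The only convention assumed is that a provider exposes a GeoJSON-like mapping through a __geo_interface__ attribute.

```python
class Marker(object):
    """A toy point class that provides the geo interface."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def __geo_interface__(self):
        return {'type': 'Point', 'coordinates': (self.x, self.y)}


def geometry_mapping(obj):
    """Return a GeoJSON-like mapping for obj.

    Objects that provide __geo_interface__ are asked for their mapping;
    anything else is assumed to already be such a mapping.
    """
    return getattr(obj, '__geo_interface__', obj)


from_object = geometry_mapping(Marker(-105.09, 40.57))
from_dict = geometry_mapping(
    {'type': 'Point', 'coordinates': (-105.09, 40.57)})
```

A consumer written this way accepts objects from any package that provides the attribute, as well as plain dicts, without importing any of those packages.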

Yesterday, I saw Nathan Woodrow's post about monkey patching the protocol into QGIS classes. I'm happy to see this because adoption of the protocol by QGIS will be a big deal and also because it demonstrates a manner in which Esri Python users might fix ArcPy's slightly broken implementation of the protocol. In the same discussion, Calvin Metcalf confirmed for me that ArcPy feature objects also satisfy the protocol, which means that writing GeoJSON from ArcGIS using Python could, in theory, be very simple.

import json
json.dumps(
    dict(
        type='FeatureCollection',
        features=[row.__geo_interface__ for row in cursor]))

Esri's support for GeoJSON is (potentially) better than most people realize.
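
Both ideas, retrofitting the protocol onto an existing class and serializing whatever provides it, fit in a few lines. This is only a sketch: LegacyRow is a made-up stand-in for a third-party row or feature class (it does not reproduce the ArcPy or QGIS APIs), and row_mapping is a hypothetical adapter.

```python
import json


class LegacyRow(object):
    """Made-up stand-in for a third-party class lacking the protocol."""

    def __init__(self, x, y, **attrs):
        self.x = x
        self.y = y
        self.attrs = attrs


def row_mapping(row):
    """Adapt a LegacyRow to a GeoJSON-like feature mapping."""
    return {
        'type': 'Feature',
        'geometry': {'type': 'Point', 'coordinates': (row.x, row.y)},
        'properties': dict(row.attrs),
    }


# Monkey patch: every LegacyRow now provides __geo_interface__.
LegacyRow.__geo_interface__ = property(row_mapping)

# A list of rows stands in for a real cursor here.
cursor = [LegacyRow(1.0, 2.0, name='a'), LegacyRow(3.0, 4.0, name='b')]
text = json.dumps(
    dict(
        type='FeatureCollection',
        features=[row.__geo_interface__ for row in cursor]))
```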

Comments

Re: GeoJSON and the geo interface for Python

Author: Nathan

I found an easier, and better, way to attach the __geo_interface__ to QgsFeature.

QgsFeature.__geo_interface__ = property(mapping_feature)

Much nicer than the MethodType stuff and fits in nicer with what you have with Fiona.

Re: GeoJSON and the geo interface for Python

Author: Martin

The geo_interface was also implemented in Pyshp by Christian Lederman (https://github.com/cleder/pyshp)

import shapefile

def records(filename):
    # Generator yielding GeoJSON-like feature records from a shapefile
    reader = shapefile.Reader(filename)
    fields = reader.fields[1:]  # skip the leading DeletionFlag field
    field_names = [field[0] for field in fields]
    for sr in reader.shapeRecords():
        geom = sr.shape.__geo_interface__
        atr = dict(zip(field_names, sr.record))
        yield dict(geometry=geom, properties=atr)

>>> a = records('point.shp')
>>> next(a)
{'geometry': {'type': 'Point', 'coordinates': (161821.09375, 79076.0703125)}, 'properties': {'DIP_DIR': 120, 'STRATI_TYP': 1, 'DIP': 30}}
>>> next(a)['geometry']['coordinates']
(161485.09375, 79272.34375)
>>> next(a)['properties']['DIP']
55

and you can use the same method to implement it in QGIS 2

def records(layer):
    # Generator yielding feature records from a QGIS vector layer
    fields = layer.pendingFields()
    field_names = [field.name() for field in fields]
    for elem in layer.getFeatures():
        geom = elem.geometry()
        atr = dict(zip(field_names, elem.attributes()))
        yield dict(geometry=geom.exportToGeoJSON(), properties=atr)

>>> layer = qgis.utils.iface.activeLayer()
>>> c = records(layer)
>>> next(c)
{'geometry': u'{ "type": "LineString", "coordinates": [ [164917.66073716, 115230.16694565], [170806.15565476, 116401.17445767] ] }', 'properties': {u'id': 1,u'test':u'ligne'}}

My remaining problem is implementing writing, as in Fiona.

Fiona 0.16

Fiona now reads and writes multi-layer data. The user advocating for this feature has Esri file geodatabases in mind. I'm developing and testing with directories of shapefiles, but since I'm using OGR's datasource abstraction it (modulo the usual driver quirks) "should just work" with FileGDBs.

Here's how it works. By the way, I'm using Python 3.3.1 and a virtualenv in these examples. Fiona works equally well with Python 3.3 and 2.6-2.7. I'm not testing Python 3.0-3.2 right now, so your mileage there may vary.

(fiona)$ python
Python 3.3.1 (default, Apr 29 2013, 22:56:52)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

First, I will take a feature from the testing shapefile distributed with Fiona and copy the file's schema, driver, crs, and encoding (as a dict named 'meta').

>>> import fiona
>>> with fiona.open('docs/data/test_uk.shp') as c:
...     f = next(c)
...     meta = c.meta
...

Now, forget about that file. We're going to open a new, not previously existing collection from a multi-layer source with the same metadata as the test dataset and write that one feature to it.

>>> import tempfile
>>> path = tempfile.mkdtemp()
>>> with fiona.open(path, 'w', layer='foo', **meta) as c:
...     c.write(f)
...

Fiona collections are context managers, remember? They write to disk when the with block that defines them ends.

Here's a peek into the directory we created:

>>> import os
>>> os.listdir(path)
['foo.cpg', 'foo.dbf', 'foo.prj', 'foo.shp', 'foo.shx']

Now we'll open a collection from that source and read it to verify that the write went as expected. Fiona's listlayers() function tells us what layers are available.

>>> fiona.listlayers(path)
['foo']
>>> with fiona.open(path, layer='foo') as c:
...     print(len(c))
...     g = next(c)
...     print(f == g)
...
1
True

Its sole feature is identical to the one we passed to write().

The pattern for working with an Esri FileGDB is the same; just replace the directory path with the path to your .gdb file. If you try this (please do!) and have any problems, please let me know at https://github.com/Toblerity/Fiona/issues.
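
As a rough sketch of where this leads, listlayers() and the layer keyword can be combined to copy every layer from one multi-layer source to another. The copy_layers helper below is hypothetical, not part of Fiona. It assumes the destination's driver can write each layer's schema, and it relies only on the calls shown above plus the collection's writerecords() method.

```python
def copy_layers(src_path, dst_path):
    """Copy every layer of one multi-layer source into another.

    Returns the list of layer names copied. Fiona is imported lazily so
    that this module can be loaded where Fiona is not installed.
    """
    import fiona

    copied = []
    for name in fiona.listlayers(src_path):
        with fiona.open(src_path, layer=name) as src:
            # Reuse the source layer's metadata, as with 'meta' above.
            with fiona.open(dst_path, 'w', layer=name, **src.meta) as dst:
                dst.writerecords(src)
        copied.append(name)
    return copied
```

Calling copy_layers(path, tempfile.mkdtemp()) with the directory created earlier would be one way to try it. Driver quirks may still apply.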

Downloads, etc. at https://pypi.python.org/pypi/Fiona/0.16. This is pretty much it for features before 1.0. I'm going to see what kind of bugs people find in 0.16, adjust the new features as I get feedback, and begin writing the final 1.0 manual.

Comments

Re: Fiona 0.16

Author: Waldemar

Beautiful,

Just tried reading FileGDB (on Windows) and it worked as advertised :-)

Re: Fiona 0.16

Author: Martin

worked also on Mac OS X (10.6) with a FileGDB created with GDAL/OGR or QGIS

Another good day for the little format that could

Last Thursday, Ben Balter wrote:

Not long ago, we began rendering 3D models on GitHub. Today we're excited to announce the latest addition to the visualization family - geographic data. Any .geojson file in a GitHub repository will now be automatically rendered as an interactive, browsable map, annotated with your geodata.

Wow! GitHub, MapBox, OpenStreetMap, Leaflet, and GeoJSON are five of my favorite things on the web. I can't wait to see what happens next.

Update (2013-06-25): GeoJSON previews for CKAN datasets

Update (2013-06-26): Marker clustering and TopoJSON support are what's new at GitHub

Comments

examples?

Author: Jonathan Hartley

Anyone got links to good examples of this in action?

Getting back into OpenStreetMap

The new OSM editor is a pleasure to use and I've been inspired to fix up my neighborhood a bit. It was cool to see these historic homes (designated Fort Collins landmarks, in fact) pop up in MapBox just minutes after I submitted them.

http://a.tiles.mapbox.com/v3/sgillies.map-jtvmlgox/-105.09175479412079,40.56621496147184,18/640x480.png

Google Maps, on the other hand, has no historic homes and a rather outdated and (sadly) never to be realized development proposal in the neighborhood. Fixing Google's map for free is not high on my list of priorities.

Next, I'd like to finally get some Pleiades-related data into the OpenHistoricalMap sandbox.

A good day for the little format that could

Today was a pretty good day for the GeoJSON format.

Josh Livni announced on Twitter that the Google Map Engine API had been published. The little format that could has a good role in the API.

Ed Summers blogged about wikigeo.js, a library that gives you a GeoJSON interface to Wikipedia API results. Ed is absolutely right about how usable, and how right for the web, mapping software has become since younger web developers and designers started to displace older earth science programmers like myself.

A good example of which is Tom MacWright's edit geojson app: draw a shape and copy the GeoJSON representation, paste some GeoJSON and render the shape.

The business of geospatial standardization may have hit a rough patch recently, but things aren't all bad. Developers are still finding ways to agree, share, and do good work.

Dumpgj: JSON-LD and CRS

The dumpgj script installed by Fiona 0.14 is, among other things, a vehicle for changing the way people think about GeoJSON.

There's a growing sense in the GeoJSON community that the crs object is over-engineered. It's rarely used, and those who do use it report that they would be just as happy with OGC URN strings instead of the existing objects. The new version of dumpgj adds an experimental:

"_crs": "urn:ogc:def:crs:OGC:1.3:CRS84"

to a feature collection when the input data is WGS84 longitude and latitude, as with Natural Earth shapefiles, and something like:

"_crs": "urn:ogc:def:crs:EPSG::3857"

for systems in the EPSG database (3857 is Spherical Mercator).
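
The choice between the two URN forms can be sketched as follows. This is not dumpgj's actual code: crs_urn is a hypothetical helper, and treating a missing EPSG code as WGS84 longitude and latitude is an assumption made for this example.

```python
CRS84_URN = "urn:ogc:def:crs:OGC:1.3:CRS84"


def crs_urn(epsg_code=None):
    """Return an OGC URN string for a coordinate reference system.

    With no EPSG code, the data is assumed to be WGS84 longitude and
    latitude, for which the CRS84 URN is returned.
    """
    if epsg_code is None:
        return CRS84_URN
    # The empty field between the colons is the (unspecified) version.
    return "urn:ogc:def:crs:EPSG::{0}".format(int(epsg_code))


collection = {"type": "FeatureCollection", "features": []}
collection["_crs"] = crs_urn(3857)
```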

Additionally, dumpgj users can, with --use-ld-context, add more information to their GeoJSON files. The context object added to feature collections states more specifically what is meant by "Feature", "properties", "geometries", etc. Fiona's GeoJSON context looks like this:

{ "@context": {
  "Feature": "http://geovocab.org/spatial#Feature",
  "FeatureCollection": "_:n1",
  "GeometryCollection": "http://geovocab.org/geometry#GeometryCollection",
  "LineString": "http://geovocab.org/geometry#LineString",
  "MultiLineString": "http://geovocab.org/geometry#MultiLineString",
  "MultiPoint": "http://geovocab.org/geometry#MultiPoint",
  "MultiPolygon": "http://geovocab.org/geometry#MultiPolygon",
  "Point": "http://geovocab.org/geometry#Point",
  "Polygon": "http://geovocab.org/geometry#Polygon",
  "_crs": {
    "@id": "_:n2",
    "@type": "@id"
  },
  "bbox": "http://geovocab.org/geometry#bbox",
  "coordinates": "_:n5",
  "features": "_:n3",
  "geometry": "http://geovocab.org/geometry#geometry",
  "id": "@id",
  "properties": "_:n4",
  "type": "@type"
} }

I'm currently mapping GeoJSON's "id" and "type" to JSON-LD keywords, some other items to the GeoVocab spatial and geometry vocabularies, and giving the rest blank node identifiers until the GeoJSON community defines a namespace and permanent identifiers for them.

GeoJSON feature properties are the things that I think stand to benefit the most from a JSON-LD context. If a feature has a "title" property, is that an honorific like "Mr." or "Duchess of York"? Is it "NBA Championship"? Or something else? The dumpgj script has a --add-ld-context-item option that can be used to nail a property down a little more tightly. For example,

$ dumpgj --add-ld-context-item "title=http://purl.org/dc/elements/1.1/title" ...

adds:

"title": "http://purl.org/dc/elements/1.1/title",

to the feature collection's JSON-LD context. This is what "title" usually means in my applications.
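
In terms of plain Python mappings, the effect of that option can be sketched like this. The add_context_item function is a hypothetical stand-in, not dumpgj's implementation. It assumes the item arrives as a "term=IRI" string, as on the command line above.

```python
def add_context_item(collection, item):
    """Add a 'term=IRI' item to a collection's JSON-LD context in place."""
    # Split on the first '=' only, since an IRI may itself contain '='.
    term, iri = item.split("=", 1)
    collection.setdefault("@context", {})[term] = iri
    return collection


collection = {
    "@context": {"id": "@id", "type": "@type"},
    "type": "FeatureCollection",
    "features": [],
}
add_context_item(collection, "title=http://purl.org/dc/elements/1.1/title")
```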

Lastly, dumpgj will build the entire GeoJSON object in memory by default, but has a more economical --record-buffered option that only builds feature records in memory and writes them immediately to the output stream.
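
The difference between the two modes is easy to illustrate. Below is a minimal sketch of record-buffered output, not dumpgj's own code: write_collection_buffered and point_features are hypothetical names, and the features are made up. Only one feature record is serialized at a time, so memory use does not grow with the size of the collection.

```python
import io
import json


def write_collection_buffered(features, stream):
    """Write a FeatureCollection one feature record at a time."""
    stream.write('{"type": "FeatureCollection", "features": [')
    for i, feature in enumerate(features):
        if i > 0:
            stream.write(", ")
        # Only this one record is serialized and held in memory.
        stream.write(json.dumps(feature))
    stream.write("]}")


def point_features(n):
    """Generate n made-up point features lazily."""
    for i in range(n):
        yield {"type": "Feature", "id": str(i),
               "geometry": {"type": "Point", "coordinates": [i, i]},
               "properties": {}}


out = io.StringIO()
write_collection_buffered(point_features(3), out)
```

The trade-off is that an error partway through leaves a truncated, invalid document in the output stream, whereas building the whole object first fails before anything is written.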

Fiona 0.13

I've made the jump to developing Fiona primarily for Python 3.3, and Fiona 0.13 is the first release that runs on Python 2.6, 2.7, and 3.3. Thanks to Cython, I'm able to offer Python 2/3 compatibility in a single source distribution.

The fiona.tool script I wrote about earlier is now called dumpgj.

$ dumpgj -h
usage: python -mfiona.tool [-h] [-d] [-n N] [--compact] [--encoding ENC]
                           [--record-buffered] [--ignore-errors]
                           infile [outfile]

Serialize a file's records or description to GeoJSON

positional arguments:
  infile             input file name
  outfile            output file name, defaults to stdout if omitted

optional arguments:
  -h, --help         show this help message and exit
  -d, --description  serialize file's data description (schema) only
  -n N, --indent N   indentation level in N number of chars
  --compact          use compact separators (',', ':')
  --encoding ENC     Specify encoding of the input file
  --record-buffered  buffer writes at record, not collection (default), level
  --ignore-errors    log errors but do not stop serialization

BTW, Comments are off for a while until these Chinese spammers move along.