Crazy Legs Trail Run

https://farm8.staticflickr.com/7694/17899867135_d1a98948a3_c_d.jpg

Sunday morning Ruth and I got up again at dawn to race, this time at Larimer County's Devil's Backbone Open Space west of Loveland. This 10+ kilometer trail run, organized for the last 8 years by Paul Stoyko, reminded me very much of the ultimate (frisbee) tournaments I played in the olden days: low key, low tech, high enthusiasm. It was an out and back route (map below), taking the left hand side of the Wild, Hunter, and Laughing Horse loops along the way. The final loop (at the top of the hills in the photo above) was pretty tough: 500 feet above the start and lots of ups and downs over fractured slickrock ledges.

I finished 24th out of 96 with a time of 1:05:10. Ruth finished a few minutes after me in 31st place. Here we are holding the popsicle sticks we grabbed at the finish line. Old school!

https://farm8.staticflickr.com/7745/17900375501_e2ea41e7c5_c_d.jpg

I've driven by Devil's Backbone many times but had never been to the trailhead or up the trail before. It's beautiful and wild(ish) and the trail network extends all the way to Horsetooth Mountain Park. Foothill wildflowers are starting to kick off right now and there were blue Penstemon (P. virens) and Britton's Skullcap all along the trail.

https://farm7.staticflickr.com/6053/5891945647_07bce95393_z_d.jpg

Skullcap by Carolannie

Thanks for putting this race together, Paul. We'll be back.

Running

Sunday, May 3rd, I completed my first ever half marathon. The Colorado Marathon starts up the Poudre (pronounced "Poo-der") Canyon and follows the Cache La Poudre River into the Old Town of Fort Collins. State Highway 14 was closed upstream of Ted's Place (intersection with US 287) during the run and we rode chartered buses from the City bus terminal to the starting line before dawn.

My time: 1:52:41. 13 minutes faster than my final 13 mile training run! I think I may have become a runner. I'm hardly sore at all today and am looking forward to splashing down the trail again tomorrow.

Tracking my training with a mobile app has been surprisingly fun. For no reason other than that it's a Mapbox customer, I've been using Runkeeper. I like it. The charts make sense, the data export works, and it is perfectly adequate for tracking mileage. I can't say how useful it is for real sports physiology. Since February 15, when I started training for the race, I've run 215 miles.

Next race: a 10k on the Devil's Backbone Trail outside Loveland, May 17. I'm looking forward to more rocks and dirt and less pavement.

Fiona, Rasterio, Shapely binary wheels for OS X

Numpy and SciPy binaries for OS X have been up on PyPI for a few months and I've recently figured out how to do the same for Fiona, Rasterio, and Shapely. As the SciPy developers do, I've used delocate-wheel to (see its README):

  • find dynamic libraries imported from python extensions

  • copy needed dynamic libraries to directory within package

  • update OSX install_names and rpath to cause code to load from copies of libraries

The new Fiona and Rasterio binaries are beefy (14MB) because they include the non-standard libraries that enable format translation, cartographic projection, and computational geometry operations:

$ delocate-listdeps ~/code/frs-wheel-builds/dist/rasterio-0.17.1-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
@loader_path/.dylibs/libgdal.1.dylib
@loader_path/libgeos-3.4.2.dylib
@loader_path/libgeos_c.1.dylib
@loader_path/libjasper.1.0.0.dylib
@loader_path/libjson-c.2.dylib
@loader_path/libproj.0.dylib

For the small price of a larger download, Mac users now get batteries-included binaries that work immediately. No Xcode required. Just pip install rasterio and start using it.

The new binaries are built on 10.9 using Python 2.7.9 and 3.4.2 downloaded from python.org. These Pythons were compiled using the 10.6 SDK for both i386 and x86_64 architectures and I've similarly set MACOSX_DEPLOYMENT_TARGET=10.6 and -arch i386 -arch x86_64 in my own builds. In practice they are intended for 10.9 and 10.10, but will probably work on 10.7 and 10.8. They should work for just about any OS X Python, whether from the system, Homebrew, MacPorts, or python.org.

If you'd rather continue to compile, e.g., Rasterio's modules using your own GDAL installation, you've got an out in pip's --no-use-wheel option:

$ GDAL_CONFIG=/path/to/gdal-config pip install --no-use-wheel rasterio

To contribute to development of these binaries or report installation bugs, please head over to https://github.com/sgillies/frs-wheel-builds. Most importantly, help me spread the word that installation of Fiona, Rasterio, and Shapely on OS X is easier than ever.

Rasterio 0.15 and a cheat sheet

Here is what's new in Rasterio 0.15. The biggest changes are the ones under the hood to permit opening non-TIFF formats in 'r+' and 'w' modes. The one API change was made to align better with Numpy: any output keyword args are superseded by out and we warn you about future removal of output. In the command line programs we're adding -f and --format as preferred aliases for the older --driver option. We're closing in on the programming and command line interfaces that will be finalized in 1.0.
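
For readers who haven't seen this kind of keyword transition before, here is a minimal sketch of the general pattern: the new name wins, the old name still works, and the old name triggers a warning. This is not Rasterio's actual code. The function read_band and its list-filling behavior are hypothetical stand-ins, and the choice of FutureWarning is my assumption.

```python
import warnings


def read_band(out=None, output=None):
    """Hypothetical reader that fills and returns a caller-supplied buffer.

    'out' is the preferred keyword. 'output' is the deprecated alias and is
    only consulted when 'out' was not given.
    """
    if output is not None:
        warnings.warn(
            "The 'output' keyword is deprecated; use 'out' instead.",
            FutureWarning,
            stacklevel=2,
        )
        if out is None:
            out = output
    if out is None:
        out = [0, 0, 0]
    # Stand-in for copying pixel values into the buffer.
    for i in range(len(out)):
        out[i] = i
    return out
```

Passing both keywords uses out and ignores output, which is one reading of "superseded".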

Inspired by Derek Watkins, I've begun a Fiona/Rasterio/Shapely cheat sheet modeled after his popular GDAL/OGR command line cheat sheet. It's been a great rubric for identifying the key features that should be in the Fiona and Rasterio CLIs. It also has fun examples of using fio and rio with GNU Parallel, jq, and geojsonio-cli.

$ fio cat input.shp --x-json-seq-no-rs \
> | parallel --pipe "jq -c 'select(.id==\"10\")'" \
> | fio collect \
> | geojsonio

0.15 features in the cheat sheet include version inspection,

$ rio --version
0.15

format driver enumeration,

$ rio env --formats
AAIGrid: Arc/Info ASCII Grid
ACE2: ACE2
ADRG: ARC Digitized Raster Graphics
AIG: Arc/Info Binary Grid
ARG: Azavea Raster Grid format
AirSAR: AirSAR Polarimetric Image
...
ZMap: ZMap Plus Grid

and stacking raster bands to produce new multiband datasets.

$ rio stack tests/data/RGB.byte.tif --bidx 1..3 -o stacked.jpg -f JPEG

Unix style spatial ETL with fio cat, collect, and load

In Fiona 1.4.0 I added a fio-cat command to the CLI which works much like UNIX cat. It opens one or more vector datasets, concatenating their features and printing them to stdout as a sequence of GeoJSON features.

$ fio cat docs/data/test_uk.shp | head -n 2
{"geometry": {"coordinates": [...], "type": "Polygon"}, "id": "0", "properties": {"AREA": 244820.0, "CAT": 232.0, "CNTRY_NAME": "United Kingdom", "FIPS_CNTRY": "UK", "POP_CNTRY": 60270708.0}, "type": "Feature"}
{"geometry": {"coordinates": [...], "type": "Polygon"}, "id": "1", "properties": {"AREA": 244820.0, "CAT": 232.0, "CNTRY_NAME": "United Kingdom", "FIPS_CNTRY": "UK", "POP_CNTRY": 60270708.0}, "type": "Feature"}

I've replaced most of the coordinates with ellipses to save space in the code block above, something I'll continue to do in examples below.

I said that fio-cat concatenates features of multiple files and you can see this by using wc -l.

$ fio cat docs/data/test_uk.shp | wc -l
      48
$ fio cat docs/data/test_uk.shp docs/data/test_uk.shp | wc -l
      96

If you look closely at the output, you'll see that every GeoJSON feature is a standalone text and each is preceded by an ASCII RS (0x1E) control character. These allow you to cat pretty-printed GeoJSON (using the --indent option) containing newlines that can still be understood as a sequence of texts by other programs. Software like Python's json module and Node's underscore-cli will trip over unstripped RS, so you can disable the RS control characters and emit LF delimited sequences of GeoJSON (with no option to pretty print, of course) using --x-json-seq-no-rs.
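
To make the two delimiting styles concrete, here is a rough sketch of a reader that accepts either one. It uses only Python's standard library and is not how Fiona itself parses sequences. The function name parse_json_seq is made up for this illustration.

```python
import json

RS = "\x1e"  # ASCII record separator


def parse_json_seq(text):
    """Yield parsed JSON texts from an RS-delimited or LF-delimited string.

    If any RS character is present, the texts may span several lines
    (pretty-printed) and are split on RS. Otherwise each non-empty line
    is treated as one JSON text.
    """
    if RS in text:
        chunks = text.split(RS)
    else:
        chunks = text.splitlines()
    for chunk in chunks:
        chunk = chunk.strip()
        if chunk:
            yield json.loads(chunk)
```

Because RS can't appear unescaped inside a JSON text, splitting on it is safe even when the texts themselves contain newlines.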

To complement fio-cat I've written fio-load and fio-collect. They read features from a sequence (RS or LF delimited) and respectively write them to a formatted vector file (such as a Shapefile) or print them as a GeoJSON feature collection.
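
Conceptually, collecting is a small operation. The sketch below shows the idea using only the standard library. It is not the fio-collect implementation, and the function name collect_features is an assumption of mine. It expects one feature per line and tolerates a leading RS on each line.

```python
import json


def collect_features(lines):
    """Wrap a sequence of one-per-line GeoJSON features in a feature collection."""
    features = []
    for line in lines:
        # Drop surrounding whitespace and any RS (0x1E) control character.
        text = line.strip(" \t\r\n\x1e")
        if text:
            features.append(json.loads(text))
    return {"type": "FeatureCollection", "features": features}
```

Printing json.dumps(collect_features(sys.stdin), indent=4, sort_keys=True) would give output shaped like a pretty-printed feature collection.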

Here's an example of using fio-cat and load together. You should tell fio-load what coordinate reference system to use when writing the output file because that information isn't carried in the GeoJSON features written by fio-cat.

$ fio cat docs/data/test_uk.shp \
| fio load --driver Shapefile --dst_crs EPSG:4326 /tmp/test_uk.shp
$ ls -l /tmp/test_uk.*
-rw-r--r--  1 seang  wheel     10 Oct  5 10:09 /tmp/test_uk.cpg
-rw-r--r--  1 seang  wheel  11377 Oct  5 10:09 /tmp/test_uk.dbf
-rw-r--r--  1 seang  wheel    143 Oct  5 10:09 /tmp/test_uk.prj
-rw-r--r--  1 seang  wheel  65156 Oct  5 10:09 /tmp/test_uk.shp
-rw-r--r--  1 seang  wheel    484 Oct  5 10:09 /tmp/test_uk.shx

And here's one of fio-cat and collect.

$ fio cat docs/data/test_uk.shp | fio collect --indent 4 | head
{
    "features": [
        {
            "geometry": {
                "coordinates": [
                    [
                        [
                            0.899167,
                            51.357216
                        ],
$ fio cat docs/data/test_uk.shp | fio collect --indent 4 | tail
                "CAT": 232.0,
                "CNTRY_NAME": "United Kingdom",
                "FIPS_CNTRY": "UK",
                "POP_CNTRY": 60270708.0
            },
            "type": "Feature"
        }
    ],
    "type": "FeatureCollection"
}

Does it look like I've simply reinvented ogr2ogr? The difference is that with fio-cat and fio-load there's space in between for programs that process features. The programs could be written in any language. They might use Shapely, they might use Turf. The only requirement is that they read and write sequences of GeoJSON features using stdin and stdout. A nice property of programs like these is that you can sometimes parallelize them cheaply using GNU Parallel.
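
Here is a sketch of what one of those in-between programs might look like in Python. It assumes LF-delimited input (as produced with --x-json-seq-no-rs) and uses only the standard library. The name filter_features and the id test are illustrative, not part of Fiona.

```python
import io
import json


def filter_features(instream, outstream, predicate):
    """Copy features from instream to outstream when predicate(feature) is true.

    Both streams carry one GeoJSON feature per line.
    """
    for line in instream:
        line = line.strip()
        if not line:
            continue
        feature = json.loads(line)
        if predicate(feature):
            outstream.write(json.dumps(feature) + "\n")


# Small in-memory demonstration; a real program would pass sys.stdin and sys.stdout.
src = io.StringIO(
    '{"type": "Feature", "id": "9"}\n'
    '{"type": "Feature", "id": "10"}\n'
)
dst = io.StringIO()
filter_features(src, dst, lambda f: f.get("id") == "10")
```

Since each feature is handled independently, a filter like this keeps no state between records, which is what makes splitting the input across processes cheap.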

The fio-buffer program (unreleased) in the example below uses Shapely to calculate a 100 km buffer around features (in Web Mercator, I know!). Parallel doesn't help in this example because the sequence of features from fio-cat is fairly small, but I want to show you how to tell parallel to watch for RS as a record separator.

$ fio cat docs/data/test_uk.shp --dst_crs EPSG:3857 \
> | parallel --pipe --recstart '\x1E' fio buffer 1E+5 \
> | fio collect --src_crs EPSG:3857 \
> | geojsonio

Here's the result. Unix pipelines, still awesome at the age of 41!

The other point of this post is that, with the JSON Text Sequence draft apparently going to publication, sequences of GeoJSON features not collected into a GeoJSON feature collection are very close to being a real thing that developers should be supporting.

Python at FOSS4G 2014

There were plenty of other Python talks at FOSS4G and I plan to watch them when the videos are online (update: talks are appearing now at http://vimeo.com/foss4g). I hadn't been aware of ogrtools, which is unlucky because there's plenty of functional overlap between it and Fiona. The designs seem rather different because Fiona doesn't emulate XML tool chains (GDAL's VRTs are not unlike XSLT) and is more modular. For example, where ogrtools has a file-to-file ogr translate command, Fiona has a fio dump and fio load pair connected by a stream of GeoJSON objects. The ogrtools talk is right near the top of my list of talks to see.

I was very fortunate to go right after Mike Bostock's keynote. It got people thinking about tools and design, and that's exactly the conversation that I'm trying to engage developers in with Fiona and Rasterio, if with less insight and perspective than Mike. I reminded attendees that the best features of our day-to-day programming languages are sometimes disjoint and showed this diagram, in which C is yellow, JavaScript is magenta, and Python is blue. By "GC" I mean garbage collection and by "{};" I mean extraneous syntax.

https://sgillies.github.io/foss4g-2014-fiona-rasterio/img/py-js-c.png

D3 embraces browser standards and all they entail (a world wide knowledge base and continuous performance improvements) and Fiona and Rasterio embrace the good parts of Python. Written as C, like we usually see in GDAL/OGR examples on the web, Python is quite slow. Idiomatic Python, including the good parts like list comprehensions, generators, and iterators, is dramatically faster. While Fiona and Rasterio don't do particular operations faster than the older GDAL and OGR bindings (because it's the same C library underneath), they are designed from the bottom up for a good fit with more efficient idiomatic Python code.
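
As a small illustration of the contrast, here are the same values gathered two ways. The records are invented plain dicts shaped like GeoJSON features. Nothing here calls Fiona or GDAL, and I'm making no timing claims about this toy example.

```python
# Invented sample records shaped like GeoJSON features.
records = [
    {"id": "0", "properties": {"POP_CNTRY": 60270708.0}},
    {"id": "1", "properties": {"POP_CNTRY": 4000000.0}},
    {"id": "2", "properties": {"POP_CNTRY": 250000.0}},
]

# Python written as C: manual index bookkeeping and repeated lookups.
c_style = []
i = 0
while i < len(records):
    c_style.append(records[i]["properties"]["POP_CNTRY"])
    i += 1

# Idiomatic Python: a list comprehension over the iterable.
idiomatic = [rec["properties"]["POP_CNTRY"] for rec in records]

# A generator expression avoids building the intermediate list at all.
total = sum(rec["properties"]["POP_CNTRY"] for rec in records)
```

The second and third forms work unchanged on any iterable of records, not just a list.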

I plugged Click and Cython in my talk, too, and discussed them afterwards. I found tons of interest in Python at FOSS4G and lots of good ideas about how to use it.

I confess that I didn't pay a lot of attention to the talk schedule before the conference. My summer was kind of nuts and I don't subscribe to any OSGeo lists. When I did look closely I was surprised to find that many people were giving two talks and some three. If any woman or first-timer didn't get a chance to speak while some dude got three (and the multiple talkers were all men and long time attendees as far as I can tell) – that's a bug in the talk selection that needs to be fixed before the next edition.

Lastly, I think the views of Mount Hood you get when flying in and out of PDX to destinations south and east are worth the airfare all by themselves.

https://farm6.staticflickr.com/5587/15249959145_91e47b3444_c_d.jpg

Back from FOSS4G

In my experience, FOSS4G was tons of fun and very well run. Chapeau to the organizing team! I hope other attendees got as much out of the conference as I did. Not only did I get to catch up with people I met at the dawn of FOSS4G, I met great people I'd only known from Twitter and made entirely new acquaintances. I even got to speak a bit of French.

My talk was one of the first in the general sessions. I had fun presenting and am told that I did a good job. My slides are published at http://sgillies.github.io/foss4g-2014-fiona-rasterio/ and you can fork them from GitHub. According to the information at the FOSS4G Live Stream page all the talks will be available online soon. I missed plenty that I'm looking forward to seeing on my computer. Out of the ones I attended, I particularly recommend seeing the following:

  • "Using OpenStreetMap Infrastructure to Collect Data for our National Parks" by James McAndrew, National Park Service

  • "Managing public data on GitHub: Pay no attention to that git behind the curtain" by Landon Reed, Atlanta Regional Commission

  • "Big (enough) data and strategies for distributed geoprocessing" by Robin Kraft, World Resources Institute

  • "An Automated, Open Source Pipeline for Mass Production of 2 m/px DEMs from Commercial Stereo Imagery" by David Shean, University of Washington

Did the code of conduct work? I heard one speaker invoke images of barely competent moms – "so easy your mother can do it" – and was present for an unfortunate reference to hacking private photos at lunch time. I hope that was all of it.

If you attended FOSS4G or watched the live feed I encourage you to write about your experience and impressions. Come on, do it. It doesn't have to be long or comprehensive. Here are a few blog posts I've seen already:

Fiona and Rasterio releases

Like everyone else, I'm making releases before FOSS4G. Fiona 1.2 has a bunch of bug fixes and new features (contributed largely by René Buffat) and Rasterio 0.12 has new CLI commands and options. I'll be talking about these packages and their design and use first thing Wednesday morning (September 10) at FOSS4G. I've also got some things to say about Python programming and geographic data that are not specific to Fiona and Rasterio.

The big deal, however, will be the release of Shapely 1.4 on September 9. This is the first version with major new features since the project made the jump to Python 3. There will be quite a lot of new stuff in 1.4 including better interaction with IPython Notebooks, vectorized functions, an R-tree, and lots of speedups. It's been a group effort largely motivated by development of visualization and analytic frameworks: Cartopy and GeoPandas. Joshua Arnott and Jacob Wasserman in particular have been putting a lot of time into making Shapely better and faster over the past couple of weeks. If you're a Shapely user, please do something nice for these two the next time you see them.