Now, thar be first-class privateerin'. Cap'n Perry dug up the secret of doublin' ye firepower. Matey, if ye not be comin' to Victoria, send me the address of ye home port, and I be rowin' o'er to the Mothership tonight to fetch ye booty.
Building and installing the GDAL/OGR Python bindings can be a bit of a pain in the neck. It's complex software with a lot of associated data. At the Lab, we're using zc.buildout (see pcl.buildout for a specific example) to create replicable, version-controlled Python environments that include GDAL/OGR (as well as MapServer and PostgreSQL + PostGIS).
Another new option for deploying intricate environments is Ian Bicking's virtualenv. From the readme:
virtualenv is a tool to create isolated Python environments.
The basic problem being addressed is one of dependencies and versions, and indirectly permissions. Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications? If you install everything into /usr/lib/python2.4/site-packages (or whatever your platform's standard location is), it's easy to end up in a situation where you unintentionally upgrade an application that shouldn't be upgraded.
Or more generally, what if you want to install an application and leave it be? If an application works, any change in its libraries or the versions of those libraries can break the application.
Also, what if you can't install packages into the global site-packages directory? For instance, on a shared host.
In all these cases, virtualenv can help you. It creates an environment that has its own installation directories, that doesn't share libraries with other virtualenv environments (and optionally doesn't use the globally installed libraries either).
Virtualenv covers the same ground as the GDAL project's FWTools, but is more programmable and customizable. It integrates with setuptools, and so getting a fresh Python GIS environment from the package index could be just as easy as:
$ sudo easy_install gdal-env
Of course, someone would have to actually create and upload that gdal-env package. Bicking advises us to switch from workingenv to virtualenv.
I've written to the geo-web-rest group about CouchDB, a distributed, non-relational database, and now Sam Ruby has a Python prototype. Basura uses some technologies I like to blog about, namely JSON and WSGI.
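The JSON + WSGI combination is about as small as a web service gets. As a rough sketch (not Basura's actual code; the app and helper names here are illustrative), a document store front-end can be a single WSGI callable returning a JSON body:

```python
import json

def app(environ, start_response):
    # A minimal WSGI application serving a JSON document, the kind of
    # payload a CouchDB-style store traffics in.
    body = json.dumps({"hello": "world"}).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]

def call(app, environ):
    # Exercise the app without a server, using a bare-bones environ.
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers
    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)

status, body = call(app, {"REQUEST_METHOD": "GET", "PATH_INFO": "/"})
```

Any WSGI server (or another WSGI middleware) can host such a callable unchanged, which is the whole appeal.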
This is more or less the final content and format of the Shapely 1.0 manual. I struggled a bit with the choice of format and writer to use. If it had any math I'd probably have reverted to LaTeX, but I'm dodging the math with links to the Java Topology Suite docs and evasive statements like "just go ask Martin Dr J(ts) Davis". OOWriter's HTML export didn't cut it, and it had little support for Python code syntax highlighting. Proprietary software or formats were out of the question. Finally I settled on docutils, reStructuredText, Pygments, and the Pygments code-blocking version of rst2html in the docutils sandbox (also included in the current Trac). The reStructuredText source is at manual.txt. It works well and looks sharp thanks to a stylesheet from Dave Kuhlman's odtwriter page.
If you're using Shapely, I'd appreciate your feedback on the manual. It's the last ticket on the Shapely 1.0 milestone.
Many of the initial misgivings about applying Atompub to geospatial problems had to do with uncertainty about totality and partiality of feeds. RFC 5005, which standardizes paged, archived, and complete feeds, should settle much of that uncertainty.
I've updated Mush to use my feedparser.py enhancements and Shapely 1.0a3. Now it will parse GeoRSS GML, Simple, and W3C geometries of all types (points, lines, polygons) from source feeds. For example, here are the last 10 entries from Christopher Schmidt's FeatureServer demo, pulled through the self-intersection processing resource: feed, map.
Please note that, in the interest of conserving resources and minimizing response times, I've limited the number of entries that Mush will read from any feed to 42.
This work reminds me to comment on Andrew Turner's recent post on security issues around feed aggregation. He writes:
The onus of security is on the application or aggregator that pulled the feed on behalf of the authorized user. But at the same time once the feed has been retrieved, there is no storage of the authorization credentials with the feed itself. It has essentially been stripped of its shell of potential privacy and looking at the feed itself you would have no idea if it was supposed to be kept private, and visible only to certain, unknown persons.
What would be nice would be a mechanism to store at least references to permissions and authorization credentials within the feed itself. That way if an application still has the feed, or wishes to store it and re-aggregate it, they can apply the same authorization as the feed originally had.
There's another big issue that Andrew doesn't mention (discussed by Richardson and Ruby in chapter 6 of "RESTful Web Services"): how does the aggregator pass along the user's credentials without caching them and risking their theft? Mush doesn't intend to solve this problem at all. I think the onus of privacy remains largely on the original content provider. If you want to make a feed for authorized content, you should strip that feed down to the bare minimum and provide https hrefs to the content itself. If the feed metadata must also remain private, you can encrypt specific elements or even the entire feed.
Finally, feeds should be cached for no more than the duration specified by their origin servers. A feed is just a representation of entities that "live" on the Web, and applications should be pulling new representations from the web rather than relying on silos. Storing feeds indefinitely -- treating GeoRSS like shapefiles -- breaks the Web.
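The cache-for-no-longer-than-the-server-allows rule can be made concrete in a few lines. This is a toy sketch, not part of Mush; in practice the lifetime would come from the origin server's Cache-Control max-age (or Expires) response header:

```python
import time

class FeedCache:
    """Hold feed documents no longer than the origin server allows."""

    def __init__(self):
        self._store = {}  # url -> (body, expiration timestamp)

    def put(self, url, body, max_age):
        # max_age (seconds) would normally be read from Cache-Control.
        self._store[url] = (body, time.time() + max_age)

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        if time.time() >= expires_at:
            # Stale: a fresh representation must be pulled from the Web.
            del self._store[url]
            return None
        return body

cache = FeedCache()
cache.put("http://example.com/feed", "<feed/>", max_age=300)
```

Anything more permanent than this, and the aggregator has become a silo.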
Rebranding Atompub as "Federated Geo-synchronization Services" does nothing for me, but at least it is now on the map, so to speak. Some of the more marketing-minded OGC folks on the geo-web-rest group had me worried that the consortium saw REST as just an 80% solution or an architecture for dummies, but "Federated Geo-synchronization" sounds like something that goes beyond simple mashups.
See http://code.google.com/p/feedparser/issues/detail?id=62. In a nutshell, given feeds like:
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss"
      >
  <entry>
    <georss:point>36.9382 31.1732</georss:point>
  </entry>
</feed>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss"
      xmlns:gml="http://www.opengis.net/gml"
      >
  <entry>
    <georss:where>
      <gml:Point>
        <gml:pos>36.9382 31.1732</gml:pos>
      </gml:Point>
    </georss:where>
  </entry>
</feed>
the entry location, or "where", is parsed out like:
>>> import feedparser
>>> feed = feedparser.parse(file)
>>> entry = feed.entries[0]
>>> entry['where']['type']
'Point'
>>> entry['where']['coordinates']
(31.1732, 36.9382)
Recognize that? It's GeoJSON. Simple points, lines, polygons, boxes, and GML points, linestrings, and polygons can be parsed. Since entry["where"] also provides the Python Geo Interface, you can use it immediately with Shapely:
>>> from shapely.geometry import asShape
>>> shape = asShape(entry["where"])
>>> shape
<shapely.geometry.point.Point object at ...>
>>> shape.x
31.1732
>>> shape.y
36.9382
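Under the hood, the Python Geo Interface is just a property that returns a GeoJSON-like mapping, which is why an adapter like asShape can consume entry["where"] directly. A toy sketch of the protocol (ToyPoint and as_geojson are illustrative names, not part of Shapely or feedparser):

```python
class ToyPoint:
    """A minimal provider of the Python Geo Interface."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    @property
    def __geo_interface__(self):
        # A GeoJSON-like mapping: type name plus coordinates.
        return {"type": "Point", "coordinates": (self.x, self.y)}

def as_geojson(obj):
    """Return the GeoJSON-like mapping of any geo interface provider."""
    return dict(obj.__geo_interface__)

p = ToyPoint(31.1732, 36.9382)
```

Any consumer that understands the protocol can accept a ToyPoint, a feedparser "where", or a Shapely geometry interchangeably, with no import dependencies between the libraries.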
Are there any users out there? I'm moving to Montpellier next year and would like to make some contacts in the region.
Shapely is a thin wrapper for libgeos_c. How thin? With help from the 2to3 tool, I ported Shapely to Python 3.0 in less than an hour -- that's how thin. Since distutils and setuptools aren't quite ready to go, you will have to get the code from the lab repository, add it to your path, and import it in place if you want to try it out:
$ svn co http://svn.gispython.org/svn/gispy/Shapely/branches/py3k spy3k
$ PYTHONPATH=spy3k /usr/local/py3k/bin/python
Python 3.0a1 (py3k, Aug 31 2007, 17:11:43)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from shapely.geometry import Point
>>> p = Point(0.0, 0.0)
>>> p.wkt
'POINT (0.0000000000000000 0.0000000000000000)'
>>> p.buffer(10.0)
<shapely.geometry.polygon.Polygon object at 0xb7ccddec>
>>> p.buffer(10.0).area
313.65484905463717
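The reason the port was so quick is the ctypes pattern: declare a C function's signature once, and the Python-side wrapper is a one-liner. A minimal sketch of that pattern, with the standard math library standing in for libgeos_c (assumption: a Unix-like system with a shared libm; the GEOS calls themselves differ):

```python
from ctypes import CDLL, c_double
from ctypes.util import find_library

# Load a shared C library; Shapely does the same with libgeos_c.
libm = CDLL(find_library("m"))

# Declare argument and return types once...
libm.sqrt.argtypes = [c_double]
libm.sqrt.restype = c_double

def py_sqrt(x):
    """...and the Python wrapper is little more than a passthrough."""
    return libm.sqrt(x)
```

Because almost no logic lives on the Python side, porting mostly means updating the surrounding conveniences, which is what 2to3 handles.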
Yesterday we uploaded Barrington Atlas places from Cyrene (map 38, the region now known as Libya) to Pleiades. The master collection KML (http://www.unc.edu/awmc/pleiades/data/kml/places-all-2007-09-21.kml) now contains over 550 places and is pushing the bounds of manageability. The index that will support fast spatial queries of these places is on the workbench right in front of me. After it's finished, we'll be able to leverage the updating features of Google Earth. In the meantime, we have a number of sub-collections, by place type and time period: bridges, temples, ..., archaic, classical, etc.