2010 (old posts, page 7)

In Rtree news

This morning Howard Butler announced an upcoming Spatialindex 1.6.0 release: http://lists.gispython.org/pipermail/spatialindex/2010-September/000232.html, to be followed by an Rtree release. I'm becoming more interested in seeing libspatialindex currently packaged for Linux distributions, particularly Ubuntu. Lucid has 1.4.0. I wonder if I can parlay co-working with a Launchpad developer into a fast track for 1.6.

After a long time away I'm using Rtree (0.6) in anger this week. I love what Howard and Brent Pederson have done with the code and docs. Try out

>>> from rtree import Rtree
>>> help(Rtree)

and you'll see what I mean. I've also uploaded new HTML docs (by Howard, using Sphinx). Let us know if you find any errors or unclear directions.

My own Rtree-using project is a simple, batching index for GeoJSON feature-like objects. Deserialized GeoJSON goes in, deserialized GeoJSON comes out. The interface is shown by this base class [indexing/__init__.py#L45]:

class BaseIndex(object):
    """Base class for indexes

    Deals in GeoJSON-like dicts we will call ``items``, like:

      {'id': 'db441f41-ec34-40fb-9f62-28b5a9f2f0e5',    # required
       'bbox': (0.0, 0.0, 1.0, 1.0),                    # recommended
       'geometry':                                      # required
         {'type': 'LineString', 'coordinates': ((0.0, 0.0), (1.0, 1.0))},
       'properties': { ... }                            # optional
       ... }

    """
    def intid(self, item):
        """Return a unique integer id for the item"""
        raise NotImplementedError
    def bbox(self, item):
        """Return a (minx, miny, maxx, maxy) tuple for the item"""
        return bbox(item)
    def intersection(self, bbox):
        """Return an iterator over items that intersect with the bbox"""
        raise NotImplementedError
    def nearest(self, bbox, limit=0):
        """Return an iterator over the nearest N=limit items to the bbox"""
        raise NotImplementedError
    def index_item(self, itemid, bbox, item):
        """Index item using unique integer itemid and bbox as key"""
        raise NotImplementedError
    def unindex_item(self, itemid, bbox):
        """Unindex the item with (itemid, bbox) as key"""
        raise NotImplementedError

Some of you (2 people, tops) will recognize this from zope.catalog. My first implementation is an index that uses an instance of Rtree as a forward mapping and an instance of OOBTree.OOBTree (from the ZODB package) as a backward mapping (item values and id/bbox keys) [vrtree/__init__.py#L10].

class VRtreeIndex(BaseIndex):
    """An index with an R-tree as the forward mapping and a B-tree as the
    backward mapping

    Modeled after the example at http://docs.zope.org/zope.catalog/index.html.
    """

    def clear(self):
        self.fwd = Rtree()
        self.bwd = OOBTree.OOBTree()
    def __init__(self):
        self.clear()
    def intid(self, item):
        """Rtree requires unsigned long ids. Python's hash() yields signed
        ints, so we add 2**32 to hashed values for the former."""
        return hash(item['id']) + OFFSET
    def intersection(self, bbox):
        """Return an iterator over Items that intersect with the bbox"""
        for hit in self.fwd.intersection(bbox, True):
            yield self.bwd[(hit.id, tuple(hit.bbox))]
    def nearest(self, bbox, limit=1):
        """Return an iterator over the nearest N=limit Items to the bbox"""
        for hit in self.fwd.nearest(bbox, limit, True):
            yield self.bwd[(hit.id, tuple(hit.bbox))]
    def index_item(self, itemid, bbox, item):
        """Add an Item to the index"""
        if (itemid, bbox) in self.bwd:
            self.unindex_item(itemid, bbox)
        self.fwd.add(itemid, bbox)
        self.bwd[(itemid, bbox)] = item
    def unindex_item(self, itemid, bbox):
        """Remove an Item from the index"""
        value = self.bwd.get((itemid, bbox))
        if value is None:
            return
        del self.bwd[(itemid, bbox)]
        self.fwd.delete(itemid, bbox)

I think this is the first time I've ever written code using a tuple as a mapping key, but it's a perfect fit for an R-tree that takes a tuple of integer id and bounding box as a key. This index will be given a persistence wrapper and fronted by a Tornado based service that accepts JSON data and dispatches to index methods. A "Spatial Solr", if you will, which I attempt to describe in the README.

That's not the only possible implementation. One that needed to run on App Engine (for example), where there's no C-dependent Rtree, might use the App Engine datastore for backward mappings, and load an instance of pyrtree.RTree from the former on startup; the mapping serving as persistence for an R-Tree that doesn't have its own persistence mechanism.

Naming projects gets harder and harder. "Vaytrou" is a verlan play on the verb trouver (to find). "Vétrou" would be more authentic, but GitHub refused.

Update (2010-09-17): Here's the Vaytrou server. Simple. Handlers to accept batches of items and perform queries are in the works.

Update (2010-09-17): Spatialindex 1.6.0 is now available from links at http://trac.gispython.org/spatialindex/wiki/Releases.

Comments

Re: In Rtree news

Author: René-Luc D'HONT

Now, I understand the 'Vaytrou' name!!! I initially think about 'way true'!!! I'd like to work with Tornado, if it's possible!

Re: In Rtree news

Author: Sean

Have you forked Vaytrou yet? It would be fun to work with you on it.

Re: In Rtree news

Author: whit

Nice to see the standalone http rtree index resurface! and yeah, I did recognize it as zope.catalog. Might take a stab at running it with gevent.

Shapely 1.2.4

Shapely 1.2.4 has been tagged (see http://github.com/sgillies/shapely/downloads). Sdists are available from http://gispython.org/dist and http://pypi.python.org/pypi/Shapely/1.2.4. This releases fixes bugs involved with loading the libgeos_c DLL/SO and generally improves the user experience in the case of not being able to load them, if you can believe that. If you try to import from Shapely without GEOS on your system, you get:

>>> from shapely.geometry import Point
Traceback (most recent call last):
...
OSError: Could not find library geos_c or load any of its variants

Also in 1.2.4, AttributeError is raised when you call a GEOS-backed method that isn't supported by your OS's geos_c version. In 1.2.3, you'd see IndexError if you called either of the linear interpolations methods with GEOS < 3.1. Exceptions are part of the API, too, and these changes make the API a bit more user-friendly.

Comments

No available geos

Author: Reinout van Rees

Not having geos available isn't so strange. I normally work on linux, but had to install a (geo) django website on windows a few weeks ago. In combination with oracle.

Problem: I couldn't find a geos .exe installer for windows. In the end I had to install postgres+postgis to get my hands on a geos dll ;-)

Re: Shapely 1.2.4

Author: Sean

I saw that you'd blogged about that here and I forwarded it along to people on Twitter. Shapely's Win32 installers include the GEOS DLL, so Windows users don't have this problem, but it's possible to install Shapely on *nix systems without the library. In the previous version, it would fail in that case, but with cryptic errors.

At the market

While the year-round market on the Boulevard des Arceaux in Montpellier is the best I've ever frequented, the Larimer County Farmers' Market is a pretty good, if short, substitute. Here are some sights you don't see at a market in the Languedoc:

http://farm5.static.flickr.com/4131/4975285539_cf78fc3051_z_d.jpg

French people who've tasted sweet corn love it, but it's just not grown in the south of France like it is in the USA's Midwest. This man here had corn, only corn. At Arceaux it was common to see a vendor of oysters only or exclusively honey, but vegetable sellers almost always had some range. An exception was aspargus: in season, you could bring nothing but boxes of asparagus to market and rake in the euros.

One of the sounds of the Fort Collins market in September is the roar of propane jets:

http://farm5.static.flickr.com/4130/4975896300_747b3cb7d1_z_d.jpg

The smell of freshly roasted green Anaheims, Poblanos, and Big Jims pervades the market atmosphere. I grab a couple bags every Saturday, peel them immediately, and toss them in the deep freeze for winter stews.

We like our melons big in the USA. Like the chiles, these are typical of Southeastern Colorado.

http://farm5.static.flickr.com/4103/4975280581_58bc3a7df1_z_d.jpg

Colorado watermelon tops French pasteque, but I prefer the smaller orange Lunel melons to the bloated Rocky Ford cantaloupes. These ripen later around here, which meant we left France at the end of melon season and arrived in Colorado at the beginning of melon season. It's been the same situation for peaches:

http://farm5.static.flickr.com/4085/4975287853_cb4d50bf6b_z_d.jpg

Peaches from Colorado's Western Slope are unrivaled.

Comments

Re: At the market

Author: Tyler

An interesting website for farmers markets is

RealTimeFarms

which allows venders to post lists and pictures of what they are selling. It also show connections between restaurants serving local foods and the farms that produce it. It was started in Ann Arbor and most of the current content is for the greater Detroit area, but the founder has told me that Colorado may be one of the next regions to expand to.

Why not GeoJSON?

I've seen people question ESRI's decision not to use GeoJSON in their server API spec (note: sometime after I wrote "If it's a #GIS spec, you can be sure there's a click-through agreement to read it" the click-through was thankfully eliminated). Since we in the GeoJSON working group haven't gotten around to actually standardizing the format through the IETF (for example, the body that published RFC 4627 on JSON), it's completely forgivable. The CC-BY license on the GeoJSON spec is clear on rights to use the work itself, but perhaps not clear enough to some about the rights to "GeoJSON technology". Myself, I don't consider GeoJSON a technology, but it's courts that decide, not me. One must also remember that ESRI's spec is the documentation of a web API that, while maybe not written before the GeoJSON working group started, shipped before we had a published GeoJSON spec and well before it got serious traction.

I do like a couple aspects of ESRI's JSON geometry representations. Geometry objects aren't primarily determined by their "type", but by their properties: an object with an "x" and a "y" is a point, an object with "paths" is a polyline, an object with "rings" is a polygon. We considered this "duck typing" approach when drafting the GeoJSON spec, but explicit typing won out instead. In addition, with the exception of point objects, the coordinates are represented in a way that's entirely compatible with a GeoJSON reader. It's a shame about the points, but otherwise there's a lot of common ground. ESRI JSON features look a lot like GeoJSON features, too.

I'm not sure what to say about the rest of the spec. It's awfully large and I get a flash of OOXML deja vu that may yet pass. It's certainly different. I'm looking forward to other reviews and analysis.

Browsing spatially referenced Djatoka images with OpenLayers

A while back my colleague Hugh Cayless wrote a Djatoka layer for use with OpenLayers. It is deployed at UNC's library, allowing you to view over 3000 North Carolina Maps. View the source on any map and you'll quickly get a sense of how it's rigged up. You pass a URL encoding requests for image regions and a URL encoding requests for image metadata to the OpenURL layer constructor, it figures out the grid and zoom levels for you, and you add the layer to a map that's configured using the layer's resolution, maximum extent, and tile size parameters. It works much like a TMS layer, but in pixel units.

I began with this code for a different project and modified it to satisfy some of my project's special needs:

  • The imagery of a physical object and its annotations will be shown in their natural units – centimeters.
  • Square tiles will be used instead of rectangular tiles with the same aspect ratio as the physical object (approximately 13:1).
  • Browser and image server are in different domains, complicating discovery of image metadata.

I decided to go with Djatoka instead of an open source GIS map server like GeoServer or MapServer because the project, while spatial in nature, doesn't require cartographic projections or spatial indexing; I was interested in trying something new; and NYU's already got one running (Thanks, Hugh!). I ended up modifying Hugh and Cliff's OpenURL.js considerably. Now, it's more like a hybrid between OpenLayers.Layer.Image and OpenLayers.Layer.WMS. The code is at http://github.com/sgillies/tpeutb/blob/master/OpenURL.js.

Unlike a GIS map server, Djatoka doesn't pad requests: it fails if you ask for regions outside the image. To have square pixels, I've had to pad my imagery a bit. It's now 43008 pixels wide by 3072 pixels high, exactly 14:1. This makes the right tile size 192 x 192 pixels. As things stand, my new OpenURL layer can't be used in the same map as Hugh's, so I haven't bothered to rename it yet. It's used like this:

var RES = 1.0/60.3; // Centimeters per pixel
var size = new OpenLayers.Size(43008, 3072);
var origin = new OpenLayers.LonLat(-16.6222, -1.2174);
var extent = new OpenLayers.Bounds(
    origin.lon, origin.lat, origin.lon + size.w*RES, origin.lat + size.h*RES);

// Zoom factors of 1/8 to 4
var resolutions = [8.0*RES, 4.0*RES, 2.0*RES, 1.0*RES, 0.5*RES, 0.25*RES];

// At zoom level 0, the image is covered by 2 x 28 tiles
var tileSize = new OpenLayers.Size(192, 192);

var mosaic = new OpenLayers.Layer.OpenURL(
  'All section, mosaicked',
  'http://dl-img.home.nyu.edu/',
  'http://pipsqueak.atlantides.org/mosaic.tif', // for example, 404 in reality
  extent,
  size,
  { format: 'image/jpeg',
    units: 'cm',
    tileSize: tileSize,
    resolutions: resolutions }
  );

var map = new OpenLayers.Map(
  'map',
  { units: 'cm',
    tileSize: tileSize,
    resolutions: resolutions }
  );

A very simple application of this layer can be seen at http://pipsqueak.atlantides.org/tpeutv/djatoka.html.

Shapely 1.2.3

Shapely's had a shape() function that makes geometric objects from GeoJSON-like mappings for some time. Now, by popular request, is the inverse: mapping():

>>> from shapely.geometry import Point, mapping
>>> m = mapping(Point(0, 0))
>>> m['type']
'Point'
>>> m['coordinates']
(0.0, 0.0)

The new function operates on any object that provides __geo_interface__ [link]. This release also fixes a major bug involving GEOS versions < 3.1 introduced in 1.2.2. You can download an sdist from http://pypi.python.org/pypi/Shapely/. Windows installers should be coming soon.

As I like to remind people every few releases, major portions of this work were supported by a grant (for Pleiades) from the U.S. National Endowment for the Humanities (http://www.neh.gov).

Update (2010-08-22): Win32 and Win-AMD64 installers have been uploaded to the locations above.

There and back again

We're back in Fort Fun. After an initial snafu the travel gods smiled on us and the flights and connections went more smoothly than we had expected. There was a bit of airsickness and our poussette went missing, but we got some sleep on the long CDG to ATL leg, Customs didn't balk at the number of wine bottles we brought in our luggage, and we arrived not too burned out to celebrate with take-out pizza and a St. Lupulin. The kids even slept until five o'clock this morning. I'm declaring it a win.

The help and hospitality of many people made our séjour an amazing experience. Thank you, everyone, and particularly our nounou, Célia Mourrut; our hosts and friends at CBGP, Denis Bourguet, Arnaud Estoup, Benoît Facon, Renaud Vitalis, and Miguel Navascues; our friend across the street at the USDA, René Sforza; in Heidelberg the Epigraphische Datenbank Heidelberg team: Francisca Feraudi-Gruénais, Brigitte Graef, and James Cowey; in Montpellier, René-Luc D'Hont (3LIZ) and Gerald Fenoy (Zoo Project), and Vincent Heurteaux, Adrian Custer, and Martin Deruisseaux from Geomatys; in Torino, Giorgio Borelli and Silvio Tomatis (collective.geo and more), and Stefano Costa (Open Archaeology); in Isère (the Department of), Eric Lemoine. I'm looking forward to returning the favor some day.

Next steps are making some fresh salsa (I could never find ripe tomatoes, onion, cilantro, and chiles simultaneously at Montpellier's Arceaux market), unpacking, and getting some fall season vegetables planted.

Bye bye

Waiting at the Louis Blanc tram stop after probably our last family dinner in the centre ville – at Le Roule ma Poule in the relatively sleepy north side of the Ecusson – we saw a guy using the tramway as bike lane.

http://farm5.static.flickr.com/4084/4831955732_c82c97d439_z_d.jpg

If you keep an eye on the board at every stop and time it right, you can roll along unimpeded.

http://farm5.static.flickr.com/4109/4831955746_74941e032a_z_d.jpg

Into the sunset.

http://farm5.static.flickr.com/4084/4831955764_6aee696e5f_z_d.jpg

Au revoir, Montpellier.

Comments

Re: Bye bye

Author: ReLuc

Bye! Good trip to country home and next time in USA ;-)

Shapely recipes

I'm committed to keeping the Shapely API small. If it were a Swiss Army knife, it would be a slim one that fits comfortably in your hand and doesn't require you to wear a belt to keep your pants up when you pocket it. This means saying no to features. I recently read a person's opinion that Shapely should support SRIDs – like PostGIS does, I assume the reasoning to be. The GEOS library does in fact implement spatial reference system identification numbers for geometric objects, but I'm leaving this out of Shapely. Such identification numbers are relevant only to OGC standard (SFSQL) databases. On the web we identify coordinate systems by URIs. In a Python application, a spatial reference system should be a Python object not an integer pointing to some object outside the environment. A premise of Shapely is that if your application really needs to deal with transformations between coordinate systems, you'll know best how to add back this missing complexity. What's actually missing from Shapely is not the SRID concept, but a recipe or two for how to add in that concept.

Another requested feature left out is the splitting of lines. Inclusion of GEOS's linear referencing in the Shapely core barely made the cut (I think it could have been a good example of an extension), but makes it possible to implement line cutting algorithms in not very many lines of code. A naive one is shown below.

from shapely.geometry import LineString, Point

def cut(line, distance):
    # Cuts a line in two at a distance from its starting point
    if distance <= 0.0 or distance >= line.length:
        return [LineString(line)]
    coords = list(line.coords)
    for i, p in enumerate(coords):
        pd = line.project(Point(p))
        if pd == distance:
            return [
                LineString(coords[:i+1]),
                LineString(coords[i:])]
        if pd > distance:
            cp = line.interpolate(distance)
            return [
                LineString(coords[:i] + [(cp.x, cp.y)]),
                LineString([(cp.x, cp.y)] + coords[i:])]

The result (see cut.py):

http://sgillies.net/images/cut.png

This figure was made with matplotlib, of course. Control points of the line are shown as circles, and the boundary points are displayed in black.

I can imagine that different users and different applications will want to chop up lines in grossly or subtly different ways. The above returns a list of LineString, but another application might want a MultiLineString or an iterator that nibbles at the input while reading from another iterator of bite sizes. Who knows? Cutting up a line simply isn't a primitive operation. A recipe for doing this type of operation using Shapely seems better to me than including one implementation in the core.

Geometric object constructors that validate their input have been proposed from time to time. Validation is particular and expensive and better (in my opinion) left to a developer. Maybe your input has already been validated; maybe objects deemed "invalid" by JTS or GEOS are useful to you in some way. Shapely's constructors don't validate and aren't cluttered with validate=True arguments or the consequent compound if/else statement. Disappointed? The good news is that it's easy to add back the complexity of validation. The Shapely manual includes a recipe for a validating decorator.

from functools import wraps

from shapely.geos import TopologicalError

def validate(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        ob = func(*args, **kwargs)
        if not ob.is_valid:
            raise TopologicalError(
                "Given arguments do not determine a valid geometric object")
        return ob
    return wrapper

The decorator could be used like so:

>>> @validate
... def ring(coordinates):
...     return LinearRing(coordinates)
...
>>> coords = [(0, 0), (1, 1), (1, -1), (0, 1)]
>>> ring(coords)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in wrapper
shapely.geos.TopologicalError: Given arguments do not determine a valid geometric object

Not everyone is fond of decorators and might want to do this differently. Subclass the geometry types, write some Zope-style adapters, or monkey-patch asShape – it's your call. If you will need validation, you will know (or discover) best how to exploit Shapely to implement it.

By keeping Shapely light and simple I expect we can keep it easy to use and extend. Indulge me in a food analogy, if you will. Shapely isn't ketchup, it's the ripe tomato: raw, but more useful in a wider range of recipes.

HTTP FTW

Waouh (scroll down, no internal anchors), this is a big deal to me:

  • OGC TC directs the OGC-NA that all new OGC identifiers issued for persistent public OGC resources shall be http URIs, instead of URNs
  • New standards and new major versions of existing standards shall use http URIs for persistent public OGC resources to replace OGC URN identifiers defined in previous standards and versions, unless OGC-NA approves an exception

Use of HTTP URIs is axiomatic for "linked data". This begins to make OGC stuff relevant to the web.

Update (2010-07-08): More on Allan Doyle's blog.