Diving into Shapely 1.1

Update (2009-09-16)

We're changing the future release version for this work from 1.1 to 2.0 to make space for smaller scope changes in the interim. See http://sgillies.net/blog/934/updates-to-the-shapely-roadmap.

The code that will become Shapely 1.1 is mostly working, though by no means production ready, and is ready for testing by anyone interested in prepared geometries or the new plugins. Documentation and doctests are not up to date, but the unit tests are largely passing and cover most modules well. It hasn't been tested on Windows at all, but might work with some coaxing.

Coordinates are now stored primarily as Python arrays of floats. GEOS geometries are created as needed and cached, with both positive and negative effects on performance. Python access to coordinates is much more efficient than it was. Chaining of operations is less efficient because we're unnecessarily creating Python arrays for anonymous geometries. For example, the following code

p = Point(0.0, 0.0)
b = p.buffer(10.0)
x = list(b.exterior.coords)

benchmarks like this (in microseconds):

1.1a1:   873 usec/pass
1.0.12:  957 usec/pass

but this code, with operation chaining

p = Point(0.0, 0.0)
a = p.buffer(10.0).exterior.convex_hull.area

benchmarks like this:

1.1a1:  1931 usec/pass
1.0.12:  271 usec/pass
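
If you'd like to reproduce numbers of this kind, here is a minimal sketch using the standard library's timeit module. How the figures above were actually measured isn't stated here, so treat the helper names (usec_per_pass, shapely_cases) and the pass count as assumptions for illustration only.

```python
import timeit


def usec_per_pass(func, number=1000):
    """Return the mean time of one call to func, in microseconds."""
    total_seconds = timeit.timeit(func, number=number)
    return 1e6 * total_seconds / number


def shapely_cases():
    """Build the two snippets from this post; requires Shapely to be installed."""
    # Imported once here so the import cost is not counted in each pass.
    from shapely.geometry import Point

    def coords_access():
        p = Point(0.0, 0.0)
        b = p.buffer(10.0)
        return list(b.exterior.coords)

    def chained():
        p = Point(0.0, 0.0)
        return p.buffer(10.0).exterior.convex_hull.area

    return {"coords access": coords_access, "chained": chained}
```

Calling usec_per_pass(func) on each function returned by shapely_cases() and printing the result with "%.0f usec/pass" gives output in the same shape as the tables above. Absolute numbers will depend on the machine and on the GEOS version.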

I have some ideas about how to avoid loading Python arrays for anonymous geometries, which should address this issue.
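
To make the "created as needed and cached" idea concrete, here is a small sketch of the general pattern. It is not Shapely's actual code: LazyPoint, build_native, and fake_factory are hypothetical names, and a tuple stands in for the GEOS geometry.

```python
from array import array


class LazyPoint(object):
    """Hypothetical stand-in for a geometry; not Shapely's real class."""

    def __init__(self, x, y, build_native):
        self.coords = array("d", [x, y])   # coordinates live in a Python array
        self._build_native = build_native  # hypothetical factory for the native geometry
        self._native = None                # nothing is created until it is needed

    @property
    def native(self):
        # Build the native geometry on first access, then reuse the cached one.
        if self._native is None:
            self._native = self._build_native(self.coords)
        return self._native


calls = []


def fake_factory(coords):
    """Pretend to create a GEOS geometry and record that we were called."""
    calls.append(tuple(coords))
    return ("native-point", tuple(coords))


p = LazyPoint(1.0, 2.0, fake_factory)
```

Reading p.coords never touches the factory, which is the cheap path. The first access to p.native calls the factory once and later accesses reuse the result.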

Shapely 1.1 has a new architecture. Most users won't even notice, but geometry methods now call entry points provided by plugin packages. The default plugin package is shapely.geos, but you can switch to another via a function in shapely.implementation as needed.
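
The shape of that delegation can be sketched in a few lines. Everything named here (PlanarBackend, ManhattanBackend, use_backend, Pt) is hypothetical and chosen only for illustration, and the sketch uses a plain module-level registry rather than real setuptools entry points. The actual switch function in shapely.implementation is not shown.

```python
import math


class PlanarBackend(object):
    """Toy backend standing in for a plugin package such as shapely.geos."""

    def distance(self, a, b):
        return math.hypot(b[0] - a[0], b[1] - a[1])


class ManhattanBackend(object):
    """A second toy backend, to show that switching changes behavior."""

    def distance(self, a, b):
        return abs(b[0] - a[0]) + abs(b[1] - a[1])


# Registry holding the active backend (hypothetical, not Shapely's mechanism).
_registry = {"active": PlanarBackend()}


def use_backend(backend):
    """Hypothetical switch function: make backend the active implementation."""
    _registry["active"] = backend


class Pt(object):
    """Geometry whose public interface stays the same whatever the backend."""

    def __init__(self, x, y):
        self.xy = (x, y)

    def distance(self, other):
        # Delegate to whichever backend is active at call time.
        return _registry["active"].distance(self.xy, other.xy)
```

Because Pt.distance looks the backend up on every call, code using Pt does not change when use_backend() swaps the implementation, which is why most users would not notice the new architecture.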

There are a few known bugs and changes to look out for:

  • Adaptation of the geo interface is broken (no asShape()).
  • Heterogeneous geometry collections are broken.
  • The prepared geometry module shapely.prepared needs to be moved to shapely.geos.prepared (it is only available if you have GEOS 3.1 anyhow).
  • We're now expecting coordinates to come from Numpy as arrays of x's, y's, and z's, which means that Shapely integrates with Numpy in the same manner as matplotlib.
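
The last item is about data layout: one array per axis rather than a sequence of (x, y) tuples. A minimal sketch of that conversion follows, using the standard library's array module so it runs without extra dependencies. split_axes is a hypothetical helper name, not part of Shapely.

```python
from array import array


def split_axes(coords):
    """Turn [(x, y), ...] or [(x, y, z), ...] into one float array per axis."""
    # zip(*coords) transposes the sequence of tuples into per-axis tuples.
    return [array("d", axis) for axis in zip(*coords)]


xs, ys = split_axes([(0.0, 0.0), (1.0, 2.0), (3.0, 4.0)])
```

With Numpy installed, numpy.asarray(coords).T yields the same per-axis rows. This is also the layout matplotlib's plot(xs, ys) takes.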

To dive in and try it out (with GEOS 3.0+), I suggest you make a fresh Python 2.5 virtualenv and easy_install the shapely.geos package:

$ virtualenv --no-site-packages try-shapely
$ cd try-shapely
$ source bin/activate
(try-shapely)$ easy_install -i http://gispython.org/dist/index shapely.geos

Shapely is a dependency of shapely.geos, so it is fetched automatically. If you'd like to hack on either package (patches welcome!), check out the code and install both into the virtualenv in development mode:

(try-shapely)$ mkdir src; cd src
(try-shapely)$ svn co http://svn.gispython.org/svn/gispy/Shapely/trunk Shapely
(try-shapely)$ cd Shapely
(try-shapely)$ python setup.py develop
(try-shapely)$ cd ..
(try-shapely)$ svn co http://svn.gispython.org/svn/gispy/shapely.geos/trunk shapely-geos
(try-shapely)$ cd shapely-geos
(try-shapely)$ python setup.py develop
(try-shapely)$ cd ../..

Despite all the changes inside, the interface (with a few exceptions) is the same as 1.0:

(try-shapely)$ python
Python 2.5.2 (r252:60911, Dec 23 2008, 09:29:43)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from shapely.geometry import Point
>>> p = Point(0.0, 0.0)
>>> p.wkt
'POINT (0.0000000000000000 0.0000000000000000)'