2008

How to decorate Python GIS code

Last month I blogged about Python logging and how to avoid using print statements in geoprocessing code. But your crufty old code isn't going to rewrite itself, and you're overworked already. An efficient fix would be ideal, and I've got one: it only requires a little time spent learning how to use Python decorators.

Say you have a module and function that does some geoprocessing work and prints various messages along the way. Something like this:

def work():
    print "Starting some work."
    print "Doing some work ..."
    print "Finished the work ..."

if __name__ == "__main__":
    work()

which, when run, produces output in your terminal.

$ python work.py
Starting some work.
Doing some work ...
Finished the work ...

Now, your function is much more gnarly than work(), and rewriting it will only sap your goodwill toward its author. You'd think it would be possible to somehow wrap the work() function, catching those print statements and redirecting them to a logger – while not breaking code that calls work() – all in a reusable fashion. And it is possible, using a decorator like the 'logprints' class in the code below:

import logging
from StringIO import StringIO
import sys

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s %(message)s',
    filename='work.log',
    filemode='w'
    )

class logprints(object):

    def __init__(self, func):
        # Called when function is decorated
        self.func = func

    def __call__(self, *args, **kwargs):
        # Called when decorated function is called

        # save reference to stdout
        saved = sys.stdout

        # make a string buffer and redirect stdout
        net = StringIO()
        sys.stdout = net

        # call original function, restoring stdout even if it raises
        try:
            retval = self.func(*args, **kwargs)
        finally:
            sys.stdout = saved

        # read captured lines and log them
        net.seek(0)
        for line in net.readlines():
            logging.info(line.rstrip())

        # return original function's return value(s)
        return retval

@logprints
def work():
    print "Starting some work."
    print "Doing some work ..."
    print "Finished the work ..."

if __name__ == "__main__":
    work()

The line "@logprints" is interpreted as "decorate the immediately following function with the 'logprints' class." When this module is imported or run, the method logprints.__init__() is called with the function 'work' as its sole argument, and the name 'work' is rebound to the resulting logprints instance. Afterwards, when work() is called, logprints.__call__() is what actually runs. That method acts as a proxy for the original, now decorated, function. Here is the print capturing and logging decorator in action:

$ python work2.py
$ cat work.log
2008-12-30 12:14:44,044 INFO Starting some work.
2008-12-30 12:14:44,044 INFO Doing some work ...
2008-12-30 12:14:44,044 INFO Finished the work ...

Yes, you could have redirected the output of the original script in the terminal, but remember that Python's logging module sets you up to do much more.
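
For readers on Python 3, here is a minimal sketch of the same print-capturing idea. It is not the code from this post: it swaps the class for a plain function decorator, uses contextlib.redirect_stdout from the standard library to handle the save-and-restore step, and logs to a named logger ("work" is just a name picked for illustration). The return value of 42 is likewise made up to show that results pass through untouched.

```python
import contextlib
import functools
import io
import logging

# Assumption: a logger named "work"; any logger name would do.
logger = logging.getLogger("work")


def logprints(func):
    """Capture anything func prints and send it to the logger instead."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        buffer = io.StringIO()
        try:
            # redirect_stdout puts sys.stdout back even if func raises
            with contextlib.redirect_stdout(buffer):
                return func(*args, **kwargs)
        finally:
            # log whatever was captured, one record per printed line
            for line in buffer.getvalue().splitlines():
                logger.info(line.rstrip())

    return wrapper


@logprints
def work():
    print("Starting some work.")
    print("Doing some work ...")
    print("Finished the work ...")
    return 42  # illustrative return value, passed through unchanged


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
    work()
```

Because the logging happens in a finally clause, lines printed before an exception still reach the log, which tends to be exactly when you want them.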

I've recently learned how to use parameterized decorators by following the examples in Bruce Eckel's article. I'm using one to deprecate functions in Shapely:

import warnings

class deprecated(object):

    """Mark a function deprecated.
    """

    def __init__(self, version="'unknown'"):
        self.version = version
        self.msg_tmpl = "Call to deprecated function '%s', to be removed in version %s"

    def __call__(self, func):
        def wrapping(*args, **kwargs):
            warnings.warn(self.msg_tmpl % (func.__name__, self.version),
                          DeprecationWarning,
                          stacklevel=2
                          )

            return func(*args, **kwargs)
        wrapping.__name__ = func.__name__
        wrapping.__doc__ = func.__doc__
        wrapping.__dict__.update(func.__dict__)
        return wrapping

Marking a function deprecated like:

>>> from shapely.deprecation import deprecated
>>> @deprecated(version="1.1")
... def foo():
...     return None
...

causes a warning to be emitted when the function is called:

>>> foo()
/Users/seang/code/gispy-lab/bin/labpy:1:
DeprecationWarning: Call to deprecated function 'foo',
to be removed in version 1.1

Deprecation-marking decorators are a great solution (which I first saw used, in a different form, in Zope 3). Why would you want to rewrite a function that's going away in the next software version?
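
The same parameterized decorator can be written more compactly with nested functions and functools.wraps, which copies the wrapped function's name and docstring and updates its __dict__ so that I don't have to do it by hand. What follows is a sketch for Python 3, not the code that ships in Shapely; the function foo and its docstring are placeholders.

```python
import functools
import warnings


def deprecated(version="unknown"):
    """Return a decorator that warns whenever the wrapped function is called."""

    def decorator(func):
        @functools.wraps(func)  # copies __name__ and __doc__, updates __dict__
        def wrapper(*args, **kwargs):
            warnings.warn(
                "Call to deprecated function '%s', to be removed in version %s"
                % (func.__name__, version),
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(version="1.1")
def foo():
    """Do nothing, on purpose."""
    return None
```

The outer call, deprecated(version="1.1"), runs once at decoration time and returns the real decorator; only the innermost wrapper runs on each call to foo().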

Decorators can also be chained. In Shapely I've factored the check for non-nullness of GEOS geometries into a decorator and chain it with the built-in property decorator:

@property
@exceptNull
def geoms(self):
    return GeometrySequence(self, LineString)

To this effect:

>>> from shapely.geometry import MultiPoint
>>> m = MultiPoint()
>>> m.geoms
Traceback (most recent call last):
...
ValueError: Null geometry supports no operations

The exception is raised by the 'exceptNull' decorator.
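
I haven't shown the body of 'exceptNull' here, so the following is only a guess at the shape such a decorator could take, written for Python 3. The names except_null and ToyMultiPoint, and the convention that a falsy _geom attribute means "null geometry", are all assumptions made for this illustration and not Shapely's implementation.

```python
import functools


def except_null(func):
    """Raise ValueError if the object has no underlying geometry."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Assumption: a falsy `_geom` attribute means "null geometry".
        if not getattr(self, "_geom", None):
            raise ValueError("Null geometry supports no operations")
        return func(self, *args, **kwargs)

    return wrapper


class ToyMultiPoint:
    """Hypothetical stand-in for a geometry class; not Shapely's."""

    def __init__(self, coords=None):
        self._geom = list(coords) if coords else None

    @property     # applied second: exposes the checked function as an attribute
    @except_null  # applied first: wraps the plain function with the null check
    def geoms(self):
        return list(self._geom)
```

Chained decorators are applied bottom-up, so the order matters: property has to be outermost, because a property object is not callable and could not itself be wrapped by except_null.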

Not much specifically about GIS here, I'll admit, but GIS programming in Python is, or should be, just Python programming. Feel free to comment if you see any interesting applications of decorators.

Comments

Re: How to decorate Python GIS code

Author: brentp

nice logging decorator. i believe it's also good form to use the decorator module: http://pypi.python.org/pypi/decorator to reduce boiler-plate. and it gives you decorator.decorator to decorate your decorators.

ESRI users discover setuptools and easy_install

My work is done. Or, at least, the part of my work not involved with deprogramming OGC web services cult members. And the part of my work not involved with tooting my own horn. For example, check out this blog post from 2005 (2005!) on emailing Python script errors. Prescient, huh? Too bad I didn't write "or send a message not exceeding 140 characters -- a 'tweet', so to speak -- to your 'followers'." instead of "or ping your enterprise's paging system." If I had done that, you'd never hear the end of it.

I can has Python and GIS environments?

I've spent this short week tuning up my new laptop's development environment, and a side effect of this work is a new build system for replicable, isolated Python, GIS, and image/raster processing environments. Ichpage replaces Gdawg on my machine. It supplies:

  • GDAL (osgeo.gdal, etc)
  • geojson
  • geopy
  • keytree
  • lxml
  • Numpy
  • PIL
  • PyProj
  • Rtree
  • Shapely

and their various library dependencies (libgdal, libgeos_c, libspatialindex, libxml2, libxslt). To get started, clone or get the tarball, cd into the directory, and execute:

$ virtualenv .
$ source ./bin/activate
(ichpage)$ python bootstrap.py
(ichpage)$ buildout
(ichpage)$ . ./setenv
(ichpage)$ labpy
>>> from osgeo import gdal
>>> from shapely.geometry import Point
...

To-do items include linking the GDAL utilities into the environment's bin directory, adding WorldMill, and perhaps adding matplotlib. For now, it's a way for me to manage C libs while I develop Shapely and Rtree, and it's perhaps useful to other geospatial Python developers.

Comments

Re: I can has Python and GIS environments?

Author: Kurt

keytree?

Re: I can has Python and GIS environments?

Author: Sean

Keytree is a little KML helper for use with Python ElementTree APIs.

Preserving first-generation web/GIS projects

Check out this interesting article about the reanimation of an orphaned plant database and its associated ArcIMS instance. The analysis of the issues is sound. I disagree, of course, with their conclusion that ArcIMS is something worth learning and deploying in 2008, and this raises in my mind another issue that the authors did not identify: is not the project's data and its provenance the thing that is most important to preserve? Must the interface cruft around it be preserved in anything other than an archived form, if at all? The ArcIMS user interface and the species database browser are no kind of programmable web APIs; it's unlikely any other application would be broken by a switch to some free web mapping framework or modern search interface.

Now there's a question: switch to what? If this story is just the beginning, and bigger boxes of used, discarded, but potentially useful first-generation web/GIS projects end up in the laps of librarians, a turnkey (and open source, naturally) ArcIMS to MapServer/MapGuide migration tool might be a handy thing. I wouldn't be surprised if such a thing existed. Its authors might want to consider pitching it to GIS librarians in higher education.

Comments

Re: Preserving first-generation web/GIS projects

Author: Jason Birch

That's just freaking bizarre. I was lying in bed last night thinking about whether it would be hard to write an AXL to MapGuide XML transformation.

Re: Preserving first-generation web/GIS projects

Author: James Fee

Wow, deploy ArcIMS in 2008/2009. This is why ESRI can't kill ArcIMS: folks still want to use the darn thing. As much as Jack can get up on stage and basically say the thing is deprecated, people still refuse to listen. Of course part of the problem is ESRI is willing to continue selling licenses to people to deploy it, but I suppose in a higher ed setting site licenses abound and they could have easily gone with the ESRI RESTful API if they wanted to stay ESRI.

Geojson 1.0.1

Geojson 1.0.1 fixes a bug in serialization of features with no geometry.

Comments

Re: Geojson 1.0.1

Author: Stefano Costa

Sean, I'm trying to update, but:
steko@gibreel:~$ sudo easy_install -U geojson
Searching for geojson
Reading http://pypi.python.org/simple/geojson/
Reading http://trac.gispython.org/lab/wiki/GeoJSON
Reading http://trac.gispython.org/projects/PCL/wiki/GeoJSON
Best match: geojson 1.0.1
Downloading http://pypi.python.org/packages/source/g/geojson/geojson-1.0.1.tar.gz#md5=c594cd40085987eafec38f457ff8db49
Processing geojson-1.0.1.tar.gz
Running geojson-1.0.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-tDg03o/geojson-1.0.1/egg-dist-tmp-Y-zuud
error: VERSION.txt: No such file or directory
On Debian Sid, not sure if it's a platform-specific problem or a bug in the package.

Re: Geojson 1.0.1

Author: Sean

Stefano, I made the release yesterday from my new Mac, and the sdist was truly broken. I've uploaded a new sdist (made on the same Linux box as 1.0.0) that appears to be fine.

Re: Geojson 1.0.1

Author: Stefano Costa

The current sdist works fine, thanks. BTW, it's just too bad that easy_install doesn't support removing an installed egg. Of course the best option would be to have some apt-gettable packages in DebianGIS, but one thing at a time...

Re: Geojson 1.0.1

Author: Sean

I agree with you about uninstalling. Ian Bicking's pip aims to address that. I disagree about Debian packages being the best option. I fully support Python's goal of being able to distribute and install its own packages.

The return of the scientist

While I was fiddling with a related blog post, word came out about Obama's appointment of John Holdren to the post of White House science advisor. Holdren's Boston Globe op-ed from the summer is now a must read:

The few climate-change "skeptics" with any sort of scientific credentials continue to receive attention in the media out of all proportion to their numbers, their qualifications, or the merit of their arguments. And this muddying of the waters of public discourse is being magnified by the parroting of these arguments by a larger population of amateur skeptics with no scientific credentials at all.

I'm not sure, but I think he might be talking about you, John Christy and Joe Francica. It looks like we're going to start "counting the bears" in earnest. If Francica was taken aback by Obama's comment on the rancid state of Interior, I fear he may need someone to grab him a fainting couch for this.

My new project

http://farm4.static.flickr.com/3152/3108362480_9a020622cc_d.jpg

Born Saturday, a rather surprising 5 days early. A number of my friends and contacts have infants due this season, and I wish that all arrive just as safely and joyfully.

Comments

Congrats Sean!

Author: David Fawcett

Congratulations Sean! David. Mine arrived 6 weeks ago. Lots of joy, not as much sleep...

Re: My new project

Author: Sean

Congratulations to you, David! The baby boom is bigger than I knew.

Re: My new project

Author: Paul Ramsey

Congratulations! Remember, when it comes to children, 1+1 = 11. You'll soon find that the second child is an existence proof that, despite what you might think, the first child did not actually consume every iota of your free time. What a happy Christmas!

Re: My new project

Author: PhilipRiggs

Congratulations! I agree 1+1 = 11! And 1+1+1 = 111, as I found out in August!

Re: My new project

Author: Matt Giger

Congratulations, what a beautiful little girl.

Re: My new project

Author: Jachym

Congrats! We do follow our time table too ;-)

Re: My new project

Author: Jeroen Ticheler

Congratulations to you all!

Re: My new project

Author: Mateusz Loskot

Congratulations! Now, I feel I'm a bit behind my schedule ;-)

Re: My new project

Author: Dom

Congratulations Sean! I'm sure you'll have a very happy/busy Christmas this year!

Re: My new project

Author: andrea

Congratulations, she has a nice interface ;-)

Re: My new project

Author: Bryan

... and my congrats too!

Re: My new project

Author: Matt

Awesome! I second Paul--you don't really have kids unless you have more than one. Take care of the mother too!

Re: My new project

Author: Rob

Congrats Sean!

Re: My new project

Author: Normand Savard

Congratulations to both of you!

Re: My new project

Author: Sean

Thank you, all.

Re: My new project

Author: Guillaume

Welcome to that world, little pythonista ! Mine is awaited for end of January... Best wishes Sean,

Re: My new project

Author: Jeremy Cothran

Congrats! Watch them grow, they'll learn much more than we'll ever know :)

Re: My new project

Author: Allan Doyle

Having blown past 1, 11, and 111, I know a cute baby when I see one. This one's way up there. Congrats!!

Semantic web at CAA 2009

There will be a semantic web session at the 37th annual international conference on Computer Applications and Quantitative Methods in Archaeology, 22-26 March 2009 in Williamsburg, VA [CAA 2009]:

The Semantic Web: 2nd Generation Applications

Chairs: Leif Isaksen, University of Southampton, United Kingdom, and Tom Elliott, Institute for the Study of the Ancient World, New York University, USA

Abstract:

Semantic Web technologies are increasingly touted as a potential solution to the data integration and silo problems which are ever more prevalent in digital archaeology. On the other hand, there is still much work to be done establishing best practices and useful tools. Now that a number of projects have been undertaken by interdisciplinary partnerships with Computer Science departments, it is time to start drawing together the lessons learned from them in order to begin creating second generation applications. These are likely to move away from (or at least complement) the monolithic and large-scale 'semanticization' projects more appropriate to the museums community. In their place we will need light-weight and adaptable methodologies more suited to the time and cash-poor realities of contemporary archaeology.

See Leif's post to the Antiquist group for details.

Why not CIDOC CRM at this time

One of my current projects, named Concordia, aims to bootstrap a graph of open, linked data for ancient world studies. Our decision to defer use of properties of the CIDOC Conceptual Reference Model (CRM) is explained in this memo.

In a nutshell: there's no existing web of CRM-linked data, and implementing the standard gives Concordia no near-term wins. Furthermore, mismatches between the CRM and currently published data mandate a level of effort and expense that cannot be borne at this time. Because the Web is an "open world", CRM details can be added in future, as needed.

Update (2008-12-13): I've received some good feedback concerning non-technical issues that keep museum data shut in and will try to write more about that next week.

Comments

Re: Why not CIDOC CRM at this time

Author: Bryan

So, this is the first time I've heard of CIDOC again ... since I found out about it. So, the bottom line is you think it's going nowhere?

Re: Why not CIDOC CRM at this time

Author: Sean

"Going nowhere" is too strong; I'm just saying that a less perfect approach might improve the linked data picture in the near term.