Where's the book?

Greg Wilson, author of Software Carpentry (blog), asks:

Now, would someone please write “GIS with Python”?

What's my advance?

Seriously, it's a more formidable task than it appears. The Python and GIS landscape is rapidly evolving and segregated. Unlike bioinformatics or natural language processing (with Python programming books listed in Greg's post), GISville is a company town. To most GIS programmers "GIS with Python" means ESRI software scripting now with Python instead of Avenue or AML. If there isn't an ESRI Press book on scripting with Python already, there will be soon, but it'll be bound to proprietary software. Software that's "free" to use in the U's computer lab, but not free to use out in the real world or on the internets. And it won't cover free and open source GIS tools.

The seedy side of GISville could be tackled in an "Open Source GIS with Python", perhaps, but even this is a fragmented place with many competing platforms. There are programmers who script GIS tasks with Python on a MapServer/GDAL + SWIG platform. There are programmers who script GIS tasks with Python on a different Qt + SIP platform. We've even got our own Django-Pylons type divide on the standard Python platform. How would one synthesize the different Python software and practices? I don't see how, but maybe I'm lacking the right perspective.

Update (2009-10-18): See http://groups.google.com/group/python-gis-sig

Comments

Re: Where's the book?

Author: Matthew Perry

Instead of synthesizing all those disparate uses into a typical book, how about a collection of chapters on each, edited together to form a comprehensive overview of the python spatial ecosystem? This way you could have the best authors tackle their primary subject and still provide the big picture.

It could be called "Geospatial Python: Overlapping approaches to spatial programming" :-)

Re: Where's the book?

Author: Eric Wolf

Personally, I'd love to see a Python GDAL/OGR-oriented book that implemented basic geoprocessing. Essentially a "here's how to do geoprocessing without ESRI". It would also be easier to introduce more advanced programming concepts. I've found that GISers and Geographers tend to get stuck in the learning curve when using Python with the arcgisscripting module. They learn to do loops and manage simple variables but frequently have trouble seeing the need for objects and modules.

Ironically, I've found myself temporarily without an ESRI license for the first time in about 10 years. So the project I'm working on will be done in straight Python+OGR. It's actually a little annoying because I'm having to write a slew of computational geometry routines (I know, there are libraries like CGAL, but I like using my own objects).

I'd be game to help write a book...

Re: Where's the book?

Author: Sean

OGR? Here's an example of my perspective problem: I've developed enough loathing for the ogr.py API (over the years) that I've written WorldMill and blogged a lot about using it. It prompted Howard to make ogr.py better, but I feel like my Shapely, WorldMill, Rtree stuff remains more usable.

Re: Where's the book?

Author: Sean

I'm on the loosely-coupled "Pylons" side of the divide, if it's not obvious.

Re: Eric

Author: Keith

Eric,

Have you tried Spatialite? It has the core spatial functions - intersect, union, distance, length, centroid, etc. It is so easy to use and has a nice GUI to view the spatial data. The commands are incorporated in SQL. The tutorial is good too.

Keith

Re: Where's the book?

Author: Sean

Keith: Spatialite is a neat data store, but only that. What you use to process data after you take it out of the store (and before you store the results) is the question.

Re: Where's the book?

Author: Peter

Use XSLT!

Re: Where's the book?

Author: Sean

Peter: I like declarative programming, and there are some neat options to combine and balance Python and XSLT in lxml's extension elements, but I imagine that the declarative style works best for geoprocessing in configuring complex pipelines of data processing modules that would be written in Python. Maybe like Paste Script wires up WSGI pipelines.
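To make the pipeline idea above a little more concrete, here is a minimal sketch in plain Python. It is not Paste Script or any existing library: the step functions, the STEPS registry, and the PIPELINE list are all hypothetical names invented for illustration. The only declarative part is the list of step names; the actual processing stays in ordinary Python functions.

```python
# Hypothetical processing modules: each takes and returns a list of
# feature dicts of the form {"id": ..., "coords": [(x, y), ...]}.
def drop_empty(features):
    """Discard features that have no coordinates."""
    return [f for f in features if f["coords"]]


def add_centroid(features):
    """Attach the mean of the vertices (not a true area centroid)."""
    out = []
    for f in features:
        xs = [x for x, _ in f["coords"]]
        ys = [y for _, y in f["coords"]]
        out.append({**f, "centroid": (sum(xs) / len(xs), sum(ys) / len(ys))})
    return out


# Registry mapping declared step names to the functions above (an assumption).
STEPS = {"drop_empty": drop_empty, "add_centroid": add_centroid}

# The declarative part: a pipeline is just an ordered list of step names,
# which could equally be read from a config file.
PIPELINE = ["drop_empty", "add_centroid"]


def run_pipeline(spec, features):
    """Apply each named step in order, feeding one step's output to the next."""
    for name in spec:
        features = STEPS[name](features)
    return features


features = [
    {"id": 1, "coords": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]},
    {"id": 2, "coords": []},
]
result = run_pipeline(PIPELINE, features)
```

Reordering or swapping steps then means editing the list of names rather than the processing code.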

Re: Where's the book?

Author: Peter

Sean,

There's no doubt (in my mind) the heavy lifting in terms of geoprocessing needs to be done by an indexed datastore, hopefully accessible via spatial XQuery/XPath :). There are good current benefits in terms of simplicity, portability and ownership of your code, to thinking about low-volume, high-fidelity geoprocessing in pipelines, as you say. Some alternatives for such pipelines may in the future include xmlsh and XProc. To provide sufficient performance, a geospatial implementation perspective needs to be baked in, something I would like to see.

Cheers,

Peter

Re: Where's the book?

Author: Keith

Sean, Spatialite just a datastore? I use pysqlite to script and create UserDefinedFunctions (UDF) to extend and reuse functionality. Sqlite is rock solid and fast, with a 2 terabyte database size limit on 32-bit machines. All this functionality for free - Transactions, Virtual tables, Spatial indexes, attribute indexes, projections, overlay commands (clip, erase, intersect), aggregation, simplification, routing. For just a data store it sure sounds like a geospatial engine. The datastore aspect is compelling as well. You can store multiple geometry types in the same table. A shapefile will only hold a single geometry type. You can also have multiple geometry columns in a single Sqlite table - various projections, for example, or you could store parcels as various geometries - rectangular lot, building footprint, property centroid. The business data is easier to manage in a database too. Gotta love it.
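As a rough illustration of the user-defined-function approach described above, here is a minimal sketch using the standard library's sqlite3 module (the descendant of pysqlite). It uses plain SQLite with no Spatialite extension loaded, and the parcels table, its columns, and the planar_distance function are made-up names for the example, not part of Spatialite.

```python
import math
import sqlite3


def planar_distance(x1, y1, x2, y2):
    """Euclidean distance between two points in a projected (planar) CRS."""
    return math.hypot(x2 - x1, y2 - y1)


conn = sqlite3.connect(":memory:")

# Register the Python function so it can be called from SQL.
# Arguments: SQL name, number of parameters, Python callable.
conn.create_function("planar_distance", 4, planar_distance)

# Hypothetical table of parcel centroids.
conn.execute("CREATE TABLE parcels (name TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [("a", 0.0, 0.0), ("b", 3.0, 4.0), ("c", 6.0, 8.0)],
)

# Use the UDF inside a query: distance of each parcel from the origin.
rows = conn.execute(
    "SELECT name, planar_distance(x, y, 0.0, 0.0) AS d "
    "FROM parcels ORDER BY d"
).fetchall()
```

The same registration pattern should carry over to a connection with Spatialite loaded, where the UDF would sit alongside the built-in spatial SQL functions.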

Re: Where's the book?

Author: Sean

Relational databases have a large, but not limitless set of uses. If you're going to process non-relational data such as documents or less structured data, you'll need a general purpose programming language and environment. For example, you're certainly using a general purpose language already (maybe even Python) to model data so that it can be loaded into your database. I think maybe you confuse me for a "NoSQL" partisan. I simply think an RDBMS is better suited to storing models than to doing all kinds of modeling and data processing.

Back to the topic of Python books: it wasn't my intention to throw "stop energy" at a book effort, but to point out that "Python and GIS" is a messy subject and potential readers are somewhat segregated in different camps. I'm hugely in favor of more, excellent narrative documentation.

Re: Where's the book?

Author: keith

I vote for starting a wiki instead of publishing a book. More people can access a wiki via the internet and it is easier/cheaper to maintain and update a wiki. A book might be out of date in a few years as software is updated. There is a huge variety of open source GIS tools that people use. I'd like to see workflow diagrams, recipes, code snippets, etc. showing how real work is getting done. There is also IronPython (dotnet) and Jython (java) to consider along with Python.

Re: Where's the book?

Author: julia

A "Geospatial Python" wiki with evolving content from geospatial python experts that covers many of the various geospatial python implementations, including interacting with spatial data stores, would be a very welcome thing indeed to many people who are not geospatial python experts but need some good resources and examples - so we won't keep getting stuck in that learning curve....

Don't know that I'd characterize Open Source GIS as "the seedy side" of GISville. I think of it more as the "Company Town" side of GISville being the snooty gated community where entry is controlled both by the size of one's bankroll and one's desire to pay a high entry fee for the "privilege" of feeling like one is part of some exclusive, superior society. The superiority thing is not true, but the "Company" really excels in marketing, so a lot of people believe the hype. The problem is that inside the gated community, the "Company Way" is the only allowed way to do things. Like an H.O.A. gone completely wrong :)

Many of us (who have been forced to live in the gated community for many years by our employers in the public sector) would like to start learning several different ways of doing similar tasks in GIS with python, and not limit ourselves to a single toolset. That same old "geoprocessing" hammer doesn't always do such a great job :) Knowing more than one way to solve the problem, or how to use more than one tool is a big help when deciding what method makes the most sense to use in any given situation. The aforementioned wiki would be very helpful to a lot of people as a learning aid and in helping to compare and contrast the benefits or drawbacks of employing different approaches.