Iterators

Sean Gillies

2009-10-23 00:00

How do you get a GIS feature from a Python collection/layer/provider/thingy? Let's look at 4 different popular GIS scripting environments.

ESRI (ArcGIS 9.3):

row = rows.next()
while row:
    # work with row
    ...
    row = rows.next()

FME:

self.my_collection_count = len(self.my_feature_collection)
for i in range(self.my_collection_count):
    self.my_feature_part = self.my_feature_collection[i]

    # work with self.my_feature_part
    ...

Note: don't use instance attributes like that. Use local variables.

QGIS:

feat = QgsFeature()
while provider.getNextFeature(feat):
    # work with feat
    ...

The oddball.

OGR (earlier than 1.5):

feature = layer.GetNextFeature()
while feature:
    # work with feature
    ...
    feature.Destroy()
    feature = layer.GetNextFeature()

Destroy!

4 different environments, 4 different ways, and none of them the natural Python way. There's one obviously right way to do it for Python, and that's the way that it's done in ogr.py versions >= 1.5, and how it's done in WorldMill. GeoDjango, too.

for f in layer:
    # work with f
    ...

where layer is among other things a generator that provides the iterator protocol just like Python strings, lists, and files do. It has a next method that yields a value, or raises a StopIteration exception when there are no more values. The advantages:

Clarity: it's agonizingly clear. More clear to a non-programmer, in my opinion, than the other alternatives. For each feature in the set: do something. And then forget about the feature and move on to the next.
Less error-prone: even a non-programmer can't screw up that one line of code any worse than to get a standard, understandable Python NameError, TypeError, or SyntaxError.
Standardization: core Python modules such as itertools and many other useful add-on packages reward you for using the iterator protocol.

Comments

Re: Iterators

Author: few people

I think ESRI has done away with the while: row = rows.next() stuff at 9.3.1: http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=FeatureSets_and_RecordSets

Re: Iterators

Author: Mateusz Loskot

Sean, brilliant post! Finally, a voice clarity, correcness, robustness, discoverability, transparency, genericity, expressiveness, flexibility, consistency, meaning...

Quality Matters

Any of the interested parties are considering it in the same manner?

I'm afraid, that's the end of story.

Re: Iterators

Author: Michael Weisman

Just a quick clarification, FME does support iterators. The following snippet will iterate over the parts of an aggregate feature and extract the coordinates for each part:

for f in feature_collection:
  coords = f.getCoordinates()
  for m in coords:
    x = m[0]
    y = m[1]

Re: Iterators

Author: Mateusz Loskot

And a bit of clarification from me, I was referring to the Open Source projects mentioned above. Unfortunately, SWIG is in the house over there.

Re: Iterators

Author: Sean

Great, Michael. I simply didn't find examples of iterators employed by FME users. I haven't seen examples of programmers using Howard Butler's neater ogr.py API either.

QGIS

Author: Martin Dobias

Thanks for the nice article. Recently I've been thinking about adding a more pythonic way of accessing features in PyQGIS... currently it's just a crude wrapper around c++ api. So thanks for the inspiration :)

Re: Iterators

Author: Howard Butler

ogr.py is not mine. ogr.py is Frank's API ;), but I think I added support for this at GDAL 1.6. I guess we haven't advertised it so well.

for feature in layer:
    # COLUMN_NAME must be normalized to all caps due to some
    # data sources not caring.
    print feature.COLUMN_NAME

Python idioms for GIS Education

Sean Gillies

2009-10-20 00:00

I'd like to see GIS students taught to program in Python using Python idioms, not Avenue idioms. I hate to pick on Utah State's GIS Programming with Python just because of its popularity, but it contains some good introductory code that can be easily tuned up to teach even better Python GIS programming skills. For example, let's look at Lesson 5: Writing/Reading Text Files, Python Lists & Dictionaries, part 5a:

# Author:  John Lowry
# Date: Dec. 21, 2007
# Purpose: Lesson 5a: Use split and Write to a text file
#############################################################

#Import modules
import string

# Open a new text file to write to
outFile = open(r"C:\john\TeachingGIS\WILD6900_ArcGISPython\Lesson5_results\write_example.txt", "w")

# Make a string variable of featureclass names
fcString = "nests1990.shp,nests1995.shp,nests2000.shp"

# Make a list variable from the string variable using the split method, then print
fcList = fcString.split(",")

# Write each item in the list to a separate line the output file
outFile.write(fcList[0]+ "\n")
outFile.write(fcList[1]+ "\n")
outFile.write(fcList[2]+ "\n")

# Close the files
outFile.close()

There are 3 defects in that code:

It imports the string module but never uses anything from it. You should almost always use string object methods anyway.
It presumes knowledge of the number of items in the comma-separated string, specifically that it is at least 3.
It needlessly concatenates newlines to items before writing.

The equivalent code, with none of those defects, looks like this:

outFile = open('/tmp/out.txt', 'w')
for item in fcString.split(','):
    outFile.write(item)
    outFile.write('\n')
outFile.close()

In Python 2.6 you can use a with statement to make it even more compact. The file closes itself at the end of the block. And you can use file.writelines to write the data and newline in one go. It's more efficient to pass it a tuple than to pass it a list.

with open('/tmp/out.txt', 'w') as f:
    for item in fcString.split(','):
        f.writelines((item, '\n'))

In the next part of the lesson, 5b, we see:

# Author:  John Lowry
# Date: Dec. 21, 2007
# Purpose: Lesson 5:  Reading a and writing a textfile
#############################################################

# Open the text file in read mode
inFile = open(r"C:\john\TeachingGIS\WILD6900_ArcGISPython\Lesson5\nests2005_coords.csv", "r")

# Open a new text file to write to
outFile = open(r"C:\john\TeachingGIS\WILD6900_ArcGISPython\Lesson5_results\nests2005_format.txt", "w")

# Read entire file and print one line at a time
for line in inFile.readlines():
    nestList = line.split(",")
    id = nestList[0]
    cnd = nestList[1]
    x = nestList[2]
    y = nestList[3]
    outFile.write("Siteid: " + id + "\n")
    outFile.write("Condition: " + cnd + "\n")
    outFile.write("X Coordinate: " + x + "\n")
    outFile.write("Y Coordinate: " + y + "\n")
    outFile.write("\n")

# Close the files
inFile.close()
outFile.close()

The more effective version of the looping block is this:

for line in inFile:
    outFile.write(
      'Siteid: %s\nCondition: %s\nX Coordinate: %s\nY Coordinate: %s\n\n' \
      % tuple(line.split(','))
      )

String formatting is more efficient than string concatenation (or not -- see the update below) and you can avoid needless variable assignments by using the split results directly.

I blogged before about how smelly the ArcGIS scripting cursor syntax was. I hear it's better now, but you can still see the old style in the USU course code.

Update (2009-10-21): Here's my benchmark script. I'm isolating just the inner part of the loop and focusing just on the extra assignments and file writes.

import timeit

# Sample input line
line = '1,good,433207.8362,4518107.044'

# A file-like object
class MockFile(object):
    def write(self, line):
        pass

outFile = MockFile()

# GIS Style programming. Assignment to intermediate variables
# and each written separately.

s1 = """\
nestList = line.split(',')
id = nestList[0]
cnd = nestList[1]
x = nestList[2]
y = nestList[3]
outFile.write(id)
outFile.write(cnd)
outFile.write(x)
outFile.write(y)
"""

t1 = timeit.Timer(
    stmt=s1,
    setup='from __main__ import line, outFile'
    )
print "GIS style"
print "%.2f usec/pass" % t1.timeit()
print

# Idiomatic Python. No intermediate variables and all written as
# a group

s2 = """\
outFile.write(''.join(line.split(',')))
"""

t2 = timeit.Timer(
    stmt=s2,
    setup='from __main__ import line, outFile'
    )
print "Idiomatic Python"
print "%.2f usec/pass" % t2.timeit()
print

The results:

$ python benchmarks.py
GIS style
2.07 usec/pass

Idiomatic Python
1.29 usec/pass

Someone else has looked at string performance more closely than I, and it looks like I'm wrong. On my Python 2.6, too, concatenation wins over formatting. Use of join is faster than concatenation for lists longer than 1000 items or so.

Comments

Re: Python idioms for GIS Education

Author: Eric Wolf

Sean - you obviously haven't tried teaching Python to Geographers. I find myself explaining over and over again how

with

works to grad students who've been writing Python code regularly for their advisors. And as for breaking out the

write

statements, my friend who regularly TA's the graduate Python class at CU-Boulder and I had a contest to see which of us could write the program for a lab in the fewest number of lines. The end result: combining program functionality into fewer lines of code destroys the readability and maintainability of the code. In the example above, it's pretty clear that the output spans multiple lines because the code spans multiple lines. But your rewrite requires that the student remember that /n means "start a new line".

And as for dropping the ".readlines()", I'd rather see the student specify the action directly rather than assuming it "just happens". For people who work in multiple languages, over use of "magic variables" or "magic functions" only makes the code harder to understand.

So here's the big question: Are your code rewrites significantly more efficient than the original in terms of execution?

Re: Python idioms for GIS Education

Author: Regina

Eric,

Honestly -- you can't say the first example is better than the second. I don't care what language you program in, the first code snippet is fraught with flaws.

Coming from a SQL/VB.NET/PHP/C# mindset, that Sean's example is much more understandable and clearly better.

Now as for the second example, I see your point that someone new to Python may be a bit at a disadvantage understanding this -- just like anyone seeing SQL for th first time would be disadvantaged seeing a join statement. That doesn't mean I should stop writing join statements since that is the strength of the language.

To some extent I would argue though that a fundamental reason aside from libraries available for it, are the succintness of its idioms. If you are going to write Python code, write Python code like a Python programmer. Anything else would just be insulting.

Re: Python idioms for GIS Education

Author: Sean

If you write slow code, you get a slow program. If you make a lot of needless assignments, you get a big, slow, program. It's not just about succinctness of the source. The Python idioms are usually backed up by optimized code and execute faster than pseudocode. How much faster? I'll benchmark it.

My first two examples are much more readable and maintainable than the original. That's a long string template in the last, granted, but there are constructions that can make it more readable. Python's triple-quoted text, for example:

template = '''\
Siteid: %s
Condition: %s
X Coordinate: %s
Y Coordinate: %s
'''

for line in inFile:
    outFile.write(template % tuple(line.split(',')))

I've taught Python to geographers a few times. I even remember being thanked and told that the lesson helped. Most of them were open source nuts and hackers at heart, it's true.

Re: Python idioms for GIS Education

Author: Clemens

Does your second suggestion really work? I'm still on Python 2.5 and I need to convert the result of the split to tuple in order to avoid "TypeError: not enough arguments for format string".

Re: Python idioms for GIS Education

Author: Sean

Indeed. Thanks, Clemens. I've corrected it.

Re: Python idioms for GIS Education

Author: Sean

I've previously encountered the opinion that non-idiomatic code is better for GIS, but I think it's misguided. Write Ruby like it is intended to be written. Write Python like it is intended to be written. One such reference for Python I've found is David Goodger's Code Like a Pythonista: Idiomatic Python. I'd love to see this near the top of Python GIS programming course reading lists.

It's not as if, to use a software-carpentry analogy, USU students are being taught to drive nails with the claw end of the hammer. The course is pretty good. But it's not quite teaching students how to drive nails with the least number of blows.

Re: Python idioms for GIS Education

Author: Chris

Valid points, Sean.

But I thought I'd point out that most of the students that take this course know absolutely nothing about programming. It's hard enough getting basic concepts across -- I wouldn't want to try to explain the string formatting to them when some of them are lucky to even get stuff to print out right. Also, John (the guy who wrote this code) is not a programmer. But he likes tinkering with code occasionally and thought it would be fun to teach a class. Which was fine by me, because it meant I got to teach the open source part of the course. And I'm sure there are problems with my code, too...

Re: Python idioms for GIS Education

Author: Eric Wolf

As Chris said, the students encountered in geography programs (including grad students) commonly know absolutely nothing about programming. Extending this comment:

1. These students made it to college (even grad school), they aren't stupid... but

2. They arrived at that level without knowing how to program.

So, why is it that intelligent people can reach such a high level of education without learning to program? Is it because they weren't exposed to programming in elementary school (maybe)? Or is it that these people aren't drawn programming (likely)?

In the case of people not having exposure, some of these people pick up programming quite quickly. They seem to be successes and are the type of student who appreciate learning how to optimize code (for speed, clarity, etc). But the other case of people, they don't need to learn to optimize code. They have an approach to problem solving that works - but doesn't necessarily fit with programming. They likely need to write a few scripts to automate geoprocessing for their problems. This second class of students are well-understood and targeted by ESRI. Hence Avenue!

On the positive side, these students aren't going to develop large programs in Python. So a little code bloat isn't going to bother them. They are also comparing their scripts to how well the same methods run in the ArcToolbox. The speed bar is set pretty low.

I really appreciate your examples. I would include them in a text book for Geography students - but I would present the clunky Avenue-style code first. For the students who do want to make their code more efficient, they can look at the refined code. But I sure as heck wouldn't start off trying to teach geography students how to right optimal Python code.

Unwords

Sean Gillies

2009-10-20 00:00

I've learned that in German, there's a name for a disastrous string of characters, such as "REST endpoint", that insults human dignity: unwort. Yes, "REST endpoint" is two words in English, but it would be one in German. There's even an "Unword of the Year" website devoted to these entities (in German only). The level of negative meaning in "REST API" is almost as high.

Via @xrotwang.

Comments

Re: Unwords

Author: James Fee

OK, so what is RESTful in French?

Re: Unwords

Author: Sean

In French, the mec in the stock photo recliner would be having a sieste tranquille. Or, il glande. Depends on your frame of reference. French systems architects write "REST". What would a French standards body write? I don't know. Maybe "transport d'état aux representations"?

My point is that "REST endpoint" and "REST API" are unterms with negative meaning. Protracted use will probably result in brain damage.

BTW, loved your geodata site scraping anecdote. That takes me back.

Old Europe exhibit at ISAW

Sean Gillies

2009-10-19 00:00

http://farm3.static.flickr.com/2432/4025718700_8fe6b90044_o.jpg

Check this out if you're in New York City this winter. It looks fascinating:

The Lost World of Old Europe brings to the United States for the first time more than 160 objects recovered by archaeologists from the graves, towns, and villages of Old Europe, a cycle of related cultures that achieved a precocious peak of sophistication and creativity in what is now southeastern Europe between 5000 and 4000 BC, and then mysteriously collapsed by 3500 BC. Long before Egypt or Mesopotamia rose to an equivalent level of achievement, Old Europe was among the most sophisticated places that humans inhabited. Some of its towns grew to city-like sizes. Potters developed striking designs, and the ubiquitous goddess figurines found in houses and shrines have triggered intense debates about womens roles in Old European society. Old European copper-smiths were, in their day, the most advanced metal artisans in the world. Their intense interest in acquiring copper, gold, Aegean shells, and other rare valuables created networks of negotiation that reached surprisingly far, permitting some of their chiefs to be buried with pounds of gold and copper in funerals without parallel in the Near East or Egypt at the time. The exhibition, arranged through loan agreements with 20 museums in three countries (Romania, The Republic of Bulgaria and the Republic of Moldova), brings the exuberant art, enigmatic goddess cults, and precocious metal ornaments and weapons of Old Europe to American audiences.

Working and playing at CBGP

Sean Gillies

2009-10-16 00:00

This fall I'm working in office space graciously provided by the Centre de Biologie et de Gestion des Populations at the CIRAD International Campus de Baillarguet outside Montferrier-sur-Lez [OpenStreetMap]. My wife is a visiting scientist here. I'm baggage, but extremely well-treated baggage. I share a corner office with a population geneticist who is waiting for his space in the new wing to be completed. There are bioinformatics programmers diving into Python next door. The internet mostly (c'est le Sud) works, and everybody is friendly and tolerant of my mediocre (terrible, really) French.

http://farm4.static.flickr.com/3519/4016516702_2c091dccb1_d.jpg

That's an image capture from 2009-10-16. Yes, I've heard about the OSM static map API, just waiting till it moves to what looks like a more permanent URL.

CBGP is out in the countryside north of the Montpellier metro. I've alluded to how pleasant it is out here and recently took a camera on my run to try to capture the feeling. Here's a view out of my office window of the parking lot and the Pic St-Loup hill to the north. One of my favorite local wineries is tucked under the cliffs on the other side of it.

http://farm4.static.flickr.com/3530/4015730981_588a03d008.jpg

A semi-wild, wooded ridge runs for some kilometers to the east of CBGP. Here's the steep trail (which I've edited into OSM) up the hill from the back door of the Centre. I'm sorry that I can't identify the trees (update: there's a least two p. pinea, or stone pine, in the photo). There's an interesting mix, including what appear to be firs near the top. A little elevation makes a big difference around here.

http://farm3.static.flickr.com/2742/4015730985_8a5960e0fc.jpg

At the top of the ridge runs a national bike route. The markers look like this:

http://farm3.static.flickr.com/2502/4015730987_a9dcf3ac1e.jpg

Single-track and fire road on the crest. Sometimes I forget I'm in France and feel like I'm back on the Front Range.

http://farm3.static.flickr.com/2562/4015730993_f8b3b7f187_o.jpg

You may have noticed the group of identical conifers on the right side of the last photo. There was a 50 hectare fire here in 1981. André Jeanjean, A young volunteer sapeur-pompier, died fighting the blaze. I found fresh flowers from the Mairie of Clapiers at his memorial.

http://farm3.static.flickr.com/2705/4015730997_0694afe878.jpg

The left fork there runs down to a vineyard and on into Clapiers. I desperately need to acquire a mountain bike because CBGP is a great trailhead for some nice long rides.

Comments

Re: Working and playing at CBGP

Author: Paolo

Glad that you are enjoying Europe, Sean! ;-)

Shapely, map, and reduce

Sean Gillies

2009-10-15 00:00

Not the Google MapReduce, that is, but Python's good old map() and reduce(). Waiting on something else this afternoon and just want to write this modestly neat code down for myself.

>>> from shapely.geometry import Point
>>> def union(a, b):
...     return a.union(b)
...
>>> print reduce(union, map(Point, zip(range(3), range(3)))).wkt
MULTIPOINT (0.0000000000000000 0.0000000000000000, 1.0000000000000000 1.0000000000000000, 2.0000000000000000 2.0000000000000000)

Update (2009-10-16): Could be made even shorter using lambda.

>>> from shapely.geometry import Point
>>> print reduce(lambda a, b: a.union(b), map(Point, zip(range(3), range(3)))).__geo_interface__
{'type': 'MultiPoint', 'coordinates': ((0.0, 0.0), (1.0, 1.0), (2.0, 2.0))}

Python and GIS 101

Sean Gillies

2009-10-14 00:00

Reading that Chris Garrard was open to suggestions for improving her Geoprocessing with Python using Open Source GIS section at Utah State, the first thing that came to mind was that there could be a lesson on working with object-oriented and functional programming paradigms and how to take advantage of Python's support for each of them. Not just how to use instances, but how to write a useful class. How to use iterators and generators. A lesson about how to start to move from simple scripting, and the inevitable folders of replicating, mutating files named processing.py, processing2.py, processing2b.py, and so on (believe me, I've been there, only it was EASI scripts), to modules of reusable classes and functions. I've tried to come up with a simple example using the conventional GIS geometry model, something that will be familiar to a GIS analyst.

Actually, I wasn't considering functional programming at all when I started this, but it's obviously very helpful to a GIS programmer. All those nested loops one uses to loop over the elements of rasters? There's a Python function for that. This has been an exercise for me, too. I've found some recipes in Python's itertools module that I've overlooked for too long and are going to immediately improve a few of my own projects. For better or worse, Python lets you mix the object-oriented and functional paradigms, and I do that in the following walk-through of yet another Point class for Python.

The C or C++ library under your favorite Python bindings already provides a point class, so why reinvent the wheel? One day (I'm not predicting when) geometry and topology operation code written entirely in Python will execute quickly enough on enough different platforms to be ready for serious work. Imagine we're coding toward that future. In the meanwhile we'll see how to get more out of today's Python, and dabble in doctest, json, array, and itertools.

I've used Python 2.6 for this exercise, and nothing from outside the standard library. The source for the completed pygis_101 module is at http://bitbucket.org/sgillies/python-gis-101/src/.

Let's start with module containing a do-nothing Point class and a test runner.

# Module: pygis_101.py
# We'll stick to plane geometry

class Point(object):
    """The building block of higher dimensional geometries."""

# and introduce a doctest test runner to the end of the module

def _test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    _test()

You can run all doctests in the module if you execute it as a script.

$ python pygis_101.py -v
3 items had no tests:
    __main__
    __main__.Point
    __main__._test
0 tests in 3 items.
0 passed and 0 failed.
Test passed.

Now, let's add a docstring with tests.

class Point(object):
    """The building block of higher dimensional geometries.

    This point class has 2 coordinates, which we'll call `x` and `y`:

        >>> point = Point(0.0, 0.0)
        >>> point.x
        0.0
        >>> point.y
        0.0
    """

Run the tests again

$ python pygis_101.py -v
Trying:
    point = Point(0.0, 0.0)
Expecting nothing
**********************************************************************
File "pygis_101.py", line 8, in __main__.Point
Failed example:
    point = Point(0.0, 0.0)
Exception raised:
    Traceback (most recent call last):
      File "/Users/seang/code/python-macosx/parts/opt/lib/python2.6/doctest.py", line 1231, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.Point[0]>", line 1, in <module>
        point = Point(0.0, 0.0)
    TypeError: object.__new__() takes no parameters
Trying:
    point.x
Expecting:
    0.0
**********************************************************************
File "pygis_101.py", line 9, in __main__.Point
Failed example:
    point.x
Exception raised:
    Traceback (most recent call last):
      File "/Users/seang/code/python-macosx/parts/opt/lib/python2.6/doctest.py", line 1231, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.Point[1]>", line 1, in <module>
        point.x
    NameError: name 'point' is not defined
Trying:
    point.y
Expecting:
    0.0
**********************************************************************
File "pygis_101.py", line 11, in __main__.Point
Failed example:
    point.y
Exception raised:
    Traceback (most recent call last):
      File "/Users/seang/code/python-macosx/parts/opt/lib/python2.6/doctest.py", line 1231, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.Point[2]>", line 1, in <module>
        point.y
    NameError: name 'point' is not defined
2 items had no tests:
    __main__
    __main__._test
**********************************************************************
1 items had failures:
   3 of   3 in __main__.Point
3 tests in 3 items.
0 passed and 3 failed.
***Test Failed*** 3 failures.

We'll need to override object.__init__() in order to pass in coordinate values when we create point instances. The values will be stored in instance attributes.

class Point(object):
    """The building block of higher dimensional geometries.

    This point class has 2 coordinates, which we'll call `x` and `y`:

        >>> point = Point(0.0, 0.0)
        >>> point.x
        0.0
        >>> point.y
        0.0
    """

    def __init__(self, x, y):
        self.x = x
        self.y = y

Now, the 3 tests pass

$ python pygis_101.py -v
Trying:
    point = Point(0.0, 0.0)
Expecting nothing
ok
Trying:
    point.x
Expecting:
    0.0
ok
Trying:
    point.y
Expecting:
    0.0
ok
3 items had no tests:
    __main__
    __main__.Point.__init__
    __main__._test
1 items passed all tests:
   3 tests in __main__.Point
3 tests in 4 items.
3 passed and 0 failed.
Test passed.

With that brief introduction to Python classes, test-driven development, and doctest finished, let's move on to making the Point class more useful.

Often, one wants to serialize geometries for use with other programs. One useful format is "well-known text", or "WKT". Let's add the following to the Point class:

def toWKT(self):
    """Returns a WKT representation of a point.

        >>> point = Point(0.0, 0.0)
        >>> print point.toWKT()
        POINT (0.000000 0.000000)
    """
    return 'POINT (%f %f)' % (self.x, self.y)

JSON is a rising format. Python and JSON go together like sausage and beer. You might be tempted to produce JSON by string interpolation, as we did for WKT, but it's best to use the json module included with Python 2.6, or simplejson installed separately for earlier Python versions. Don't forget to import json at the top of the module.

def toJSON(self):
    """Returns a GeoJSON representation of a point.

        >>> point = Point(0.0, 0.0)
        >>> print point.toJSON()
        {"type": "Point", "coordinates": [0.0, 0.0]}
    """
    return json.dumps(
        dict(type='Point', coordinates=(self.x, self.y))
        )

If we add a __geo_interface__ attribute (implemented as a property) that returns a dict like the one above, we could even use the point objects with the Shapely and GeoJSON packages.

@property
def __geo_interface__(self):
    """Returns a GeoJSON-ish dict representation of a point for interop
    with some other Python + GIS software.

        >>> point = Point(0.0, 0.0)
        >>> point.__geo_interface__
        {'type': 'Point', 'coordinates': (0.0, 0.0)}
    """
    return dict(type='Point', coordinates=(self.x, self.y))

def toJSON(self):
    """Returns a GeoJSON representation of a point.

        >>> point = Point(0.0, 0.0)
        >>> print point.toJSON()
        {"type": "Point", "coordinates": [0.0, 0.0]}
    """
    return json.dumps(self.__geo_interface__)

Now we've got 9 passing tests.

$ python pygis_101.py -v
...
4 items passed all tests:
   3 tests in __main__.Point
   2 tests in __main__.Point.__geo_interface__
   2 tests in __main__.Point.toJSON
   2 tests in __main__.Point.toWKT
9 tests in 7 items.
9 passed and 0 failed.
Test passed.

And, btw, 100% code coverage:

$ coverage run pygis_101.py
$ coverage report
Name        Stmts   Exec  Cover
-------------------------------
pygis_101      16     16   100%

A geometry object ought to be able to provide its bounding box. In order to show how to do this for many more kinds of geometries, let's rewrite Point's coordinate storage using arrays, replace the x and y attributes with dynamic properties for convenience, and add a move method for shifting the point to new coordinates. Then we get the bounding box by checking the minimum and maximum values of the coordinate arrays. This technique will work for just about any type of geometry. Use from array import array at the top, of course.

def __init__(self, x, y):
    self.xs = array('d')
    self.xs.append(x)
    self.ys = array('d')
    self.ys.append(y)

@property
def x(self):
    return self.xs[0]

@property
def y(self):
    return self.ys[0]

def move(self, x, y):
    """Move point to new coordinates x, y.

        >>> point = Point(0.0, 0.0)
        >>> point.move(1.0, 1.0)
        >>> point.x
        1.0
        >>> point.y
        1.0
    """
    self.xs[0] = x
    self.ys[0] = y

@property
def bounds(self):
    """Returns the bounding box as a (minx, miny, maxx, maxy) tuple.

        >>> point = Point(0.0, 0.0)
        >>> point.bounds
        (0.0, 0.0, 0.0, 0.0)
    """
    return min(self.xs), min(self.ys), max(self.xs), max(self.ys)

All production geometry-for-GIS software I know implements these kinds of coordinate sequences. To get coordinate pairs from the sequences, operate on the coordinate arrays with Python's built in zip method or itertools.izip, which I mention because the itertools documentation is a motherlode of fast and efficient iteration recipes.

>>> from array import array
>>> xs = array('d', [0.0, 1.0, 2.0])
>>> ys = array('d', [0.0, 1.0, 2.0])
>>> from itertools import izip, islice
>>> list(izip(xs, ys))
[(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
>>> list(islice(izip(xs, ys), 1, None))
[(1.0, 1.0), (2.0, 2.0)]

Zip, btw, is its own inverse.

>>> zip(*zip([1,2,3], ['a', 'b', 'c']))
[(1, 2, 3), ('a', 'b', 'c')]

Going further, we can consolidate the toWKT and toJSON methods in a way that also works very efficiently for lines or paths (using list() instead of next() in that case):

@property
def coords(self):
    """Iterate over x,y pairs.

        >>> point = Point(0.0, 0.0)
        >>> next(point.coords)
        (0.0, 0.0)
    """
    return izip(self.xs, self.ys)

def toWKT(self):
    """Returns a WKT representation of a point.

        >>> point = Point(0.0, 0.0)
        >>> print point.toWKT()
        POINT (0.000000 0.000000)
    """
    return 'POINT (%f %f)' % next(self.coords)

@property
def __geo_interface__(self):
    """Returns a GeoJSON-ish dict representation of a point for interop
    with some other Python + GIS software.

        >>> point = Point(0.0, 0.0)
        >>> point.__geo_interface__
        {'type': 'Point', 'coordinates': (0.0, 0.0)}
    """
    return dict(type='Point', coordinates=next(self.coords))

Now we've got 17 passing tests and still 100% coverage.

$ coverage run pygis_101.py
$ coverage report
Name        Stmts   Exec  Cover
-------------------------------
pygis_101      31     31   100%

Next chance I get, I'll dig into some of my old blogged code to see what might be better done in a more functional style.

Shapely 1.0.14 released

Sean Gillies

2009-10-06 00:00

This release fixes a bug in writing out WKB on 64-bit systems. It was solved by a MapFish user. If you are one, or are writing WKB to PostGIS or another application, an upgrade is recommended. I've uploaded an sdist to http://pypi.python.org/pypi/Shapely/1.0.14.

The HTML of the "GeoWeb"

Sean Gillies

2009-10-01 00:00

HTML is the "HTML of the GeoWeb". From Mano Marks:

For a basic Google Maps mashup (or Bing, or Yahoo! Maps too), you need at least HTML, CSS, and JavaScript. More complicated mashups will use KML, GeoRSS, or GeoJSON, maybe Flash instead of JS, and wait. Notice the first one of those technologies, HTML. Still fundamentally, the GeoWeb relies on HTML. Sure, with a GeoWeb browser like Google Earth, you don't need HTML, though you can use it in the description balloons. But most mashups require HTML as the carrier.

Actually, you do still need HTML with Google Earth. Lacking forms or other intrinsic publishing mechanisms, KML is a format for read-only applications. Theoretically, one might put an HTML form in a placemark bubble and use that to modify the state of the resource behind the KML representation. Or javascript to modify the resource via XHR like many of us do with OpenLayers. Unfortunately, Google Earth doesn't surface forms in popups or fully support javascript. We'll have to wait for more convergence between Earth and Chrome, perhaps. Meanwhile, I'm making users hop between Earth and their browser. It's easy enough to cut and paste the placemark's KML text, but imagine what you could do with javascript and DOM-like access to the nodes in a KML doc.

Comments

Re: The HTML of the

Author: Brian Flood

hi sean

as of GE 5.0, you can use iFrames in the HTML balloons to get full access to the browser/DOM. it works well for editing data directly in the GE interface (even though its just a shortcut to what you're doing now)

cheers

brian

Re: The HTML of the

Author: Sean

Thanks for the tip, Brian. Better update, I guess, and get over the itchy feeling that iframes give me.

Re: The HTML of the

Author: Brian Flood

yea, wish I had better solution then iFrames. I remember there was some debate in the GE beta forums about native balloons supporting full html/js/flex and the outcome was a compromise using iFrames. the good part about it was at least you could rely on the balloon content acting in a well-known manner (caching, rendering, js issues etc) since its based on webkit (oddly enough, I think the embedded browser (below the globe) *may* use the machines default browser so you can end up with 2 different caches. go figure.)

in A2E, all the datasource feature resources have a small attribute editor built in that uses this technique for GE/GMaps (or anything really). the other cool part is since the iFrame is a real browser, all of your login/security goo works as it normally would

cheers

brian

Re: DOM-like access to the nodes in a KML doc

Author: Klokan Petr Pridal

Hi,

I just wanted to mention Google Earth API - where you have full access to the DOM of KML documents from JavaScript. Have a look at the documentation: http://code.google.com/apis/earth/articles/domintro.html.

and example of KML DOM manipulation via JavaScript:

http://earth-api-samples.googlecode.com/svn-history/r52/trunk/demos/dom-tree/index.html

Recently this was simplified by Roman Nurik (Google employee) via the Google Earth Utility Library (GEarthExtensions):

http://code.google.com/p/earth-api-utility-library/.

Of course, you need Google Earth Plugin for your web-browser - which is in this moment available for Windows and Mac (so no Linux yet :-().

Re: The HTML of the

Author: Sean

Right, this a good direction, but I can't tie my project to a proprietary plugin. And, again, this underlines how much we need HTML and javascript to complement KML.

Neo-paleo-geo

Sean Gillies

2009-09-30 00:00

body_html_template

Comments

Re: Neo-paleo-geo

Author: darkblue_b

oh you've got this backwards... Its *not* geography, nor is it GIS, when you no longer need to measure in inches, nor be reference book correct. There is a new dimension emerging of spatially enabled data on the web, which may be usefully loose, or in aggregate only partially correct.

Yes, we do need new vocabulary amongst technical practitioners - all the more so since you just wrote this post...