Better Python Practices for the GeoWeb
It pains me to see novices taught poor Python programming practices, and so I can't resist making a few corrections to this post. Processing and marking up data into KML is a simple task that can be used to teach better practices. Here are 3 easy ones:
abstract access to file-like resources with urllib;
iterators and generators;
XML templating with Genshi.
Here is a script that reads in a FIRMS text file and writes out a KML document named fires.kml:
import urllib2
from genshi.template import TemplateLoader

def collect_latest_fires():
    # urlopen gives file-like access to local and remote resources alike
    f = urllib2.urlopen('file:N_America.A2007275.txt')
    for line in f:
        fields = line.strip().split(',')
        lat = fields[0]
        lon = fields[1]  # "long" would shadow the built-in type
        confidence = fields[8]
        yield {
            'name': 'Wildland Fire at %s, %s' % (lon, lat),
            'description': 'Confidence: %s' % confidence,
            # KML coordinate tuples must not contain spaces
            'coordinates': '%s,%s' % (lon, lat)
            }

loader = TemplateLoader(['.'])
template = loader.load('template.kml')
stream = template.generate(collection=collect_latest_fires())

out = open('fires.kml', 'w')
out.write(stream.render())
out.close()
The input data file is hard-coded in the collect_latest_fires function, but it's trivial to calculate the name of the latest data file (a well-managed site would probably call it current.txt). You can get the input data via FTP or HTTP instead by using the proper URI scheme:
>>> f = urllib2.urlopen('ftp://ftp.example.org/current.txt')
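The same idea carries over to Python 3, where urllib2 was folded into urllib.request. Here is a minimal sketch, assuming a made-up sample file written to a temporary directory so that the snippet runs on its own. The read_lines helper is hypothetical and not part of the script above:

```python
import tempfile
import urllib.request
from pathlib import Path


def read_lines(uri):
    """Yield decoded lines from any URI that urlopen understands."""
    with urllib.request.urlopen(uri) as resource:
        for raw in resource:
            yield raw.decode("utf-8").rstrip("\r\n")


# Hypothetical sample data standing in for a FIRMS text file.
sample = Path(tempfile.mkdtemp()) / "current.txt"
sample.write_text("45.1,-110.2\n46.3,-111.4\n", encoding="utf-8")

# A local file is addressed with the file: scheme. An ftp: or http: URI
# would pass through the same function unchanged.
lines = list(read_lines(sample.as_uri()))
```

The calling code never needs to know where the data lives; only the URI changes.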
By virtue of the yield statement, collect_latest_fires is a generator, an iterator that keeps track of its own state and computes values on demand. Iterators are a key element in Python programming. Note that Python file-like objects are themselves iterators. If you needed an actual list of fires, you could create it from the generator like so:
>>> collection = list(collect_latest_fires())
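To see the "on demand" part in action, here is a small sketch written for Python 3. The parse_fires function and the three sample rows are invented for illustration; the point is only that a generator reads as much of its source as the consumer asks for:

```python
from itertools import islice


def parse_fires(lines):
    """Lazily turn comma-separated lines into (lat, lon) string pairs."""
    for line in lines:
        fields = line.split(",")
        yield fields[0], fields[1]


# Hypothetical input; any iterable of lines works, including an open file.
rows = iter(["45.1,-110.2", "46.3,-111.4", "47.5,-112.6"])

fires = parse_fires(rows)            # nothing has been parsed yet
first_two = list(islice(fires, 2))   # pulls only two lines from the source
remaining = list(rows)               # the third line was never touched
```

Because the third row is still sitting in the source iterator, a large input file never has to be held in memory all at once.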
This generator function is the bulk of the script. The remainder simply loads a Genshi template, generates an output markup stream using an iterator over the latest collection of fires, and then renders and writes the stream.
Here is the KML template:
<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://earth.google.com/kml/2.1"
     xmlns:py="http://genshi.edgewall.org/"
     >
  <Folder>
    <Style id="fireIcon">
      <IconStyle>
        <Icon>
          <href>http://maps.google.com/mapfiles/kml/pal3/icon38.png</href>
        </Icon>
      </IconStyle>
    </Style>
    <Placemark py:for="item in collection">
      <name py:content="item['name']">NAME</name>
      <styleUrl>#fireIcon</styleUrl>
      <description py:content="item['description']">
        DESCRIPTION
      </description>
      <Point>
        <coordinates py:content="item['coordinates']">
          LONG,LAT
        </coordinates>
      </Point>
    </Placemark>
  </Folder>
</kml>
Use well-engineered templating systems and/or serializers to create KML. Do not concatenate hand-coded strings of angle brackets. Templates like the one above can be run through your favorite XML tools to ensure that they are well-formed. You can't do that with markup buried in the string literals of your Python code. A good templating system also handles escaping and encoding for you.
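For the serializer route, here is a minimal Python 3 sketch that uses the standard library's xml.etree.ElementTree in place of Genshi. The placemark helper and the sample name are made up for illustration. It shows a name containing markup characters surviving the trip, which naive string concatenation would not guarantee:

```python
import xml.etree.ElementTree as ET


def placemark(name, lon, lat):
    """Build a Placemark element; the serializer escapes text as needed."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    point = ET.SubElement(pm, "Point")
    ET.SubElement(point, "coordinates").text = "%s,%s" % (lon, lat)
    return pm


# A name with characters that are special in XML.
xml_text = ET.tostring(
    placemark("Fire <5 km & growing", "-110.2", "45.1"),
    encoding="unicode",
)

# The output parses cleanly and round-trips the original name.
round_trip = ET.fromstring(xml_text).find("name").text
```

The angle bracket and ampersand are written out as entities, so the result stays well-formed without any manual escaping.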
Finally, since the FIRMS source data changes only once a day, you should run the script above no more than once a day. Don't run it as a CGI script on every request. Transform the source data to KML and write it to a file under your web server. Configure your web server to provide the application/vnd.google-earth.kml+xml content type for the KML and also set the HTTP Expires header to the modification time of the file plus 24 hours. That's a recipe that scales.
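The web server would normally compute these headers itself, but the arithmetic is easy to sketch in Python 3. The expires_header helper and the temporary stand-in file are assumptions made for illustration:

```python
import os
import tempfile
from email.utils import formatdate

ONE_DAY = 24 * 60 * 60  # seconds


def expires_header(mtime):
    """Format a modification time plus 24 hours as an HTTP date in GMT."""
    return formatdate(mtime + ONE_DAY, usegmt=True)


# Hypothetical stand-in for the generated fires.kml.
fd, path = tempfile.mkstemp(suffix=".kml")
os.close(fd)

headers = {
    "Content-Type": "application/vnd.google-earth.kml+xml",
    "Expires": expires_header(os.path.getmtime(path)),
}
os.remove(path)
```

With these two headers in place, clients and caches can reuse the file for a full day without contacting the server again.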
Comments
Re: Better Python Practices for the GeoWeb
Author: Kristian
Bookmarked. Now I'm just waiting for part two, "TurboGears and Shapely vs. GeoDjango" ;)
Re: Better Python Practices for the GeoWeb
Author: Sean
TurboGears and Django overflow with good practices, so no need for that sequel.