I don't know a nicer way to say this: if your Python software requires you to
routinely delete objects, it is letting you down. I'm looking at you, ArcPy
cursors:
    import arcpy

    cur = arcpy.SearchCursor("roads", '"TYPE" <> 4')
    for row in cur:
        print("Name: %s, CFCC code: %s" % (row.NAME, row.CFCC))
    del cur, row
The ArcPy cursor locks resources and this lock is removed (evidently, or
update 2011-02-02: maybe not so evidently, see the comments) in the cursor
object's __del__ method. This is a less than ideal design because, as stated
in the Python docs, del ob does not directly call ob.__del__(): it only
removes one reference to ob, and __del__ runs only when the last reference is
gone. To their credit, the ArcPy docs acknowledge this issue:
When working in a Python editor, such as PythonWin, you may need to clean up
object references to remove dataset locks set by cursors. Use the gc (garbage
collection) module to control when unused objects are removed and/or
explicitly delete references within your script.
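To make the problem concrete, here's a minimal sketch of del failing to finalize an object. The Resource class and the events list are made up for illustration and have nothing to do with ArcPy internals, which I can't see.

```python
events = []


class Resource(object):
    """Hypothetical stand-in for an object that releases a lock in __del__."""

    def __del__(self):
        events.append("finalized")


ob = Resource()
alias = ob  # a second reference, e.g. one kept by an editor or debugger

del ob  # removes the name 'ob' only; the object is still alive
still_locked = (events == [])  # __del__ has not run yet

del alias  # last reference gone; CPython finalizes here, others may wait
```

On CPython the second del triggers finalization right away because of reference counting. Other implementations make no such promise, which is exactly why it's a poor place to release a lock.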
You're quite seriously failing your users if you make them turn to manual
intervention in the garbage collection cycle, but the problem of finalizing the
cursor state remains. What would be better? There are two obvious ways to go:
emulate Python files (or the Python DB API), or use PEP 343's with statement.
In the first approach, one would add close methods to cursors that
immediately and explicitly free up all locked resources. The end of the script
above becomes:
    for row in cur:
        print("Name: %s, CFCC code: %s" % (row.NAME, row.CFCC))

    # This is too optimistic - finalizes only eventually, if ever
    # del cur, row

    # Better - finalizes immediately, let garbage collection delete it
    # normally
    cur.close()
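One caveat: if the loop raises, a bare cur.close() at the end never runs. Here's a rough sketch of the usual remedy, using a hypothetical FakeCursor stand-in (ArcPy cursors have no close method as far as this post knows, and the rows are made-up sample data).

```python
class FakeCursor(object):
    """Hypothetical stand-in for a cursor that has grown a close() method."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.locked = True  # pretend the data source is locked on creation

    def __iter__(self):
        return iter(self._rows)

    def close(self):
        # Release the pretend lock immediately; harmless if called twice
        self.locked = False


cur = FakeCursor([("Main St", "A41"), ("Oak Ave", "A41")])
try:
    for name, cfcc in cur:
        print("Name: %s, CFCC code: %s" % (name, cfcc))
finally:
    # Runs whether the loop finishes normally or raises
    cur.close()
```

This is the same discipline Python programmers already apply to files and DB API connections.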
The with statement is more powerful, but more complicated. The problem PEP
343's with statement tries to solve is explained by Fredrik Lundh [effbot.org]:
As most other things in Python, the with statement is actually very simple,
once you understand the problem it’s trying to solve. Consider this piece of
code:
    set things up
    try:
        do something
    finally:
        tear things down
Here, “set things up” could be opening a file, or acquiring some sort of
external resource, and “tear things down” would then be closing the file, or
releasing or removing the resource. The try-finally construct guarantees that
the “tear things down” part is always executed, even if the code that does
the work doesn’t finish.
It's a generalization of the problem an ArcPy cursor user faces: locking a data source during a session and returning the data source to its proper state at the end of the session. With context management, ArcPy could let a user write safe, foolproof code like this:
    with arcpy.SearchCursor("roads", '"TYPE" <> 4') as cur:
        for row in cur:
            print("Name: %s, CFCC code: %s" % (row.NAME, row.CFCC))
    # That's all - cursor finalizes when the 'with' block ends
Even without changes in ArcPy, a user can start dodging cursor reference trouble right now by writing a few lines of adapter code.
    class make_smarter(object):

        def __init__(self, cursor):
            self.cursor = cursor

        def __enter__(self):
            # The object bound by 'as' is the wrapped cursor itself
            return self.cursor

        def __exit__(self, type, value, traceback):
            # self.cursor.close() would be better
            self.cursor.__del__()

    with make_smarter(arcpy.SearchCursor("roads", '"TYPE" <> 4')) as cur:
        for row in cur:
            print("Name: %s, CFCC code: %s" % (row.NAME, row.CFCC))
The make_smarter class is a "context guard" for the cur object. Its __enter__ method is called when the with block begins and its __exit__ method is called when the block ends, for whatever reason. Python's file object, since version 2.5, implements this very same protocol.
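The "for whatever reason" part is easy to check with a plain file. This is only a small sketch against a throwaway temp file, with a simulated error standing in for a real failure in the middle of a session.

```python
import os
import tempfile

# Create a scratch file to write to
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    with open(path, "w") as f:
        f.write("roads\n")
        raise RuntimeError("simulated failure mid-block")
except RuntimeError:
    pass

# The file's __exit__ ran even though the block raised
closed_after_error = f.closed

os.remove(path)
```

The file is closed by the time the exception reaches the except clause, with no del and no gc calls in sight.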
A context guard for writing data with ogr could be equally useful and hide the unfortunately named method you call to flush data to disk.
    class make_smartogr(object):

        def __init__(self, layer):
            self.layer = layer

        def __enter__(self):
            return self.layer

        def __exit__(self, type, value, traceback):
            # Flush to disk when the block ends, error or not
            self.layer.Destroy()
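If writing a class per library feels like too much ceremony, the standard library's contextlib.contextmanager can express the same guard generically. The finalizing helper and the FakeLayer class below are hypothetical names of my own, a sketch rather than anything ogr or ArcPy provides.

```python
from contextlib import contextmanager


@contextmanager
def finalizing(obj, method_name):
    """Yield obj, then call its named zero-argument method when the block ends."""
    try:
        yield obj
    finally:
        getattr(obj, method_name)()


class FakeLayer(object):
    """Hypothetical stand-in; it only records whether Destroy() was called."""

    def __init__(self):
        self.destroyed = False

    def Destroy(self):
        self.destroyed = True


layer = FakeLayer()
try:
    with finalizing(layer, "Destroy") as lyr:
        raise ValueError("simulated failure while writing")
except ValueError:
    pass
```

For objects that already have a close method, contextlib.closing does the same job with no helper at all.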
Take it from me, because I've felt the pain during development of Shapely: relying on an object's __del__ method, as ArcPy does (or does not, see the comments; maybe it counts references, which I equally recommend against), will burn you sooner or later. The with statement adds dependability and safety to even the most straightforward data processing script.
Comments
Re: Command line JSON
Author: Seth G
It would be great to have a geojson validator (command line or web) along the lines of http://www.jsonlint.com/
I'm never sure if CRS (or SRS?) is required - GDAL seems to crash if it is missing. It would be nice to have a tool to check if there are missing coordinate values etc. (although loading with Shapely helps with the latter).
Re: Command line JSON
Author: Sunng
Python's json module can be used as a formatting tool.
Recently I found a command-line query tool for JSON called jshon, which is very helpful.
http://kmkeen.com/jshon/index.html
Re: Command line JSON
Author: Howard Butler
Seth G,
File a bug on GDAL if its GeoJSON module is crashing in cases with or without SRS info. It shouldn't be required, and if it is crashing, it needs to be fixed.
Howard