Why not Atom-powered repositories?

Sean Gillies

2008-11-21 00:00

Curation of digital projects and media seems to be in my future. I'm not a repository critter of any kind, and have much to learn, but I know for certain what kind of architecture I want to use. Take the indexable, mashable feed (Atom, naturally) representations of collections that Leslie Carr writes about, pair them with upload of Atom-wrapped items to the same collections à la SWORD, and then add hypertext-constrained methods for item and metadata modification and deletion. In a word: AtomPub. Less like EPrints, DSpace, or Fedora; more like Google's data APIs, more like what Peter Keane is doing with DASe. Why not?

Shapely 1.0.11 and onward

Sean Gillies

2008-11-20 00:00

If I were superstitious, I wouldn't post a link to http://pypi.python.org/pypi/Shapely/1.0.11. Tests pass, the release candidate checked out, it's good to go. There's a Windows installer (containing GEOS 3.0.0) as well as a generic sdist.

GEOS 3.1 is now on the horizon. The next minor version of Shapely – presuming all its users haven't been shed – will use prepared geometries to make iterative operations rip, and we'll continue to refactor and improve the code. Integration with NumPy is not as efficient as it could be, and I'd like to switch to a cleaner implementation of memoization using a decorator. I've been asked if there will ever be a version of Shapely that works with Google's App Engine. Unless Google builds GEOS (or something like it that could be accessed through ctypes) into the environment, such a thing would require rather a lot of Python programming. I suspect an automated translation of JTS from Java to Python would produce sub-optimal code. In other words: doubtful, but patches are gratefully accepted.
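To illustrate the decorator idea, here is a minimal sketch of memoization in plain Python. It is not Shapely's code: `memoize`, `expensive_area`, and the `calls` list are hypothetical names used only to show the caching pattern, and the sketch assumes hashable positional arguments.

```python
import functools


def memoize(func):
    """Cache results of func, keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Compute once per distinct argument tuple, then reuse the result
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


calls = []  # records each real computation, for illustration only


@memoize
def expensive_area(width, height):
    """Hypothetical stand-in for a costly computation."""
    calls.append((width, height))
    return width * height


first = expensive_area(3, 4)
second = expensive_area(3, 4)  # served from the cache, no new computation
```

The appeal of a decorator here is that the caching logic lives in one place instead of being repeated inside every method that needs it.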

HTTP caching explained

Sean Gillies

2008-11-19 00:00

Even if you already know, you'll likely appreciate Ryan Tomayko's explanation of the things caches do. Show it to your boss, your marketing people, your GIO. After Bullying Awareness Week ends, I'm going to come back and write about GIS services in this context.

Maps, France 1944

Sean Gillies

2008-11-18 00:00

Yesterday, I received in the mail my grandfather's cloth escape maps of France, Holland, Belgium, Luxembourg, and Germany. He flew a Piper Cub for the US Army, mainly shuttling brass between England and France. Never used, the maps went from a pocket in his jacket to an envelope in a foot locker; they're in great condition.

http://farm4.static.flickr.com/3185/3041406288_2d4eaf05bd_d.jpg

Above is a 1:2,000,000 scale map entitled "Zones of France", second edition, and dated "MAR 44". Silk or rayon. I know little about the origins of this one.

http://farm4.static.flickr.com/3042/3040723569_f403e6e398_d.jpg

It features 4 hand-drawn "placemarks" south of Cherbourg. Positions of forces? Airfields supplied with beer?

http://farm4.static.flickr.com/3292/3039466501_5c36eab3f0_d.jpg

I've quickly found much more information about the 1:1,000,000 scale 43 Series map above. It is most likely rayon, as silk was then in short supply. Map C on one side, D on the reverse.

I'll be having these framed soon.

Comments

Re: Maps, France 1944

Author: Guillaume

That's a very interesting set of maps indeed! I think the hand-drawn placemarks are German troop positions, as Canisy, south of Saint-Lô, was an important part of the Atlantic Wall. Saint-Jean de Daye was an important victory for the Allied forces, on the 7th of July 1944. Avranches was a major battle too, but later, around the 31st. Thanks for sharing this!

Re: Maps, France 1944

Author: Sean

Thanks for the context, Guillaume.

Re: Maps, France 1944

Author: Patrick

Interesting post. Just to let you know that this kind of map, digitally produced on taffeta (artificial silk), is being used by the Netherlands armed forces (SF and pilots) today in Afghanistan. The maps still serve the same purpose as they did 60 years ago during WWII. There is a bit more survival information printed on them. Regards

Re: Maps, France 1944

Author: Jack

Speaking of survival, take a look at the DoD Evasion Chart (EVC) "produced on a strong, moisture-resistant polyester material (spin-bonded olefin)"--purportedly an important part of Air Force Captain Scott O'Grady's survival after being shot down in Bosnia: http://www.nga.mil/portal/site/nga01/index.jsp?epi-content=GENERIC&itemID=17386591e1b3af00VgnVCMServer23727a95RCRD&beanID=1629630080&viewID=Article Note also the background note concerning the EVC: "The history of the [EVC] goes back to charts printed on rayon during the 1940s, and to cloth 'blood chits' printed in various languages that identified American airmen and offered rewards for safe passage during World War II and the Korean and Vietnam conflicts."

Re: Maps, France 1944

Author: Sean

Anybody recognize, or better yet: have a reference for, the symbology in the "Zones of France" detail image?

Re: Maps, France 1944

Author: Guillaume

You have the three main zones: the annexed zone in the north-east, earmarked for future colonization by German settlers; the occupied zone in the north and west, including Paris; and the "free" zone in the south, which stayed under the direct French government of Vichy until November 11th, 1942 (full occupation). I guess the blue lines along the coasts show restricted security zones, meant to guard against a sea attack by the Allies. There is some good information (in French...) here: http://fr.wikipedia.org/wiki/Zone_libre Regards, Guillaume

Re: Maps, France 1944

Author: Sean

Guillaume, I'm thinking specifically about the placemarks. I think I've found an explanation at http://www.globalsecurity.org/military/library/policy/army/fm/101-5-1/f545-c4a.htm. Unless somebody used this map to track the location of a particular friendly unit in the campaign, you're right about it being pre-landing German positions. Here's a map showing German units arrayed diagonally across the peninsula on 6 June 1944: mp1.jpg.

Re: Maps, France 1944

Author: Guillaume

Indeed, but it depends heavily on the precise date on which the placemarks were drawn. Things moved quickly during that summer, and what was a German position soon became an Allied one!

Re: Maps, France 1944

Author: MatthieuR

I confirm all the information given by Guillaume, and I also agree that these are interesting documents in very good condition! Keep these maps with you; they may become really sought after in a couple of years!

Shapely 1.0.10

Sean Gillies

2008-11-18 00:00

Update (2008-11-19)

Still having problems with 1.0.10 + Ubuntu (Debian) libgeos-c1 2.2.3-*, but at least I understand what's broken and have a 1.0.11 release candidate. You can try it like so:

$ easy_install http://trac.gispython.org/lab/attachment/ticket/178/Shapely-1.0.11.tar.gz?format=raw

Well, let's try that again: http://pypi.python.org/pypi/Shapely/1.0.10. Somewhere along the way I became too lax about testing compatibility with GEOS 2.2.3, and it was broken in the 1.0.8 release. I've mentioned before how handily zc.buildout makes isolated repeatable environments, and am now using this one with GEOS 2.2.3.

Upload of Windows installers to PyPI seems to be broken at the moment. The 1.0.8 installer bundles GEOS 3.0 and remains free of the recent problem: http://pypi.python.org/packages/2.5/S/Shapely/Shapely-1.0.8.win32.exe.

Shapely 1.0.9

Sean Gillies

2008-11-17 00:00

Shapely 1.0.9 works with a MacPorts libgeos. No need to upgrade otherwise.

Comments

Re: Shapely 1.0.9

Author: Allan Doyle

Cool. I hadn't actually realized MacPorts had libgeos. What other goodies have I been missing? ... goes to poke around in MacPorts...

Re: Shapely 1.0.9

Author: Sean

Let me know how it goes.

GeoJSON is not hypermedia

Sean Gillies

2008-11-13 00:00

The GeoJSON working group chose to omit links from the specification (outside of coordinate reference systems). Consequently, GeoJSON 1.0 is not a hypermedia format. Without links there are no levers of application state to be seized, no hypertext constraint, and therefore no REST.

Consider, as an example of a hypermedia format, Atom and extensions in AtomPub: "alternate", "related", "self", and "edit" links are designed to satisfy REST's hypertext constraint and permit hypertext to be used as the engine of application state. Without links of the edit kind (especially), HTTP Geo-CRUD protocols using GeoJSON couple clients and servers together, an undesirable property in a system like the Web. This is not to say that coupling is going to kill your applications, just that the components don't have much freedom to evolve (think migrate or upgrade) separately.

Are there any good linking options for JSON? Subbu Allamaraju explores that question here and here and floats an object not unlike the only link example in GeoJSON. Both are inspired by atom:link and HTML's link.
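To make the idea concrete, here is a rough sketch of what an atom:link-style member on a feature might look like. This is an assumption for illustration only: the `links` member, its `rel`/`href`/`type` keys, the example.com URLs, and the `find_link` helper are all hypothetical and are not defined by GeoJSON 1.0.

```python
import json

# A GeoJSON feature carrying a hypothetical "links" member modeled on atom:link
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-105.08, 40.59]},
    "properties": {"title": "Example place"},
    # Not part of GeoJSON 1.0: an illustrative extension only
    "links": [
        {"rel": "self", "href": "http://example.com/places/1",
         "type": "application/json"},
        {"rel": "edit", "href": "http://example.com/places/1/edit",
         "type": "application/json"},
    ],
}


def find_link(obj, rel):
    """Return the href of the first link with the given rel, or None."""
    for link in obj.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# A client follows the "edit" link it was given instead of building a URL itself
edit_url = find_link(feature, "edit")
serialized = json.dumps(feature)
```

A client written this way depends only on the meaning of the "edit" relation, so the server would be free to change its URL layout without breaking it.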

Should GeoJSON become a hypermedia format? I don't know the answer to that, but I think it's more likely that GeoJSON geometry (and maybe feature) objects will find their way into other yet-to-emerge JSON-based media types.

Comments

Re: GeoJSON is not hypermedia

Author: Guillaume

I've never thought of GeoJSON as a hypermedia format. IMHO, it's an interchange format, like the old MIF/MID, designed for web apps and describing geo features, easy to generate on the server side and easy to read on the client side. If hypermedia is needed somewhere, couldn't GeoJSON be embedded in an AtomPub stream? Or couldn't the AtomPub entry contain a link to a GeoJSON resource?

Re: GeoJSON is not hypermedia

Author: Subbu Allamaraju

@Guillaume: IMO, the question isn't whether a format like GeoJSON is a hypermedia format or not. To improve loose-coupling and discoverability of the contract, it is necessary to explore ways to enhance representations with runtime linking, and by adding links to JSON, you can make it a hypermedia format. Similarly, one might see PDF as a binary format, but it is actually a format that allows hyperlinking, and hence is a hypermedia format.

Re: GeoJSON is not hypermedia

Author: Sean

Thanks for the comment, Subbu. Guillaume, I'm seeing references to "RESTful" APIs using GeoJSON (I'd rather not single any one out) and felt it worth pointing out that such a thing is technically not possible, and specifically what GeoJSON lacks.

Nearest book

Sean Gillies

2008-11-13 00:00

A book excerpt meme is propagating through Python blogs. Why not?

If it should become necessary to reconsider the whole matter

William Strunk Jr. and E. B. White, The Elements of Style, 4th edition. The phrase demonstrates how to recast awkward usage of a possessive participle. I reread this book about once a season in my never-ending quest to suck less as a writer.

The viral part:

Grab the nearest book.
Open it to page 56.
Find the fifth sentence.
Post the text of the sentence in your journal along with these instructions.
Don’t dig for your favorite book, the cool book, or the intellectual one: pick the CLOSEST.

Python logging

Sean Gillies

2008-11-11 00:00

Update (2009-09-29): see http://plumberjack.blogspot.com/2009/09/python-logging-101.html

A friend at the local Park Service office mentioned to me that he's become a Python user and has been looking at some of my published code examples. I'm inspired to post more of these, and more that are useful to the ESRI ArcGIS scripting user. Python serves the ESRI community as a replacement for Avenue, but is a far richer language and platform. Today's tip: excellent application logging and how to avoid using print statements in your production code.

I tripped myself using print statements recently, and have seen it in the code of others. While not a mortal sin, it does reduce the code's potential for reuse. A print statement might be tolerable when you're running the code at a shell prompt, but what if you wanted to execute the same code in a service? Better not to fill your service logs (or wherever stdout goes) with unnecessary debugging noise, yes? Here's where Python's logging module steps in to help: your information-processing classes write messages to a logging interface, any Python application then configures the logger appropriately. Log messages can be sent to single files on disk, rotating log files, sent to SMTP servers -- you could probably even create your own XMPP handler.
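As a rough sketch of that division of labor, the reusable code below only writes to a logger, and the application decides where messages go, here a rotating log file. The logger name `file-patcher-demo`, the `process` function, and the temporary log path are assumptions made up for this illustration.

```python
import logging
import logging.handlers
import os
import tempfile

# Reusable code: writes to a named logger and knows nothing about destinations
log = logging.getLogger('file-patcher-demo')  # hypothetical logger name


def process(name):
    """Hypothetical unit of work that reports progress through the logger."""
    log.info('Processing %s.', name)


# Application code: route the same messages to a rotating file on disk
logfile = os.path.join(tempfile.mkdtemp(), 'service.log')
handler = logging.handlers.RotatingFileHandler(
    logfile, maxBytes=100000, backupCount=3)
handler.setFormatter(
    logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s'))
log.addHandler(handler)
log.setLevel(logging.INFO)

process('example.xml')
handler.flush()
```

Swapping `RotatingFileHandler` for `logging.handlers.SMTPHandler` or a plain `StreamHandler` would change the destination without touching `process` at all.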

Say, for example, you have a class that patches elements into XML files (inspired by Dave Bouwman's post). Using a Python logger instead of a print statement requires only 2 extra lines of code (zero if you use the root logger):

import getopt
import glob
import logging
import sys
from lxml import etree

# Create a logger
log = logging.getLogger('file-patcher')

# Example of reusable code
class FilePatcher(object):
    """
    Patch edit permissions into an XML configuration file and write new
    files to disk.
    """
    def __call__(self):
        for infilename in glob.glob('*.xml'):
            log.info('Opening file %s.', infilename)
            tree = etree.parse(infilename)
            root = tree.getroot()
            for perms in root.xpath(".//property[@name='permissions']"):
                if not perms.xpath("element[@value='Edit']"):
                    e = etree.SubElement(perms, 'element')
                    e.attrib['value'] = 'Edit'
                    log.info('Patched perms: %s.', str(perms))
            outfilename = 'patched-%s' % infilename
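            # etree.tostring() returns bytes, so write in binary mode;
            # the with block closes the file even if the write fails
            with open(outfilename, 'wb') as fout:
                fout.write(etree.tostring(root, pretty_print=True))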
            log.info('Wrote file %s.', outfilename)

Now, we can make a script that uses this class in a verbose (all logged info) mode or a quieter (logged errors only) mode:

# Script
if __name__ == '__main__':

    verbose = False

    # Get command line options and args
    opts, args = getopt.getopt(sys.argv[1:], 'v')
    for o, a in opts:
        if o == '-v':
            verbose = True

    # Get the file patching logger
    log = logging.getLogger('file-patcher')

    # Logging configuration, to the console (StreamHandler defaults to stderr)
    console = logging.StreamHandler()
    log.addHandler(console)

    if verbose:
        log.setLevel(logging.INFO)
    else:
        log.setLevel(logging.ERROR)

    # Process data
    patch_files = FilePatcher()
    patch_files()

Duke University's GeoEco project has more examples of professional logging, many in an ArcGIS scripting context.

Comments

Re: Python logging

Author: timmie

Hi! Thanks for the post. I recently discovered that this module exists but have not yet managed to use it, mainly because I still use many "prints" for fast debugging on the command line. I really do not know how to get around this. I am not a full-time software developer, and professional debugging/testing seems to be a lot of overhead at my current stage of knowledge.

> I'm inspired to post more of these, and more that are useful to the ESRI ArcGIS scripting user.

I would be enlightened to see more of these posts. As outlined above, gentle introductions to good coding practices can help a lot:

* Fast, pragmatic debugging
* Using tests in development of scientific / geospatial scripts.

Kind regards.

Re: Python logging

Author: oblivion

I like to do this:

    import logging
    log = logging.getLogger(__name__)

Then, log messages are always identified by their sub-packages of origin, and they can even be filtered on that basis. Also, if you do this:

    log.info('Opening file %s in mode %s.', infilename, mode)

then the runtime penalty is tiny, since the string substitution does not occur if the log record is filtered, but I still prefer to do this:

    log.info('Opening file %s in mode %s.' %(infilename, mode))

The substitution definitely occurs only once, no matter how many handlers exist. I simply avoid logging anything in runtime-critical sections of code.
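The difference between the two calling styles can be checked with a small experiment. The sketch below is only an illustration: the `Expensive` class and the `lazy-demo` logger name are hypothetical, and the logger is set to ERROR so that INFO messages are filtered out.

```python
import logging


class Expensive(object):
    """Hypothetical object that counts how often it is turned into a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return 'expensive value'


log = logging.getLogger('lazy-demo')  # hypothetical logger name
log.addHandler(logging.NullHandler())
log.propagate = False
log.setLevel(logging.ERROR)  # INFO messages are filtered out

# Arguments passed separately: formatting is deferred, and skipped when filtered
lazy = Expensive()
log.info('Value: %s', lazy)
lazy_calls = lazy.str_calls

# Formatting with % happens before log.info() is even called
eager = Expensive()
log.info('Value: %s' % eager)
eager_calls = eager.str_calls
```

With the message filtered, the first style never converts the object to a string, while the second always pays for the substitution.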

Re: Python logging

Author: Allan Doyle

My contribution (works on Python 2.5, haven't done much Python hacking lately):

#! /usr/bin/env python

"""
debugging - inspired by the debugging.h and error_handling.c modules
            of OpenMap's C-based "toollib" that I originally wrote
            in the 90's while at BBN.


Debug([word])

        Looks in the current environment (via getenv) to look for
        an environment variable called 'DEBUG'.

        If there is none, then it returns False
        If there is one, and it was called with an argument, then
        if the DEBUG variable contains that argument, it returns True
        otherwise, it returns false. Note that it does this with
        'word' boundaries.

        Examples:

        Assume no DEBUG environment variable is set
                Debug() --> False
                Debug("foo") --> False
                Debug("bar") --> False

        Assume DEBUG is set to "foo foobar"

                Debug() --> True
                Debug("foo") --> True
                Debug("bar") --> False

DebugSet([word])

        Allows you to add strings to the set that are in the DEBUG
        environment variable.

        Examples:

        Assume DEBUG is set to "foo foobar"
                DebugSet("bar")
                Debug("bar") now returns True

DebugUnset([word])

        If called with no arguments, it makes all subsequent calls to Debug
        return False

        If called with a word, it makes all subsequent calls to Debug with
        that word return False

DebugMessage(message, [level])

        message is a string
        level is one of "CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"

        Prints the message to stderr, along with date, time, level, and
        source file name and line number where the call was made.
"""

# ------------------------------------------------------------------------
#
# Copyright (c) 2006 Allan Doyle
#
#  Permission is hereby granted, free of charge, to any person
#  obtaining a copy of this software and associated documentation
#  files (the "Software"), to deal in the Software without
#  restriction, including without limitation the rights to use, copy,
#  modify, merge, publish, distribute, sublicense, and/or sell copies
#  of the Software, and to permit persons to whom the Software is
#  furnished to do so, subject to the following conditions:
#
#  The above copyright notice and this permission notice shall be
#  included in all copies or substantial portions of the Software.
#
#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
#  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
#  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
#  NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
#  HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
#  WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
#  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
#  DEALINGS IN THE SOFTWARE.
#
# ------------------------------------------------------------------------


import inspect
import logging
import os
import os.path
import sys

## Set up the Debug part of this code...

__vars = {}
__debug = os.getenv("DEBUG")

if __debug:
    __vars[' '] = ' '                   # we use ' ' as a marker for DEBUG
                                        # being set, even if it has no content

    try:
        # each separate string in DEBUG gets used as a key in a dictionary
        # later on, this makes it easy to tell if that string was set
        for v in os.getenv("DEBUG").split():
            __vars[v] = v

    except AttributeError:
        pass
    except TypeError:
        pass

def Debug(name=" "):
    """
    Checks to see if the "DEBUG" environment variable is set, if
    an optional string is passed in, if that string is set in the
    "DEBUG" environment variable.
    """
    global __vars
    return name in __vars

def DebugSet(name=" "):
    """
    If called with no argument, causes subsequent calls to Debug() to
    return True. If called with an argument, causes subsequent calls
    to Debug() with the same string to return True
    """

    global __vars
    global __debug

    __debug = True
    __vars[name] = name
    return(True)

def DebugUnset(name=" "):
    """
    If called with no argument, causes subsequent calls to Debug() to
    return False. If called with an argument, causes subsequent calls
    to Debug() with the same string to return False.
    """
    global __vars
    global __debug

    if name == " ":
        __debug = False
        __vars = {}
        return

    try:
        del __vars[name]

    except KeyError:
        pass

    return

## Set up the DebugMessage part of this. We use the logging module,
## but send everything to stderr. Someday, this could get expanded
## to also have a logfile, etc.

logging.basicConfig(level=logging.DEBUG,
                    format='%(levelname)-8s %(message)s',
                    stream=sys.stderr)


levels = {'CRITICAL':logging.critical,
          'ERROR':logging.error,
          'WARNING':logging.warning,
          'INFO':logging.info,
          'DEBUG':logging.debug}


def DebugMessage(msg, level="DEBUG"):
    """
    Produces nicely formatted output to stderr.
    If called with the wrong level, it complains, and behaves
    as though the level was "ERROR"
    If called with no level, it uses "DEBUG"
    The allowed levels are
      "CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"
    """

    current = inspect.currentframe()
    outer = inspect.getouterframes(current)

    try:
        levels[level]('%s - %4d: %s' %
                      (os.path.basename(outer[1][1]),
                       outer[1][2],
                       msg))

    except KeyError:
        DebugMessage('DebugMessage() called with unknown level: %s' % level)
        logging.error('%s - %4d: %s' %
                      (os.path.basename(outer[1][1]),
                       outer[1][2],
                       msg))
    del outer
    del current

__version__ = '$Id$'
if Debug('version'): print(__version__)

Then, to use it:


# Grab some of the debugging stuff...
from debugging import Debug, DebugSet, DebugMessage

# Put this in every module. Later on, you can 'setenv DEBUG version'
# and your code will print out the version of every module when it runs
__version__ = '$Id$'
if Debug('version'): print(__version__)

# Now you can do stuff like

   if Debug("time"): print("a. %.2f %s" % (time.time()-self.start, path))

# or
   if of.GetLayerCount() == 0:
        DebugMessage("No layers in %s" % filename)

# or
   if Debug("filename") or Debug("verbose"): print("fileinfo %s" % filename)

Re: Python logging

Author: Sean

Wow, my blog comments do not render code well at all. Good tips, oblivion and Allan. Testing and debugging are big topics. To get started with debugging Python, you could do a lot worse than to add

import pdb; pdb.set_trace()

before a troubled section of code. Run your script and set_trace() gives you a Python prompt at that point. From the prompt you can inspect all objects in scope, step through statements, and go up (and back down) the stack of execution frames. Jeremy Jones's pdb tutorial explains all you need to begin.

Re: Python logging

Author: Kyle VanderBeek

I generally recommend against creating logger objects at load time (which will happen due to where your library's getLogger() call is). I do it in the __init__ of an object, and use both the module name and object name together to form a proper category hierarchy:

def __init__(self):
    self._log = logging.getLogger('%s.%s' % (__name__, self.__class__.__name__))

This idiom gives you a dot-separated hierarchy of logger categories, allowing you to configure filters and handlers to send different categories to different destinations, mute some module's log messages, or turn up and down levels in a more granular fashion. Lastly, you're working too hard configuring the log. Look at logging.basicConfig(). And, actually, you should be configuring the root logger (by calling logging.getLogger() with no arguments), not one named with a particular category, in your main program. This will become important as your project grows to multiple modules.
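Putting those suggestions together might look something like the sketch below. It is an illustration under assumptions, not the post's script: the `configure` helper and the stripped-down `FilePatcher` body are hypothetical, and note that `logging.basicConfig()` does nothing if the root logger already has handlers.

```python
import logging
import sys


class FilePatcher(object):
    """Stripped-down stand-in for the class in the post."""

    def __init__(self):
        # Category is "<module>.<class>", giving a dot-separated hierarchy
        self._log = logging.getLogger(
            '%s.%s' % (__name__, self.__class__.__name__))

    def __call__(self):
        self._log.info('Patching files.')


def configure(verbose=False):
    """Hypothetical helper: configure the root logger from the main program."""
    level = logging.INFO if verbose else logging.ERROR
    logging.basicConfig(level=level, stream=sys.stderr,
                        format='%(name)s %(levelname)s %(message)s')


configure(verbose=True)
patcher = FilePatcher()
logger_name = patcher._log.name

# Turn one whole category down without touching the rest of the program
logging.getLogger(__name__).setLevel(logging.WARNING)
effective = patcher._log.getEffectiveLevel()
patcher()  # the INFO message is now muted by the module-level setting
```

Because the class logger has no level of its own, it inherits the WARNING level set on its parent category, which is what makes per-module muting possible.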

Re: Python logging

Author: Sean

Great tips, Kyle. Thanks.

Will the real "GeoWeb" please stand up, part 2

Sean Gillies

2008-11-06 00:00

One of the "GeoWebs" is predicated on GML, another eschews it. There's some brand confusion here.

Comments

Re: Will the real GeoWeb please stand up, part 2

Author: mpd

Splitters!

Re: Will the real GeoWeb please stand up, part 2

Author: Sean

Just realized I'd forgotten the scare quotes around "GeoWeb".