Python logging

Sean Gillies

2008-11-11 00:00

Update (2009-09-29): see http://plumberjack.blogspot.com/2009/09/python-logging-101.html

A friend at the local Park Service office mentioned to me that he's become a Python user and has been looking at some of my published code examples. I'm inspired to post more of these, and more that are useful to the ESRI ArcGIS scripting user. Python serves the ESRI community as a replacement for Avenue, but is a far more rich language and platform. Today's tip: excellent application logging and how to avoid using print statements in your production code.

I tripped myself using print statements recently, and have seen it in the code of others. While not a mortal sin, it does reduce the code's potential for reuse. A print statement might be tolerable when you're running the code at a shell prompt, but what if you wanted to execute the same code in a service? Better not to fill your service logs (or wherever stdout goes) with unnecessary debugging noise, yes? Here's where Python's logging module steps in to help: your information-processing classes write messages to a logging interface, any Python application then configures the logger appropriately. Log messages can be sent to single files on disk, rotating log files, sent to SMTP servers -- you could probably even create your own XMPP handler.

Say, for example, you have a class that patches elements into XML files (inspired by Dave Bouwman's post). Using a Python logger instead of a print statement requires only 2 extra lines of code (zero if you use the root logger):

import getopt
import glob
import logging
import sys
from lxml import etree

# Create a logger
log = logging.getLogger('file-patcher')

# Example of reusable code
class FilePatcher(object):
    """
    Patch edit permissions into an XML configuration file and write new
    files to disk.
    """
    def __call__(self):
        for infilename in glob.glob('*.xml'):
            tree = etree.parse(infilename)
            log.info('Opening file %s.', infilename)
            root = tree.getroot()
            for perms in root.xpath(".//property[@name='permissions']"):
                if not perms.xpath("element[@value='Edit']"):
                    e = etree.SubElement(perms, 'element')
                    e.attrib['value'] = 'Edit'
                    log.info('Patched perms: %s.', str(perms))
            outfilename = 'patched-%s' % infilename
            fout = file(outfilename, 'w')
            fout.write(etree.tostring(root, pretty_print=True))
            log.info('Wrote file %s.', outfilename)

Now, we can make a script that uses this class in a verbose (all logged info) mode or a quieter (logged errors only) mode:

# Script
if __name__ == '__main__':

    verbose = False

    # Get command line options and args
    opts, args = getopt.getopt(sys.argv[1:], 'v')
    for o, a in opts:
        if o == '-v':
            verbose = True

    # Get the file patching logger
    log = logging.getLogger('file-patcher')

    # Logging configuration, to stdout in this case
    console = logging.StreamHandler()
    log.addHandler(console)

    if verbose:
        log.setLevel(logging.INFO)
    else:
        log.setLevel(logging.ERROR)

    # Process data
    patch_files = FilePatcher()
    patch_files()

Duke University's GeoEco project has more examples of professional logging, many in an ArcGIS scripting context.

Comments

Re: Python logging

Author: timmie

HI! Thanks for the post. I recently discovered that this module exists but not yet managed to use it. Particularly because I still use man "prints" for fast debugging on command line. I really do not know how to get around this. I am not a full time software developer and professional debugging/testing seems to be a lot of overhead at my current stage of knowledge. > I'm inspired to post more of these, and more that are useful to the ESRI ArcGIS scripting user. I would be enlighted to see more of these posts. As outlines above, gentle introductions to good coding practices can help a lot: * Fast, pragmatic debugging * Using tests in development of scientific / geospatial scripts. Kind regards.

Re: Python logging

Author: oblivion

I like to do this:

    import logging
    log = logging.getLogger(__name__)

Then, log messages are always identified by their sub-packages of origin, and they can even be filtered on that basis. Also, if you do this:

    log.info('Opening file %s in mode %s.', infilename, mode)

then the runtime penalty is tiny, since the string substitution does not occur if the log record is filtered, but I still prefer to do this:

    log.info('Opening file %s in mode %s.' %(infilename, mode))

The substitution definitely occurs only once, no matter how many handlers exist. I simply avoid logging anything in runtime-critical sections of code.

Re: Python logging

Author: Allan Doyle

My contribution (works on Python 2.5, haven't done much Python hacking lately):

#! /usr/bin/env python

"""
debugging - inspired by the debugging.h and error_handling.c modules
            of OpenMap's C-based "toollib" that I originally wrote
            in the 90's while at BBN.


Debug([word])

        Looks in the current environment (via getenv) to look for
        an environment variable called 'DEBUG'.

        If there is none, then it returns False
        If there is one, and it was called with an argument, then
        if the DEBUG variable contains that argument, it returns True
        otherwise, it returns false. Note that it does this with
        'word' boundaries.

        Examples:

        Assume no DEBUG environment variable is set
                Debug() --> False
                Debug("foo") --> False
                Debug("bar") --> False

        Assume DEBUG is set to "foo foobar"

                Debug() --> True
                Debug("foo") --> True
                Debug("bar") --> False

DebugSet([word])

        Allows you to add strings to the set that are in the DEBUG
        environment variable.

        Examples:

        Assume DEBUG is set to "foo foobar"
                DebugSet("bar")
                Debug("bar") now returns True

DebugUnset([word])

        If called with no arguments, it makes all subsequent calls to Debug
        return False

        If called with a word, it makes all subsequent calls to Debug with
        that word return False

DebugMessage(message, [level])

        message is a string
        level is one of "CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"

        Prints the message to stderr, along with date, time, level, and
        source file name and line number where the call was made.
"""

# ------------------------------------------------------------------------
#
# Copyright (c) 2006 Allan Doyle
#
#  Permission is hereby granted, free of charge, to any person
#  obtaining a copy of this software and associated documentation
#  files (the "Software"), to deal in the Software without
#  restriction, including without limitation the rights to use, copy,
#  modify, merge, publish, distribute, sublicense, and/or sell copies
#  of the Software, and to permit persons to whom the Software is
#  furnished to do so, subject to the following conditions:
#
#  The above copyright notice and this permission notice shall be
#  included in all copies or substantial portions of the Software.
#
#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
#  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
#  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
#  NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
#  HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
#  WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
#  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
#  DEALINGS IN THE SOFTWARE.
#
# ------------------------------------------------------------------------


import inspect
import logging
import os
import os.path
import sys

## Set up the Debug part of this code...

__vars = {}
__debug = os.getenv("DEBUG")

if __debug:
    __vars[' '] = ' '                   # we use ' ' as a marker for DEBUG
                                        # being set, even if it has no content

    try:
        # each separate string in DEBUG gets used as a key in a dictionary
        # later on, this makes it easy to tell if that string was set
        for v in os.getenv("DEBUG").split():
            __vars[v] = v

    except AttributeError:
        pass
    except TypeError:
        pass

def Debug(name=" "):
    """
    Checks to see if the "DEBUG" environment variable is set, if
    an optional string is passed in, if that string is set in the
    "DEBUG" environment variable.
    """
    global __vars
    return __vars.has_key(name)

def DebugSet(name=" "):
    """
    If called with no argument, causes subsequent calls to Debug() to
    return True. If called with an argument, causes subsequent calls
    to Debug() with the same string to return True
    """

    global __vars
    global __debug

    __debug = True
    __vars[name] = name
    return(True)

def DebugUnset(name=" "):
    """
    If called with no argument, causes subsequent calls to Debug() to
    return False. If called with an argument, causes subsequent calls
    to Debug() with the same string to return False.
    """
    global __vars
    global __debug

    if name == " ":
        __debug = False
        __vars = {}
        return

    try:
        del __vars[name]

    except KeyError:
        pass

    return

## Set up the DebugMessage part of this. We use the logging module,
## but send everything to stderr. Someday, this could get expanded
## to also have a logfile, etc.

logging.basicConfig(level=logging.DEBUG,
                    format='%(levelname)-8s %(message)s',
                    stream=sys.stderr)


levels = {'CRITICAL':logging.critical,
          'ERROR':logging.error,
          'WARNING':logging.warning,
          'INFO':logging.info,
          'DEBUG':logging.debug}


def DebugMessage(msg, level="DEBUG"):
    """
    Produces nicely formatted output to stderr.
    If called with the wrong level, it complains, and behaves
    as though the level was "ERROR"
    If called with no level, it uses "DEBUG"
    The allowed levels are
      "CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"
    """

    current = inspect.currentframe()
    outer = inspect.getouterframes(current)

    try:
        levels[level]('%s - %4d: %s' %
                      (os.path.basename(outer[1][1]),
                       outer[1][2],
                       msg))

    except KeyError:
        DebugMessage('DebugMessage() called with unknown level: %s' % level)
        logging.error('%s - %4d: %s' %
                      (os.path.basename(outer[1][1]),
                       outer[1][2],
                       msg))
    del outer
    del current

__version__ = '$Id$'
if Debug('version'): print __version__

Then, to use it:


# Grab some of the debugging stuff...
from debugging import Debug, DebugSet, DebugMessage

# Put this in every module. Later on, you can 'setenv DEBUG version'
# and your code will print out the version of every module when it runs
__version__ = '$Id$'
if Debug('version'): print __version__

# Now you can do stuff like

   if Debug("time"): print "a. %.2f %s" % (time.time()-self.start, path)

# or
   if of.GetLayerCount() == 0:
        DebugMessage("No layers in %s" % filename)

# or
   if Debug("filename") or Debug("verbose"): print "fileinfo",  filename

Re: Python logging

Author: Sean

Wow, my blog comments do not render code well at all. Good tips, oblivion and Allan. Testing and debugging are big topics. To get started with debugging Python, you could a lot worse than to add

import pdb; pdb.set_trace()

before a troubled section of code. Run your script and set_trace() gives you a Python prompt at that point. From the prompt you can inspect all objects in scope, step through statements, and go up (and back down) the stack of execution frames. Jeremy Jones's pdb tutorial explains all you need to begin.

Re: Python logging

Author: Kyle VanderBeek

I generally recommend against creating logger objects at load time (which will happen due to where your library's getLogger() call is). I do it in the __init__ of an object, and use both the module name and object name together to former a proper category heirarchy:

def __init__(self):
    self._log = logging.getLogger('%s.%s' % (__name__, self.__class__.__name__))

This idiom gives you a dot-separated hierarchy of logger categories, allowing you to configure filters and handlers to send different categories to different destinations, mute some module's log messages, or turn up and down levels in a more granular fashion. Lastly, you're working too hard configuring the log. Look at logging.basicConfig(). And, actually, you should be configuring the root logger (by calling logging.getLogger() with no arguments), not one named with a particular category, in your main program. This will become important as your project grows to multiple modules.

Re: Python logging

Author: Sean

Great tips, Kyle. Thanks.