Python logging
Update (2009-09-29): see http://plumberjack.blogspot.com/2009/09/python-logging-101.html
A friend at the local Park Service office mentioned to me that he's become a Python user and has been looking at some of my published code examples. I'm inspired to post more of these, and more that are useful to the ESRI ArcGIS scripting user. Python serves the ESRI community as a replacement for Avenue, but is a far more rich language and platform. Today's tip: excellent application logging and how to avoid using print statements in your production code.
I tripped myself using print statements recently, and have seen it in the code of others. While not a mortal sin, it does reduce the code's potential for reuse. A print statement might be tolerable when you're running the code at a shell prompt, but what if you wanted to execute the same code in a service? Better not to fill your service logs (or wherever stdout goes) with unnecessary debugging noise, yes? Here's where Python's logging module steps in to help: your information-processing classes write messages to a logging interface, any Python application then configures the logger appropriately. Log messages can be sent to single files on disk, rotating log files, sent to SMTP servers -- you could probably even create your own XMPP handler.
Say, for example, you have a class that patches elements into XML files (inspired by Dave Bouwman's post). Using a Python logger instead of a print statement requires only 2 extra lines of code (zero if you use the root logger):
import getopt import glob import logging import sys from lxml import etree # Create a logger log = logging.getLogger('file-patcher') # Example of reusable code class FilePatcher(object): """ Patch edit permissions into an XML configuration file and write new files to disk. """ def __call__(self): for infilename in glob.glob('*.xml'): tree = etree.parse(infilename) log.info('Opening file %s.', infilename) root = tree.getroot() for perms in root.xpath(".//property[@name='permissions']"): if not perms.xpath("element[@value='Edit']"): e = etree.SubElement(perms, 'element') e.attrib['value'] = 'Edit' log.info('Patched perms: %s.', str(perms)) outfilename = 'patched-%s' % infilename fout = file(outfilename, 'w') fout.write(etree.tostring(root, pretty_print=True)) log.info('Wrote file %s.', outfilename)
Now, we can make a script that uses this class in a verbose (all logged info) mode or a quieter (logged errors only) mode:
# Script if __name__ == '__main__': verbose = False # Get command line options and args opts, args = getopt.getopt(sys.argv[1:], 'v') for o, a in opts: if o == '-v': verbose = True # Get the file patching logger log = logging.getLogger('file-patcher') # Logging configuration, to stdout in this case console = logging.StreamHandler() log.addHandler(console) if verbose: log.setLevel(logging.INFO) else: log.setLevel(logging.ERROR) # Process data patch_files = FilePatcher() patch_files()
Duke University's GeoEco project has more examples of professional logging, many in an ArcGIS scripting context.
Comments
Re: Python logging
Author: timmie
HI! Thanks for the post. I recently discovered that this module exists but not yet managed to use it. Particularly because I still use man "prints" for fast debugging on command line. I really do not know how to get around this. I am not a full time software developer and professional debugging/testing seems to be a lot of overhead at my current stage of knowledge. > I'm inspired to post more of these, and more that are useful to the ESRI ArcGIS scripting user. I would be enlighted to see more of these posts. As outlines above, gentle introductions to good coding practices can help a lot: * Fast, pragmatic debugging * Using tests in development of scientific / geospatial scripts. Kind regards.Re: Python logging
Author: oblivion
I like to do this: Then, log messages are always identified by their sub-packages of origin, and they can even be filtered on that basis. Also, if you do this: then the runtime penalty is tiny, since the string substitution does not occur if the log record is filtered, but I still prefer to do this: The substitution definitely occurs only once, no matter how many handlers exist. I simply avoid logging anything in runtime-critical sections of code.Re: Python logging
Author: Allan Doyle
My contribution (works on Python 2.5, haven't done much Python hacking lately): Then, to use it:Re: Python logging
Author: Sean
Wow, my blog comments do not render code well at all. Good tips, oblivion and Allan. Testing and debugging are big topics. To get started with debugging Python, you could a lot worse than to add before a troubled section of code. Run your script and set_trace() gives you a Python prompt at that point. From the prompt you can inspect all objects in scope, step through statements, and go up (and back down) the stack of execution frames. Jeremy Jones's pdb tutorial explains all you need to begin.Re: Python logging
Author: Kyle VanderBeek
I generally recommend against creating logger objects at load time (which will happen due to where your library's getLogger() call is). I do it in the __init__ of an object, and use both the module name and object name together to former a proper category heirarchy: This idiom gives you a dot-separated hierarchy of logger categories, allowing you to configure filters and handlers to send different categories to different destinations, mute some module's log messages, or turn up and down levels in a more granular fashion. Lastly, you're working too hard configuring the log. Look at logging.basicConfig(). And, actually, you should be configuring the root logger (by calling logging.getLogger() with no arguments), not one named with a particular category, in your main program. This will become important as your project grows to multiple modules.Re: Python logging
Author: Sean
Great tips, Kyle. Thanks.