Python Pain Points

So, there's this meme that you have to be able to enumerate your domain's pain points. For Python and GIS/neogeography these include:

  • the standard library's xml.sax and xml.dom. ElementTree and lxml are the way to go.

  • unittest. I wrote a pile of unittest modules for MapServer's SWIG bindings, aka mapscript. So many that it'll be a while before I slide out of the top 5 all-time MapServer contributors by LOC. The tests never really caught on with other developers. The generic xUnit style didn't help seasoned developers cross over, and it doesn't help explain usage to new users. These days, I'm much happier using doctest. It's been hard for new Python programmers to find good advice on which testing framework to use, and the consequence for open source Python GIS is that people don't write tests.

  • distutils and setuptools. Python's packaging and distribution story is too arcane. Yes, this is intricate stuff, but it needs to be easier. People developing open source GIS for Python rarely bother to figure out the old distutils setup, let alone eggs.

  • str and unicode. Python's unicode story is not bad, and will be even better when unicode is always on.

  • function names. This is a small itch (caused by exposure to Ruby), but I'd love to be able to use self-descriptive Python APIs like:

    >>> if polygon.simple?      # instead of isSimple()
    >>> polygon.rotate!(angle)  # rotate self, not a copy

  • mapscript.py and gdal.py. Many users come to Python via MapServer and GDAL, which is unfortunate because those applications have terrible, un-Pythonic APIs (that ripple through the open source GIS community) and have created the impression that Python is just a binding for C/C++ code.
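
To make the first two items concrete, here is a minimal sketch, not code from MapServer or any other project named above. It parses a small KML-like fragment with the standard library's xml.etree.ElementTree and checks it with a doctest that doubles as usage documentation. The function name placemark_point and the sample data are made up for illustration.

```python
import doctest
import xml.etree.ElementTree as ET

# Hypothetical sample data: a KML-like fragment without namespaces.
KML = """<Placemark>
  <name>Fort Collins</name>
  <Point><coordinates>-105.08,40.59</coordinates></Point>
</Placemark>"""


def placemark_point(xml_text):
    """Return (name, (lon, lat)) for a Placemark-like XML string.

    The docstring is both the test and the usage example:

    >>> placemark_point(KML)
    ('Fort Collins', (-105.08, 40.59))
    """
    root = ET.fromstring(xml_text)
    name = root.findtext("name")
    # Coordinates are "lon,lat[,alt]"; keep only the first two values.
    lon, lat = root.findtext("Point/coordinates").split(",")[:2]
    return name, (float(lon), float(lat))


if __name__ == "__main__":
    # Runs every docstring example in this module; silent on success.
    doctest.testmod()
```

Run the file directly and it prints nothing when the example passes. Compare that with the handler class that xml.sax would need for the same job, or the TestCase boilerplate that unittest would need.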

Comments

Re: Python Pain Points

Author: hobu

MapServer's unittests have only slowly gained traction, but recently both Umberto and I have made commits to them to help keep an eye on the 5.0 release. I agree that doctest is much better, and it would be nice to port MapServer's tests to it someday.

Re: eggs. Both GDAL and MapServer will use setuptools first if it is available (although current GDAL trunk does --single-version-externally-managed by default if you use the 'make install' target). I fricking hate eggs though. In a web-enabled world, having eggs not extract themselves by default to someplace like /tmp that is writable by processes like Apache causes much pain. Python eggs in their current incarnation seem like an obscene hack to me, with their behind-the-scenes path munging, unzipping, and other magic. That setuptools also breaks compatibility with distutils causes more pain (i.e., I can't use --prefix with setuptools but can with distutils). What did setuptools really solve that distutils hadn't already, again?

Python may be fully unicode aware someday soon, but the C/C++ libs that we depend on in the Open Source GIS domain won't be.

Re: unsightliness of gdal.py and mapscript.py... The first one out of the gate gets shot in the back :) Do we change it now to make it pretty, break hundreds of thousands of lines of code, and piss off everyone who's currently using it?

Re: Python Pain Points

Author: Sean

I like eggs. The development setup target is very handy, and I'm even planning to make more use of the dependency resolution before my projects hit 1.0. Those are my favorite features not found in distutils. I don't have a solution for my mapscript and ogr pain other than to improve Shapely, Rtree, and PCL, and get them to 1.0.