Debugging temporary files using pytest autouse fixtures
This week I discovered that Rasterio doesn't always close the temporary in-memory datasets that are used within some of its methods. In testing Rasterio's WarpedVRT class I used a GDAL function to dump descriptions of all open datasets and found a bunch that looked unrelated to WarpedVRT. They were GDAL "MEM" type datasets with UUIDs for names, which didn't tell me much. What were their origins?
They have UUIDs for names because Rasterio imports uuid in its _io module and calls uuid.uuid4() to make temporary dataset names. If only the dataset name included the name of the test in which it was created, then I'd have an entry point into debugging. One way to do this is with a pytest autouse fixture.
I changed the rasterio._io module's import statement from "import uuid" to "from uuid import uuid4" to make it slightly easier to monkey patch, and then I added 5 lines of code to Rasterio's conftest.py file:
    # Assumes conftest.py already imports pytest, uuid, and rasterio._io.
    @pytest.fixture(autouse=True)
    def set_mem_name(request, monkeypatch):
        def youyoueyedeefour():
            return "{}-{}".format(request.node.name, uuid.uuid4())
        monkeypatch.setattr(rasterio._io, "uuid4", youyoueyedeefour)
This set_mem_name fixture uses two standard pytest fixtures: request and monkeypatch. The value of request.node.name is the name of the test, and this set_mem_name fixture uses monkeypatch to replace uuid4 in rasterio._io with a custom function that prepends the name of the test to the UUID. The autouse=True argument tells pytest to add this fixture to every test it finds. I didn't need to touch the code of any of Rasterio's tests, not a one.
This quickly revealed to me that the unclosed temporary datasets were coming from tests that asserted certain exceptions were being raised by Rasterio's reprojection code. This code used temporary datasets and didn't close them before raising the exception to the caller. Once I changed the code to close its temporary datasets before raising, Rasterio no longer leaked datasets from those tests, or in our programs.
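To illustrate the kind of fix involved, here is a minimal sketch and not Rasterio's actual change. TempDataset and reproject are hypothetical stand-ins, and the open_names set plays the role of GDAL's list of open datasets. The point is only that a try/finally closes the temporary dataset on the error path as well as the success path.

```python
class TempDataset:
    """Hypothetical stand-in for a temporary in-memory dataset."""

    # Registry of open names, standing in for GDAL's open dataset list.
    open_names = set()

    def __init__(self, name):
        self.name = name
        TempDataset.open_names.add(name)

    def close(self):
        TempDataset.open_names.discard(self.name)


def reproject(name, fail=False):
    """Hypothetical operation that needs a temporary dataset."""
    temp = TempDataset(name)
    try:
        if fail:
            raise ValueError("reprojection failed")
        return "ok"
    finally:
        # Runs whether or not an exception was raised, so nothing leaks.
        temp.close()
```

A test that asserts the exception is raised can then also check that the registry is empty afterwards.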
If Rasterio used only Python's unittest module, and not pytest, it would be possible to do the same thing. Import rasterio._io in the test case's setUp(), monkey patch it, and then restore it in tearDown(). If all the tests derived from one base class, it would only be necessary to extend that class. The unittest.mock module easily allows every test to be patched with a single decorator statement.
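The setUp() and tearDown() approach described above might look roughly like this. It is a sketch under assumptions: fake_io is a hypothetical stand-in for rasterio._io so the example runs without Rasterio, and the base class name is made up. The test's name comes from TestCase.id().

```python
import types
import unittest
import uuid

# Hypothetical stand-in for rasterio._io, so the sketch runs without Rasterio.
fake_io = types.SimpleNamespace(uuid4=uuid.uuid4)


class NamedDatasetTestCase(unittest.TestCase):
    """Hypothetical base class that prefixes generated names with the test id."""

    def setUp(self):
        # Remember the original so tearDown() can restore it.
        self._original_uuid4 = fake_io.uuid4
        test_id = self.id()

        def named_uuid4():
            return "{}-{}".format(test_id, uuid.uuid4())

        fake_io.uuid4 = named_uuid4

    def tearDown(self):
        fake_io.uuid4 = self._original_uuid4


class ExampleTest(NamedDatasetTestCase):
    def test_name_is_prefixed(self):
        self.assertTrue(fake_io.uuid4().startswith(self.id()))
```

Every test class that derives from the base class gets the patched name generator without further changes.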
It seems like it could be two fewer lines of code, but I don't immediately see how to get the name of the test and use it with only the mock.patch decorator. It looks like one would have to use a patcher's start() and stop(), which is back to somewhat more boilerplate than with pytest.
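For comparison, a patcher-based version could be sketched as follows. Again fake_io is a hypothetical stand-in for rasterio._io and the class name is invented for illustration. mock.patch.object returns a patcher whose start() applies the patch, and addCleanup registers stop() so the original is restored even if the test fails.

```python
import types
import unittest
import uuid
from unittest import mock

# Hypothetical stand-in for rasterio._io, so the sketch runs without Rasterio.
fake_io = types.SimpleNamespace(uuid4=uuid.uuid4)


class PatcherTestCase(unittest.TestCase):
    def setUp(self):
        test_id = self.id()

        def named_uuid4():
            # Calls the real uuid module, not the patched attribute.
            return "{}-{}".format(test_id, uuid.uuid4())

        patcher = mock.patch.object(fake_io, "uuid4", named_uuid4)
        patcher.start()
        # Undo the patch after each test, whatever its outcome.
        self.addCleanup(patcher.stop)

    def test_name_is_prefixed(self):
        self.assertTrue(fake_io.uuid4().startswith(self.id()))
```

This is about the same amount of code as the setUp() and tearDown() version, which supports the impression that the pytest fixture is the more compact option.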