Saving bandwidth and more using httplib2

Here's a comment that more properly belongs on Saving bandwidth using Python. The requirement to register was (to me) a blocker for leaving it there. Anne writes:

When an Internet connection is a limited resource, a well-designed website doesn’t perform the same request multiple times. This little adjustment can significantly reduce the time required to load and refresh a page. First-world programmers should keep this in mind, or better, come to South Africa and experience it in person ...

The solution involves a wrapper around urllib.urlretrieve that partially implements HTTP caching. A more robust solution might instead use the almost transparent caching, validated with Last-Modified or ETag headers, that is built into httplib2. See also Mark Pilgrim's notes on httplib2 in Dive Into Python 3 (httplib2 works fine with Python 2.3+). It saves bandwidth, development time, and bug chasing.
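As a rough illustration of the validated caching mentioned above, here is a minimal sketch using httplib2. The function name fetch_cached, the ".cache" directory, and the example URL are placeholders of mine, not anything from Anne's post. The import sits inside the function only so that the snippet loads on machines where httplib2 is not installed.

```python
def fetch_cached(url, cache_dir=".cache"):
    """Fetch url through httplib2, reusing cached responses when allowed."""
    # Third-party dependency (pip install httplib2), imported lazily.
    import httplib2

    # Passing a directory name turns on httplib2's file-based cache.
    http = httplib2.Http(cache_dir)

    # On repeat requests httplib2 can revalidate the cached copy using the
    # ETag / Last-Modified validators the server supplied earlier.
    response, content = http.request(url, "GET")

    # response.fromcache is True when the body came from the local cache.
    return response.status, response.fromcache, content


if __name__ == "__main__":
    # Hypothetical URL; whether a second run is served from the cache
    # depends on the caching headers the server sends.
    status, from_cache, body = fetch_cached("http://example.org/")
    print(status, from_cache, len(body))
```

The appeal is that the calling code never touches the caching headers itself; it only chooses where the cache lives.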

Comments

Re: Saving bandwidth and more using httplib2

Author: Barry Rowlingson

Or set up a local, or institutional, caching proxy using Squid and Apache? Most programs these days can be configured to use them, either by setting http_proxy as an environment variable or through custom config settings.

Then you get to set timeouts and ETag magic, and it can work for everyone on your PC or at your institution, for all HTTP content. Big wins all round.
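For a Python program, the http_proxy route might look something like the sketch below. The proxy address is made up for the example. In practice the variable would normally be set in the shell or system configuration rather than inside the script.

```python
import os
import urllib.request

# Hypothetical address of a local or institutional caching proxy.
os.environ["http_proxy"] = "http://proxy.example.org:3128"

# urllib reads proxy settings from the environment; urlopen() uses the
# same lookup by default, so no further configuration is needed.
proxies = urllib.request.getproxies()
print(proxies.get("http"))
```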

Re: Saving bandwidth and more using httplib2

Author: Sean

Better yet, I agree, for an enterprise. I assumed that the Linfiniti work had a smaller scope.

Re: Saving bandwidth and more using httplib2

Author: Tim Sutton

Actually Anne left out a little bit of the story - we are writing a web mapping client that builds a legend using WMS GetLegendGraphic requests to a third-party service. So the client is on one (unknown) network where we can't impose Squid etc., the server is on another, and the WMS service is on a third. The problem is that each time the client requests a page that has a legend on it, they wait for time-consuming GetLegendGraphic requests. So our solution is to cache the legend graphic requests on our server so that we can give a known response time to the web client. In this context Squid / proxying requests wouldn't be an option.

Regards

Tim
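The server-side caching Tim describes could be sketched roughly as follows. This is not the actual Linfiniti code. The function name cached_legend, the one-hour max_age, and the injectable fetch parameter are all assumptions made for illustration. It uses a simple age-based file cache rather than HTTP validators.

```python
import hashlib
import os
import time
import urllib.request


def cached_legend(url, cache_dir, max_age=3600, fetch=None):
    """Return the bytes for url, reusing a local copy newer than max_age seconds."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET against the remote WMS service.
        def fetch(u):
            with urllib.request.urlopen(u) as resp:
                return resp.read()

    os.makedirs(cache_dir, exist_ok=True)

    # Hash the full request URL so each layer/style gets its own cache file.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key)

    # Serve the stored copy while it is still fresh enough.
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < max_age:
        with open(path, "rb") as f:
            return f.read()

    # Otherwise fetch once from the remote service and store the result.
    data = fetch(url)
    with open(path, "wb") as f:
        f.write(data)
    return data
```

With something like this on the server, only the first request for a given legend waits on the third-party service. Later requests within max_age are answered from local disk.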