I've updated Mush to use my feedparser.py enhancements and Shapely 1.0a3. Now it will parse GeoRSS GML, Simple, and W3C geometries of all types (points, lines, polygons) from source feeds. For example, here are the last 10 entries from Christopher Schmidt's FeatureServer demo, pulled through the self-intersection processing resource: feed, map.
Please note that, in the interest of conserving resources and minimizing response times, I've limited the number of entries that Mush will read from any feed to 42.
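For the curious, here's a rough sketch of the kind of glue involved. This is not Mush's actual code; it assumes the enhanced feedparser exposes each entry's GeoRSS geometry as a GeoJSON-like dict on entry.where (as later feedparser releases do), and the feed URL is made up.

```python
from itertools import islice

import feedparser
from shapely.geometry import LineString, Point, Polygon

MAX_ENTRIES = 42  # Mush reads no more than this many entries per feed


def to_shape(where):
    """Build a Shapely geometry from a GeoJSON-like {'type', 'coordinates'} dict."""
    geom_type = where['type']
    coords = where['coordinates']
    if geom_type == 'Point':
        return Point(coords)
    if geom_type == 'LineString':
        return LineString(coords)
    if geom_type == 'Polygon':
        # The first ring is the exterior; any remaining rings are holes.
        return Polygon(coords[0], coords[1:])
    raise ValueError("unsupported geometry type: %r" % geom_type)


feed = feedparser.parse("http://example.com/georss.xml")  # hypothetical URL
for entry in islice(feed.entries, MAX_ENTRIES):
    if 'where' in entry:
        geom = to_shape(entry['where'])
        print(entry.get('title', ''), geom.geom_type, geom.is_valid)
```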
This work reminds me to comment on Andrew Turner's recent post on security issues around feed aggregation. He writes:
The onus of security is on the application or aggregator that pulled the feed on behalf of the authorized user. But at the same time once the feed has been retrieved, there is no storage of the authorization credentials with the feed itself. It has essentially been stripped of its shell of potential privacy and looking at the feed itself you would have no idea if it was supposed to be kept private, and visible only to certain, unknown persons.
What would be nice would be a mechanism to store at least references to permissions and authorization credentials within the feed itself. That way if an application still has the feed, or wishes to store it and re-aggregate it, they can apply the same authorization as the feed originally had.
There's another big issue that Andrew doesn't mention (discussed by Richardson and Ruby in chapter 6 of "RESTful Web Services"): how does the aggregator pass along the user's credentials without caching them and risking their theft? Mush doesn't intend to solve this problem at all. I think the onus of privacy remains largely on the original content provider. If you want to make a feed for authorized content, you should strip that feed down to the bare minimum and provide https hrefs to the content itself. If the feed metadata must also remain private, you can encrypt specific elements or even the entire feed.
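To make "bare minimum" concrete, here's a small illustration of my own (not taken from any real feed; the identifiers and URL are made up) of an Atom entry that keeps private metadata out of the feed and points to protected content over https, built with ElementTree:

```python
# A bare-minimum Atom entry: no private metadata in the feed itself, just an
# https link to content the provider protects with its own authorization.
from xml.etree import ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)

entry = ET.Element("{%s}entry" % ATOM)
ET.SubElement(entry, "{%s}id" % ATOM).text = "tag:example.com,2007:item-1"
ET.SubElement(entry, "{%s}title" % ATOM).text = "Protected item"
ET.SubElement(entry, "{%s}updated" % ATOM).text = "2007-11-12T00:00:00Z"
ET.SubElement(
    entry,
    "{%s}link" % ATOM,
    rel="alternate",
    type="text/html",
    href="https://example.com/protected/items/1",
)

print(ET.tostring(entry, encoding="unicode"))
```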
Finally, feeds should be cached for no more than the duration specified by their origin servers. A feed is just a representation of entities that "live" on the Web, and applications should be pulling new representations from the web rather than relying on silos. Storing feeds indefinitely -- treating GeoRSS like shapefiles -- breaks the Web.
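On that note, here's one way to keep pulling fresh representations from the Web without hammering the origin server: feedparser's support for conditional GET via ETag and Last-Modified. This is just a sketch, not Mush's actual code, and the URL is hypothetical.

```python
import feedparser

url = "http://example.com/georss.xml"  # hypothetical URL

# First poll: remember the validators the server sent, not the feed forever.
first = feedparser.parse(url)
etag = first.get('etag')
modified = first.get('modified')

# Next poll: send them back; a 304 means our representation is still current.
update = feedparser.parse(url, etag=etag, modified=modified)
if update.get('status') == 304:
    pass  # nothing new; keep using the representation we already have
else:
    for entry in update.entries:
        print(entry.get('title', ''))  # process the fresh representation
```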
Comments
Re: Shapely Manual
Author: Christopher Schmidt
Any reason you chose a different license for the docs than for Shapely itself? Apparently the debian-legal list considers CC licenses to be 'non-free' under the DFSG (!!), though the 3.0 licenses have made it into Debian in the past, so perhaps paying attention to debian-legal in this case is just the wrong direction for actually getting things done. If you don't mind, I'd be willing to package up Shapely for Debian once you release a 1.0 and maintain it there going forward if I can find a sponsor. In any case, it feels weird to have the docs available under a different license than the software. The manual looks great, btw. You might want to ask Howard about adding RST support to your trac instance? I think there's a way to make the docs automatically turn into RST HTML in the browser view... Though it looks like you might know that, so ignore this if it doesn't make sense.
Re: Shapely Manual
Author: Sean
You may be right about the manual license. I've been proposing to change Shapely to BSD already. The combination of GEOS Debian packages and PyPI Shapely packages is already more than sufficient (for me, at least), but a Shapely Debian package would be welcome.
Re: Shapely Manual
Author: Christopher Schmidt
Er, I thought Shapely was under BSD? "I've been proposing to change Shapely to BSD already." doesn't make sense, since PyPI and LICENSE.txt both say BSD...
Re: Shapely Manual
Author: Sean
I responded thoughtlessly. Shapely has been BSD licensed since November.