One of the issues that has complicated all-out integration of Zope/Plone and GIS is
the lack of spatial indexes for catalogs. Without an index, your data is bottled
up in the ZODB: "Find all features within [bbox]" means an O(N) scan over every
feature, which is infeasible for large sets of features. Pleiades needs to overcome
this issue soon, and so I've done something about it.
After researching trees of all kinds and reviewing a number of libraries, I finally
went with what had been right under my nose all along: the quadtree from the
venerable and inscrutable shapelib. A few years ago I would have used SWIG to wrap
shapelib's tree, but I honestly don't give a damn about anything but Python these
days. I whipped up tests, a Python C extension module, and Quadtree was born.
How does it work? Check out the doctests:
Make an instance
>>> from quadtree import Quadtree
>>> index = Quadtree([-180, -90, 180, 90])
Add 3 items
>>> index.add(0, [-106, 39, -105, 40])
>>> index.add(1, (-100, 25, -95, 30))
>>> index.add(2, (40, 50, 45, 55))
Find likely items
>>> [n for n in index.likely_intersection([-110, 30, -100, 45])]
[0, 1]
>>> [n for n in index.likely_intersection([10, 40, 40.05, 60])]
[2]
We get a hit for the next bounding box even though it doesn't actually
intersect the item's bounding box -- it does intersect the tree node containing
the item.
>>> [n for n in index.likely_intersection([-110, 35, -108, 45])]
[0]
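Mopping up false positives like that is just a matter of checking the likely ids
against the real bounding boxes. The index hands back only the ids you added, so
in this sketch I keep the boxes in an ordinary dict and write a small intersects()
helper; both the dict and the helper are my own illustration, not part of the module:
>>> boxes = {0: (-106, 39, -105, 40), 1: (-100, 25, -95, 30), 2: (40, 50, 45, 55)}
>>> def intersects(a, b):
...     # boxes are (minx, miny, maxx, maxy)
...     return not (a[2] < b[0] or a[0] > b[2] or a[3] < b[1] or a[1] > b[3])
>>> query = (-110, 35, -108, 45)
>>> [n for n in index.likely_intersection(query) if intersects(boxes[n], query)]
[]
That second, exact pass is the "false positive mop-up" measured in the benchmark
numbers below.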
How well does it work? I've yet to script runs over the entire parameter
space (data density, tree extent, query box size, maximum tree depth), but
have preliminary results from tests/benchmarks.py. 1000 points are
randomly sprinkled in a box bounded by [0, 0, 1, 1]. The "brute force"
method of determining whether a point is inside a box is:
p.x >= minx and p.x <= maxx and p.y >= miny and p.y <= maxy
A typical result, averaged over 100 passes:
1000 points
Index extent: [-180, -90, 180, 90]
Query box: [0.2, 0.3, 0.3, 0.4]
Brute Force:
15 hits
1539.89 usec/pass
Likely Intersection:
23 hits
42.03 usec/pass
Likely Intersection with false positive mop-up:
15 hits
115.35 usec/pass
Making one extra swipe over the likely points to mop up the 8 false positives is
costly, but it's still much faster than comparing each point to the query box. I
knew it would be, but it's always encouraging to see the numbers.
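For the curious, here's roughly how such a comparison can be scripted. This is not
tests/benchmarks.py itself -- the point generation, the zero-extent boxes I use to
put points into the tree, and the timing loop are my own stand-ins:

import random
import time

from quadtree import Quadtree

random.seed(0)

# 1000 random points sprinkled in the unit square
points = [(random.random(), random.random()) for _ in range(1000)]

# index each point under its list position, as a zero-extent bounding box
index = Quadtree([-180, -90, 180, 90])
for i, (x, y) in enumerate(points):
    index.add(i, (x, y, x, y))

minx, miny, maxx, maxy = 0.2, 0.3, 0.3, 0.4
query = (minx, miny, maxx, maxy)

def brute_force():
    return [i for i, (x, y) in enumerate(points)
            if x >= minx and x <= maxx and y >= miny and y <= maxy]

def likely():
    return list(index.likely_intersection(query))

def mop_up():
    return [i for i in index.likely_intersection(query)
            if points[i][0] >= minx and points[i][0] <= maxx
            and points[i][1] >= miny and points[i][1] <= maxy]

for func in (brute_force, likely, mop_up):
    start = time.time()
    for _ in range(100):
        hits = func()
    usec = 1e6 * (time.time() - start) / 100
    print("%s: %d hits, %.2f usec/pass" % (func.__name__, len(hits), usec))

Your numbers will vary with the seed and the machine, but the shape of the results
should match the figures above.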
Speaking of the way I used to do things ... as recently as May I would have rolled
this into the Python Cartographic Library, but I've seen the light since then.
There's been more interest in OWSLib since I dredged it up from PCL. I'm also
inspired by geopy, the single-minded library for accessing geocoding services.
Quadtree will be similarly single-minded. It depends on nothing except setuptools
(for now). After a little more testing, I'll upload it to the Cheese Shop.
If it looks useful to you, grab it via Subversion:
svn co http://icon.stoa.org/svn/pleiades/Quadtree/trunk Quadtree
and take it for a ride.
P.S. -- five quatloos for the first person to find the Neal Stephenson reference in this post.
Comments
Re: Why Does WFS Dislike the Web?
Author: Tom Kralidis
Interesting; a few comments: if someone were to use the native HTTP methods, wouldn't that have stuck things more in the hands of the httpd server for configuration? The OGC way, it's easier to decouple and add an auth layer between httpd and the application software (or within the app software). Mind you, it's work either way. Also, HTTP status codes are totally different things than ExceptionReport codes. An ExceptionReport still comes back as HTTP 200, as it should, i.e. the HTTP response was successful (as a protocol), but the service returns an error.
Re: Why Does WFS Dislike the Web?
Author: Allan Doyle
Historical precedent. WMS was developed in 1999, before we had heard of REST. (Note - Fielding's dissertation is dated 2000.) WFS was patterned after WMS. I first bumped into REST in 2000 during the 2nd OGC testbed, and as much as things made sense to me, it wasn't clear at the time how to turn the momentum towards REST, particularly in the face of considerable pro-SOAP interest. There are certainly things I wish we had done differently for WMS. In particular, I think having a CGI interface to the capabilities document has proven to be a huge impediment to getting people to think about search engine friendly ways of advertising WMSs.
Re: Why Does WFS Dislike the Web?
Author: Sean
Tom: yeah, my status code idea was naive. Certainly some clients are counting on a 200 along with their service exception. I think the protocol is wrong here. They *should* expect a 400 for a bad request, or a 500 for a WFS server error (db failure or whatever). Allan: REST has been around longer than that. The intro to Fielding's dissertation states that REST has been guiding the architecture of the web "over the past six years". Despite that, it's been easy to overlook. I totally agree with you about the problem with GetCapabilities. Thanks for the comments!
Re: Why Does WFS Dislike the Web?
Author: Panagiotis (Peter) A. Vretanos
Allan: with regard to making OGC services more "search engine" friendly, I think that the upcoming version of OWS Common mandates that OGC services generate some descriptive metadata when they are invoked without any parameters. Is this what you had in mind for making OGC services more search engine friendly?
Re: Why Does WFS Dislike the Web?
Author: Allan Doyle
Comment to myself... maybe it was actually the 3rd testbed. It's all a blur :) Sean - yes, REST has been around as a concept; in fact, it was probably *the* concept before the XML-RPC folk started in and laid the foundation for SOAP. I think we're actually lucky that WMS turned out as it did. Peter - Holy cow! I just took my first look at OWS Common in a long time. I mean, http://foo.org/mywms should have been a perfectly good WMS capabilities URL. I guess that wouldn't necessarily improve searchability, but it would make it far easier to get results than having to try all the variations of GetCapabilities that have sprung up. Getting back to the upcoming OWS Common... I don't exactly see iPod-like zen. Who's going to implement that stuff!?!?
Re: Why Does WFS Dislike the Web?
Author: Panagiotis (Peter) A. Vretanos
Yes, I must admit, when I see the page count it does seem rather daunting! However, keep in mind that OWS Common gathers up bits and pieces that might be common to a bunch of OGC services ... like BBOX. Theoretically it should make the other specifications shorter, since they only need to reference OWS Common!?
Re: Why Does WFS Dislike the Web?
Author: Steven Citron-Pousty
REST is the anti-SOAP. I know they try to do the same types of things; it's just that SOAP brings all this extra baggage that I never needed. I like REST for its ease of use. Give me Tomcat and dom4j and I am good to go...
Re: Why Does WFS Dislike the Web?
Author: Chris Tweedie
Instead of using status codes, why not propose using the same mimetype method as WMS? Even though having different mime types for seemingly text/xml responses is a pain on occasion, it's times like these that it makes detection so much easier. I'd vote +1 for a WFS application/vnd.ogc.se_xml (or something similar).
REST and OGC
Author: Paul Ramsey
I have a theory about this: doing good REST requires understanding and using two or three different standards (HTTP for transport, URL/URI for location, some kind of *ML encoding, and maybe MIME). If you know all those standards and spend a little time cogitating, it's hard not to end up RESTful. If you only really know *ML, then you tend to ignore things like HTTP codes or MIME types, and you get stuff like WFS, where almost all the magic happens in the content-encoded bundle and all the rest is treated as anonymous pipes. The ironic part is that OGC itself is trying to divide up its standards base into lots of little orthogonal pieces... data encoding, data styling, metadata encoding, service profiles. And it is getting swamped by things like KML that stuff all the information into one big block. WFS is web services ignoring most of the web. KML is geodata interoperability ignoring much of the existing standards base.
Re: Why Does WFS Dislike the Web?
Author: Allan Doyle
OGC originally was trying to steer around the OLE/DCOM vs. CORBA debate. The Simple Features spec actually had a CORBA version, an OLE/DCOM version, and an SQL version. There was always the feeling that the OGC spec had to ride one level above the actual transport, since otherwise it would be too hard to move from one to another. And I'm not sure many participants in those days would have been up to speed enough on HTTP/URI/XML to come up with a real REST solution. KML seems to be swamping everything else because it's easy to hack up quick things w/o having to read very much. You can get better at KML hacking slowly, without having to break much of a sweat. I don't think you can say that about most OGC specs.
Re: Why Does WFS Dislike the Web?
Author: Andrew Hallam
I've posted on a related topic, WMS Practicalities and REST, inspired by Allan's comment on making it "easy to hack up quick things".