2008 (old posts, page 19)

Change

Update (2008-11-07): Restoring the country's economy, confidence, dignity, and reputation will be tough; I expect to be disappointed along the way, but right now I am enormously pleased. Fort Collins, Larimer County, Colorado, and the United States of America expressed enough hope over fear, and rightly held the GOP accountable for getting us into this ditch.

http://www.barackobama.com/images/widgets/Obama08_ThumbLogo200.gif

After 7 disastrous years of mal-intent and incompetence, let's give the hopeful and serious a chance. It's time for a change.

Shapely 1.0.8

Paul Winkler and the OpenGeo team found a problem with Shapely and Zope's web server on their 64-bit system, provided the fix, and now here's Shapely 1.0.8. An upgrade is recommended if you're using Zope or a similar framework.

Comments

Re: Shapely 1.0.8

Author: Paul Winkler

Hi Sean, thanks for the kind words and for getting an official fix out so quickly. I must give credit where due: all the hard gdb sleuthing was led by my coworkers David Turner and Douglas Mayle. I don't have the necessary C/C++ chops to have solved this on my own. Oh, and to clarify, the problem had nothing to do with Zope (or Grok in our case); this upgrade should be good for anyone running on a 64-bit system. I was eventually able to write a standalone script that demonstrated the segfault. The same problem could afflict any Python package that uses ctypes on 64-bit systems without explicitly marking argument and return types.
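A minimal sketch of what explicitly marking argument and return types looks like, written for Python 3. This is an illustration only, not the actual Shapely fix: it assumes CPython and uses `ctypes.pythonapi.Py_GetVersion`, a C function returning a `char *`, as a stand-in for a GEOS function. By default ctypes assumes a foreign function returns a C int, so a pointer-sized return value can be truncated on a 64-bit system unless `restype` is declared.

```python
import ctypes
import sys

# Stand-in for a C library function that returns a pointer (assumption:
# this runs on CPython, where ctypes.pythonapi exposes the C API).
get_version = ctypes.pythonapi.Py_GetVersion

# Declare the signature explicitly. Without these two lines ctypes assumes
# an int return value, which can truncate a 64-bit pointer.
get_version.argtypes = []
get_version.restype = ctypes.c_char_p

version = get_version().decode("utf-8")
print(version)
```

The same two attributes, `argtypes` and `restype`, would be set once for each wrapped function, right after the library is loaded.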

Multiprocessing with Rtree

I came up with an example of how Python's multiprocessing package -- standard in 2.6, and recently backported to 2.4 and 2.5 (on PyPI) -- could be used to set up a simple R-Tree index server:

from multiprocessing.managers import BaseManager

class RtreeManager(BaseManager):
    pass

# Register the typeids without callables, so that a client process that
# imports this module gets 'add' and 'intersection' proxy methods.
RtreeManager.register('add')
RtreeManager.register('intersection')

if __name__ == '__main__':

    from rtree import Rtree

    # Subclass that logs each call before delegating to Rtree.
    class NoisyRtree(Rtree):
        def add(self, i, bbox):
            print "Adding: %s, %s" % (i, str(bbox))
            Rtree.add(self, i, bbox)
        def intersection(self, bbox):
            print "Searching: %s" % str(bbox)
            return Rtree.intersection(self, bbox)

    # The index is persisted on disk under the name 'processing'.
    index = NoisyRtree('processing')

    # In the server process, bind the typeids to the index's methods.
    RtreeManager.register('add', index.add)
    RtreeManager.register('intersection', index.intersection)

    manager = RtreeManager(address=('', 50000), authkey='')
    server = manager.get_server()
    print "Server started"
    server.serve_forever()

Run that module (man.py) as a script to start the server, and access the index from Python in a new process like this:

>>> from man import RtreeManager
>>> manager = RtreeManager(address=('', 50000), authkey='')
>>> manager.connect()
>>> print manager.intersection((-20.0, -20.0, 20.0, 20.0))
[1L, 2L, 5L]

Three items were already in my index, persisted on disk. One more can be added like this:

>>> manager.add(4, (-10.0, -10.0, -9.0, -9.0))
<AutoProxy[add] object, typeid 'add' at 0x-483da894>
>>> print manager.intersection((-20.0, -20.0, 20.0, 20.0))
[1L, 2L, 5L, 4L]

The manager synchronizes access so additions and queries from different processes don't clobber each other.

Geo-enabling CouchDB

Interesting. It occurs to me that SQLite, even, is overkill for this purpose. A relational database isn't necessary and the unavoidable duplication of data is undesirable. All that is needed is an R-Tree index and, eventually, the capacity to use spatial operators and predicates in views. The Rtree package that Howard and I have been working on could serve well. It need store only the keys of CouchDB documents, so there's no duplication of spatial data. SpatiaLite (which appeals to me in many ways) uses the same geometry library as Shapely does (GEOS), so a Python query using Rtree and Shapely has essentially the same implementation as an OpenGIS SQL query of SQLite/SpatiaLite. I suggested in comments that GeoCouch might want to take advantage of the GeoJSON group's work, on geometry formatting at least.
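A rough sketch of that idea, in Python 3: the spatial index holds only document keys and bounding boxes, and the documents themselves stay in the document store. To keep the sketch free of dependencies, a brute-force `BBoxIndex` class (hypothetical, not part of Rtree) stands in for the R-Tree and a plain dict stands in for CouchDB. The document ids and geometries are made up for illustration.

```python
class BBoxIndex:
    """Brute-force stand-in for an R-Tree: maps document keys to boxes."""

    def __init__(self):
        self._boxes = {}

    def add(self, key, bbox):
        # bbox is (minx, miny, maxx, maxy); only the key and box are stored.
        self._boxes[key] = bbox

    def intersection(self, bbox):
        # Return the sorted keys of all boxes that overlap the query box.
        minx, miny, maxx, maxy = bbox
        return sorted(
            key for key, (x0, y0, x1, y1) in self._boxes.items()
            if x0 <= maxx and x1 >= minx and y0 <= maxy and y1 >= miny
        )

# A dict standing in for the document store (hypothetical documents).
documents = {
    "doc-a": {"geometry": {"type": "Point", "coordinates": [-105.08, 40.59]}},
    "doc-b": {"geometry": {"type": "Point", "coordinates": [2.35, 48.86]}},
}

index = BBoxIndex()
for key, doc in documents.items():
    x, y = doc["geometry"]["coordinates"]
    index.add(key, (x, y, x, y))

# Query the index for keys, then fetch the documents by key.
hits = index.intersection((-110.0, 35.0, -100.0, 45.0))
matching_docs = [documents[key] for key in hits]
print(hits)
```

With a real R-Tree the `intersection` call would avoid scanning every box, but the division of labor is the same: keys in the index, geometry in the documents.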

Via Simon Willison.

Comments

Re: Geo-enabling CouchDB

Author: Jan Lehnardt

Hi, that's what we are thinking :) CouchDB's external-indexing API is fairly simple. What Volker did is a proof of concept of how to get something up and running as fast as possible. Directly working with RTree would be pretty cool. Are you up for some collaboration? :) Cheers, Jan

Re: Geo-enabling CouchDB

Author: Volker Mische

Hi, I actually tried the R-Tree package first, then I found SpatiaLite. I wanted to keep it as minimal as possible and, as you say, I only need an R-Tree. The reason for switching was file locking. There might be an easy workaround for having one script with write access and another with read-only access; mine was to use SpatiaLite, as it offers equal functionality with not that much overhead (and as Jan already said, I wanted to get something running asap). Another reason was that the spatialindex kind of broke when I added too many points (>300000 iirc), though I haven't tested that with SpatiaLite yet. When I can overcome these issues I'd be happy to give the R-Tree package another chance and see if I can work it out, as it really does everything I need (especially Shapely). About GeoJSON: sounds like a good idea, though I might use only a GeoJSON-like structure to make lat/lon vs. lon/lat more obvious. I don't want to confuse people as I will use lat/lon in the future, and the preferred GeoJSON coordinate order is lon/lat.

Re: Geo-enabling CouchDB

Author: Sean

I can't fault anybody for using a working solution and getting things done, and had no idea you'd tried Rtree already. I'm using it in an application, one that doesn't have to be as concurrent as yours. Do you think it would be helpful to queue Rtree access, perhaps using multiprocessing's Manager class? Having asked the question, I really must try it. About GeoJSON coordinate order: we got it right :) What you get out of following it is easy integration with OpenLayers, Shapely, etc. I also can't deny that if all you have is points, {"lat": 0.0, "lon": 0.0} is most clear ... but only until you encounter different coordinate systems.

Re: Geo-enabling CouchDB

Author: brentp

Knowing very little about CouchDB, except that it uses a B-tree, this seems like a good application for geohash.
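For readers unfamiliar with the idea, here is a minimal geohash encoder in Python 3, written from the public description of the algorithm rather than taken from any particular library. Nearby points tend to share a key prefix (though not always, near cell boundaries), so a sorted index such as a B-tree can answer rough proximity queries with a key range scan. The function name `geohash_encode` is just for illustration.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            # Every 5 bits become one base-32 character.
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)

code = geohash_encode(57.64911, 10.40744, precision=5)
print(code)
```

A longer hash of the same point starts with the shorter one, which is what makes prefix range scans over sorted keys possible.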

Re: Geo-enabling CouchDB

Author: Volker Mische

Sean, yes I know that you got the coordinate order right. I might use something like coordinate_order and make the default order lon/lat, as I want to support more than just points. But that's a detail, I'd rather spend some time on trying the R-Tree package again :)

Zotero Update

The key paragraph from George Mason University's response to the Reuters lawsuit, via Dan Cohen:

The Thomson Reuters Corporation has sued the Commonwealth of Virginia over Zotero, a project based at George Mason University’s Center for History and New Media (CHNM). A free and open-source software initiative, Zotero aims to create the world’s best research tool and has already been adopted by hundreds of thousands of users at countless colleges and research universities. CHNM announces that it has re-released the full functionality of Zotero 1.5 Sync Review to its users and the open source community. As part of its formal response to this legal action, Mason will also not renew its site license for EndNote.

The Geospatial-Military-Industrial Complex blogs

The got geoint? blog aims to be "fun" and "hip". I'm not the only one who's skeptical. There's no rule of "Web 2.0" that says one must be fun and hip. Grim, steely resolve is fine too. Be yourselves. Perhaps "fun and hip" is cover, but having announced that the blog will be fun and hip, the cover is blown ... or is it? An overly-enthusiastic blog could be cover, deep cover, for something else. I'm just saying. Will it emulate fun and hip security-related blogs like Danger Room, Threat Level, and Boing Boing? Will there be cats and squid -- giant squid with low-light video cameras and microphones -- on Fridays? Satellite imagery of James Fee clipping his toenails on the back doorstep in his plaid flannel bathrobe? YouTube videos of Steve Coast running from drone aircraft, followed by a too-funny post about submitting his futile panicked route to OSM? It's not like the military-industrial complex can't afford the fun that would take a blog to the next level.

Happy birthday, Ursula K. Le Guin

79 today. Geography was a key ingredient in her Earthsea novels (which I read again earlier this year, and The Other Wind for the first time). Did you know that her website has maps of Earthsea?

http://farm4.static.flickr.com/3234/2961979462_6353fecd06_d.jpg

Is that crayon? I've been looking at a lot of crayon coloring lately, and would swear it is. Cool! Maps of Archaic Latium from her new novel, Lavinia, are also available from her site.

Comments

Re: Happy birthday, Ursula K. Le Guin

Author: matt

I just reread the series this year too. I loved the maps, finding myself thumbing back to them just about every time locations are mentioned in the novels. The geography is extremely important to the storyline.

ORE 1.0

What Pete Johnston said:

As it happened, I was talking about ORE in a presentation last week (more on that in a follow-up post) and I expressed the opinion then that, leaving aside for a moment the core ORE model of Aggregations and Aggregated Resources, I think one of the significant contributions of ORE may turn out to be its emphasis on what I think of as a "resource-centric" approach and (at least some of) the conventions of the Semantic Web and "Linked Data" communities. In particular, I think this is a potentially important change for the "Open Archives"/"eprint repository" community, where to a large extent - not entirely, but to a large extent - repository developments on the Web have been conditioned by the more "service-oriented" framework of the OAI-PMH protocol and an emphasis on XML and XML Schema. It's also probably fair to say that I don't think the ORE project really started from this perspective, but rather things evolved and shifted - perhaps not always in a straight line! - in this direction as the work proceeded.

The discussion around ORE opened minds all around: I was clued in to Linked Data and had my interest in RDF rekindled; the ORE authors came around to embracing the practices of the Atom community.

Long, lonely tail

I recognize the incongruity of my linking to A-lister Nick Carr's entry about the dwindling long tail:

Chris Anderson's "long tail" remains an elegant and instructive theory, but it already feels dated, a description of the web as we once imagined it to be rather than as it is. The long tail is still there, of course, but far from wagging the web-dog, it's taken on the look of a vestigial organ. Chop it off, and most people would hardly notice the difference. On the web as off it, things gravitate toward large objects. The center holds.

This reminds me of the geospatial/geoweb community's fascination with "top 25" lists and preference for popular blogs over idiosyncratic blogs.

Comments

Re: Long, lonely tail

Author: Allan Doyle

We love your idiosyncratic blog...

Re: Long, lonely tail

Author: Sean

Allan, thank you, but I was referring to under-appreciated C-listers. I get more than my fair share of readers.