Some reactions on the web to the Google Maps Data API announcement ...
Dammit, I told you to pay attention to this stuff.
Google announced a data API for Maps this morning in San Jose. This is basically a CRUD service for storing geodata in “the cloud” that leverages Atom in lots of ways. That didn’t sound very world-shaking to me at first since there aren’t even any basic spatial query functions, but there are some ways in which this could be a game-changing service — if you trust Google to be your data custodian.
I’m disappointed by this one—it’s really just a CRUD store for the KML files used in Google MyMaps. It would be a lot more useful if it let you perform geospatial calculations against your stored map data using some kind of query API—a cloud service alternative to tools like PostGIS.
Geo data can get very large very quickly. Serving it can get expensive. This Data API will help NGOs, non-profits and developers make their data available without breaking the bank. Google's goals for doing this are obvious. If the data is on their servers they can index it easier and make it readily available to their users. There will be concern that Google will have too much of their data, but as long as Google does not block other search engines and allows developers to remove their data I think that this will be a non-issue.
Come and learn how lat49 and geocommons no longer have business models at the Google Booth
The code that will become Shapely 1.1 is mostly in working (but by no means production-ready) order and ready for testing by folks who are interested in prepared geometries or new plugins. Documentation and doctests are not up to date, though the unit tests are largely passing and cover most modules well. It hasn't been tested on Windows at all, but might work with some coaxing.
Coordinates now are stored primarily as Python arrays of floats. GEOS geometries are created as needed and cached, with some positive and negative effects on performance. Python access to coordinates is much more efficient than it has been. Chaining of operations is less efficient because we're unnecessarily creating Python arrays for anonymous geometries. For example, the following code
benchmarks like this (in microseconds):
1.1a1: 873 usec/pass
1.0.12: 957 usec/pass
but this code, with operation chaining
1.1a1: 1931 usec/pass
1.0.12: 271 usec/pass
I have some ideas about how not to load Python arrays in the case of anonymous geometries that should address this issue.
Shapely 1.1 has a new architecture. Most users won't even notice, but methods of geometries now call on entry points of plugin packages. The default plugin package is shapely.geos, but you can switch via a function in shapely.implementation as you need.
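To make the idea concrete, here is a minimal sketch of geometry methods dispatching to a switchable plugin. Everything in it (the registry, register(), use(), the LineString class, and the two toy plugins) is hypothetical and invented for illustration; it is not the code in Shapely or shapely.geos, and the real switching function in shapely.implementation may look quite different.

```python
import math

# Hypothetical registry mapping a plugin name to its entry points.
# None of these names come from Shapely; they are for illustration only.
_plugins = {}
_active = {"name": None}


def register(name, entry_points):
    """Register a plugin as a dict of operation name -> callable."""
    _plugins[name] = entry_points


def use(name):
    """Switch the active plugin for all subsequent geometry method calls."""
    if name not in _plugins:
        raise KeyError("no such plugin: %s" % name)
    _active["name"] = name


class LineString(object):
    """Toy geometry that stores coordinates and delegates operations."""

    def __init__(self, coords):
        self.coords = [tuple(map(float, c)) for c in coords]

    def _dispatch(self, op, *args):
        # Look up the operation in whichever plugin is active right now.
        return _plugins[_active["name"]][op](self, *args)

    @property
    def length(self):
        return self._dispatch("length")


def _segments(geom):
    """Yield consecutive coordinate pairs of the geometry."""
    return zip(geom.coords[:-1], geom.coords[1:])


def _euclidean_length(geom):
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in _segments(geom))


def _taxicab_length(geom):
    return sum(abs(x1 - x0) + abs(y1 - y0)
               for (x0, y0), (x1, y1) in _segments(geom))


register("euclidean", {"length": _euclidean_length})
register("taxicab", {"length": _taxicab_length})
use("euclidean")  # the default plugin in this sketch
```

The geometry class never imports an implementation directly, so calling use("taxicab") changes what line.length computes without touching user code. That is the property the plugin architecture is after.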
There are a few known bugs and changes to look out for:
Adaptation of the geo interface is broken (no asShape()).
Heterogeneous geometry collections are broken.
Prepared geometry module shapely.prepared needs to be moved to shapely.geos.prepared (only available if you have GEOS 3.1 anyhow).
We're now expecting coordinates to come from Numpy as arrays of x's, y's, and z's which means that Shapely integrates with Numpy in the same manner as matplotlib.
To dive in and try it out (with GEOS 3.0+), I suggest you make a fresh Python 2.5 virtualenv and easy_install the shapely.geos package
Being a dependency of shapely.geos, Shapely is automatically fetched. If you'd like to hack on them (patches welcome!), check the code out and install both packages into the virtualenv in development mode:
(try-shapely)$ mkdir src; cd src
(try-shapely)$ svn co http://svn.gispython.org/svn/gispy/Shapely/trunk Shapely
(try-shapely)$ cd Shapely
(try-shapely)$ python setup.py develop
(try-shapely)$ cd ..
(try-shapely)$ svn co http://svn.gispython.org/svn/gispy/shapely.geos/trunk shapely-geos
(try-shapely)$ cd shapely-geos
(try-shapely)$ python setup.py develop
(try-shapely)$ cd ../..
Despite all the changes inside, the interface (with a few exceptions) is the same as 1.0:
(try-shapely)$ python
Python 2.5.2 (r252:60911, Dec 23 2008, 09:29:43)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from shapely.geometry import Point
>>> p = Point(0.0, 0.0)
>>> p.wkt
'POINT (0.0000000000000000 0.0000000000000000)'
Having finally made the time and excuse to develop (virtualenv + Pylons) and deploy (with appengine-monkey) a very simple application on Google's App Engine, I'm thinking a little more about the support GAE could offer for geospatial apps. My keytree package works just fine on GAE (with xml.etree), so there's handy KML reading and writing. Simplejson, minus speed-ups, is good to go with GAE, and therefore so is geojson. Neither contains many files and so there's no harm in including them in each app (Pylons and its dependencies, on the other hand, eat up 80% of your file quota all by themselves). What's lacking is support for geometry operations and predicates, whether in the datastore API (à la GeoDjango/PostGIS), or in Python à la Shapely.
I can't believe that GEOS will ever be a part of GAE, but support for Java opens the door for JTS. I hope Martin Davis or somebody at OpenGeo grabbed one of the offered chances to try Java and will report on it. Would you have to use Jython to get at the JTS operations and predicates, or could you communicate via protocol buffers within your application? I do not know, being new to the platform, but would like to find out. Ideally, this kind of geospatial applications platform would allow you to create your own services for analyzing spatial features and share them with other applications if it wasn't going to provide them itself. Going over HTTP to a JTS-based geometry/topology service hosted in the same cloud would suck.
Look what I found in KML while looking for hyperlinks: a RESTful service description format. A neat find because we need more working examples of service descriptions.
KML's "NetworkLink" isn't the most well-named element. It's not so much of a link, really, but a slot for feature data that is filled in during a user session. The data is provided by a read-only web service described by the content of the network link's "Link" element. There are 3 service flavors. A link with nothing other than a "href" element tells the client to fetch the resource identified by link[@href]. A link with an "onStop" view refresh mode tells the client to append a bounding box parameter to the service's resource identifier. Lastly, a link may give the client a URI template of sorts via the "viewFormat" element.
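The three flavors can be sketched from the client's side. What follows is a simplified illustration, not a complete KML client: the function name request_url, the example.com URLs, and the sample Link fragments are all made up, and only the four bbox template parameters are handled. The default BBOX query string for an "onStop" link without a "viewFormat" is the one given in the KML reference.

```python
import xml.etree.ElementTree as ET

KML_NS = "{http://www.opengis.net/kml/2.2}"

# Query string a client appends for viewRefreshMode "onStop" when the
# Link has no viewFormat element (per the KML reference).
DEFAULT_VIEW_FORMAT = "BBOX=[bboxWest],[bboxSouth],[bboxEast],[bboxNorth]"


def request_url(link_xml, bbox):
    """Return the URL a client would fetch for a KML Link element.

    link_xml is the serialized Link; bbox is (west, south, east, north)
    for the current view. Hypothetical helper, for illustration only.
    """
    link = ET.fromstring(link_xml)
    href = link.findtext(KML_NS + "href").strip()
    mode = link.findtext(KML_NS + "viewRefreshMode", default="never").strip()
    view_format = link.findtext(KML_NS + "viewFormat")

    if view_format is not None:
        # Flavor 3: the server supplies a URI template of sorts.
        template = view_format.strip()
    elif mode == "onStop":
        # Flavor 2: the client appends the default bounding box parameter.
        template = DEFAULT_VIEW_FORMAT
    else:
        # Flavor 1: just fetch the resource identified by href.
        return href

    if not template:
        # An empty viewFormat means nothing is appended.
        return href

    west, south, east, north = bbox
    values = {
        "[bboxWest]": west,
        "[bboxSouth]": south,
        "[bboxEast]": east,
        "[bboxNorth]": north,
    }
    query = template
    for name, value in values.items():
        query = query.replace(name, str(value))

    separator = "&" if "?" in href else "?"
    return href + separator + query
```

For a Link containing only an href of http://example.com/features.kml, the function returns that URL unchanged. Add a viewRefreshMode of onStop and the same Link yields a BBOX query string built from the current view.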
This is the shape of RESTful service description. The server tells the client how to make requests for feature data in the context of a particular KML document. It might use a slightly or radically different description in a different context. It might evolve the description over time within the same KML context to meet changing client needs or engineering constraints.
KML was on the cutting edge here, but has diverged from the web. Using standard links with the same semantics as those in HTML and Atom instead of an "href" would be good. Standardizing on URI templates might be a nice feature for the future, as would better separation of concerns in "Link". The specification of which event triggers a refresh of the "network link", and how often, is mixed up with the service description. Some of the elements seem to me to belong in the network link rather than the link.
I often gripe about the KML format ("Snippet"!), but there's much good in it – the RESTful service description pattern in particular.
Update (2009-05-13): all aboard the snippet train.
Maybe what we should be thinking about is how to advance GeoWeb platforms NOT as proprietary implementations using an alchemy of REST, JSON, RSS, KML - but as Community GeoWeb Platforms that combine international standard geospatial web services with Web 2.0 technologies.
Nice try, but no. We agree about the proprietary flavor of some of the "GeoWebs" out there. Google, in particular, promotes a "GeoWeb" that is dominated by its proprietary browser and its proprietary search. Our role in this "GeoWeb" is to make KML files and expose them through sitemaps, increasing the value of Google's proprietary index in exchange for some return traffic through Google's proprietary browser. Sharecropping, in other words. Still, the OGC's SOA isn't appropriate for a "GeoWeb". There's no web there. Principles of web architecture, standard protocols like HTTP/1.1, and standard formats like JSON, Atom – and even KML – remain the way to geographically enrich the existing web.
The Generic Geometry Library (via Mateusz Loskot) looks interesting. Algorithms operating on coordinate arrays. Optional geometry classes. Shapely's new plugin framework was motivated by my interest in Python (or Java or .NET) implementations, but a plugin using a C or Cython interface to ggl could be neat.
Uploaded to http://pypi.python.org/pypi/Shapely/1.0.12. This release fixes a reference counting bug that caused your Python process to grow when executing code like:
As Randy George finds out, the notion of a "GeoRSS format" is confusing:
There seems to be some confusion about GeoRSS mime type - application/xml, or text/xml, or application/rss+xml, or even application/georss+xml show up in a brief google search? In the end I used a Virtual Earth api viewer to consume the GeoRSS results, which isn’t exactly known for caring about header content anyway. I worked for awhile trying to get the GeoRSS acceptable to OpenLayers.Layer.GeoRSS but never succeeded. It easily accepted static .xml end points, but I never was able to get a dynamic servlet endpoint to work. I probably didn’t find the correct mime type.
There is no GeoRSS media type. Atom's is application/atom+xml. RSS 2.0 uses application/rss+xml, and RSS 1.0 might use application/rdf+xml. You should use one of those depending on whether your GeoRSS elements appear in an Atom, RSS 2.0, or RSS 1.0 feed. Why? Atom has its own processing model. When you request a feed, the server writes application/atom+xml in the response's content-type header to inform you that you'd better use the Atom content processing rules if you want to make sense of the data.
Is there anything to be gained by having a special GeoRSS media type that overrode the Atom or RSS media types?
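The rule above (serve GeoRSS under the media type of whatever feed format hosts it) can be sketched in a few lines. The function name media_type_for_feed is hypothetical, and picking the type by inspecting the document's root element is just one way a servlet might decide what to put in its content-type header.

```python
import xml.etree.ElementTree as ET

# Root element (in ElementTree's {namespace}tag notation) -> media type.
# GeoRSS elements ride along inside one of these host formats.
ROOT_TO_MEDIA_TYPE = {
    "{http://www.w3.org/2005/Atom}feed": "application/atom+xml",
    "rss": "application/rss+xml",  # RSS 2.0
    # RSS 1.0 is RDF; application/rdf+xml is a reasonable choice here.
    "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF": "application/rdf+xml",
}


def media_type_for_feed(xml_text):
    """Return the Content-Type to serve a feed with, based on its root element."""
    root = ET.fromstring(xml_text)
    try:
        return ROOT_TO_MEDIA_TYPE[root.tag]
    except KeyError:
        raise ValueError("not an Atom, RSS 2.0, or RSS 1.0 feed: %r" % root.tag)
```

An Atom feed carrying georss:point elements is still an Atom feed as far as this function is concerned, which is the point: the GeoRSS namespace never influences the result.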