RESTful Feature APIs
Update (2007-04-10): a related demo.
Update: added examples of query results.
My previous post about WxS, RPC, and REST raised a few questions about whether queries fit into a RESTful GIS. The answer is: yes, queries remain indispensable. Indexes are a valuable part of your GIS, and a query API gives web agents access to those indexes.
Consider a very minimal municipal GIS which tracks properties, or parcels. Each parcel has many attributes, and the GIS indexes, at the very least, the following: a unique parcel id, the name of the parcel's owner, and the geospatial footprint of the parcel. These indexes allow a user to efficiently find all properties owned by an individual, or find all properties potentially impacted by construction along a particular path.
In a RESTful GIS, each parcel is a resource, and has a URL like:
http://example.org/parcels/[id]
Dereferencing the URL http://example.org/parcels/1 returns a representation, JSON in this case:
{ "id": 1, "owner": "Homer Simpson", "geometry": { "type": "Polygon", "coordinates": [[...], ...] }, ... }
The "parcels" feature type is itself a resource. A useful representation of this resource would be a collection that includes URIs for, and data about, individual parcel resources -- all marshalled directly out of the GIS's indexes:
{ "parcels": [ { "id": 1, "uri": "http://example.org/parcels/1", "owner": "Homer Simpson", "bbox": [1000.0, 1000.0, 1001.0, 1001.0] }, { "id": 2, "uri": "http://example.org/parcels/2", "owner": "Ned Flanders", "bbox": [1001.0, 1001.0, 1002.0, 1002.0] }, ... ] }
That collection might easily include the precise footprints of properties, but we'll simply consider bounding boxes here.
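A minimal sketch of that marshalling step, in Python. The in-memory PARCEL_INDEX dictionary and the function names are hypothetical stand-ins for whatever index your GIS actually keeps; only the ids, owners, bounding boxes, and base URL come from the examples above.

```python
import json

# Hypothetical stand-in for the GIS's indexes: parcel id -> (owner, bbox).
# The two entries mirror the collection example above.
PARCEL_INDEX = {
    1: ("Homer Simpson", (1000.0, 1000.0, 1001.0, 1001.0)),
    2: ("Ned Flanders", (1001.0, 1001.0, 1002.0, 1002.0)),
}

BASE_URI = "http://example.org/parcels"


def parcel_entry(parcel_id, owner, bbox):
    """Build one collection entry from indexed values only."""
    return {
        "id": parcel_id,
        "uri": f"{BASE_URI}/{parcel_id}",
        "owner": owner,
        "bbox": list(bbox),
    }


def parcels_collection(index):
    """Marshal every indexed parcel into the collection representation."""
    return {
        "parcels": [
            parcel_entry(pid, owner, bbox)
            for pid, (owner, bbox) in sorted(index.items())
        ]
    }


if __name__ == "__main__":
    print(json.dumps(parcels_collection(PARCEL_INDEX), indent=2))
```

Nothing here touches the full parcel records: every value in the collection is read straight from the index, which is what keeps the representation cheap to produce.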
A query API should return criteria-based subsets of that collection, leveraging the system's indexes. Which properties are going to be condemned to make way for the new monorail?
GET /parcels/?bbox=0,0,2000,2000
The answer is: the parcels with URIs http://example.org/parcels/42 and http://example.org/parcels/83:
{ "parcels": [ { "id": 42, "uri": "http://example.org/parcels/42", "owner": "Moe Szyslak", "bbox": [...] }, { "id": 83, "uri": "http://example.org/parcels/83", "owner": "Kwik-E-Mart Corporation", "bbox": [...] } ] }
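Here is a rough sketch of how a server might answer a bbox query like the one above, in Python. It assumes the bbox parameter is a comma-separated minx,miny,maxx,maxy string, and it uses a linear scan over a hypothetical in-memory list where a real GIS would consult its spatial index. The two sample parcels are the ones from the collection example, and the query window in the usage note is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical sample data, copied from the collection example earlier on.
PARCELS = [
    {"id": 1, "owner": "Homer Simpson", "bbox": [1000.0, 1000.0, 1001.0, 1001.0]},
    {"id": 2, "owner": "Ned Flanders", "bbox": [1001.0, 1001.0, 1002.0, 1002.0]},
]


def parse_bbox(url):
    """Extract (minx, miny, maxx, maxy) from a ?bbox=... query string."""
    query = parse_qs(urlsplit(url).query)
    minx, miny, maxx, maxy = (float(v) for v in query["bbox"][0].split(","))
    return minx, miny, maxx, maxy


def intersects(a, b):
    """True when two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_parcels(url, parcels=PARCELS):
    """Return the subset of the collection whose bbox meets the query window."""
    window = parse_bbox(url)
    return {
        "parcels": [
            {
                "id": p["id"],
                "uri": f"http://example.org/parcels/{p['id']}",
                "owner": p["owner"],
                "bbox": p["bbox"],
            }
            for p in parcels
            if intersects(p["bbox"], window)
        ]
    }
```

Calling query_parcels("/parcels/?bbox=1001.5,1001.5,3000,3000") on this sample data returns a collection holding only parcel 2, in the same shape as the full collection.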
Which properties are suffering catastrophic loss in value?
GET /parcels/?adjacent(owner(Simpson))
The answer is: the parcel with URI http://example.org/parcels/2:
{ "parcels": [ { "id": 2, "uri": "http://example.org/parcels/2", "owner": "Ned Flanders", "bbox": [...] } ] }
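This second query composes two indexes: an owner lookup, then a spatial test against the result. A sketch in Python, under loud assumptions: the owner index is approximated by a substring match, "adjacent" is approximated by bounding boxes that overlap or touch (a real GIS would test the precise footprints), and the PARCELS dictionary and function names are hypothetical.

```python
# Hypothetical stand-in for the owner and spatial indexes; the two parcels
# and their bounding boxes come from the collection example earlier on.
PARCELS = {
    1: {"owner": "Homer Simpson", "bbox": (1000.0, 1000.0, 1001.0, 1001.0)},
    2: {"owner": "Ned Flanders", "bbox": (1001.0, 1001.0, 1002.0, 1002.0)},
}


def owned_by(name, parcels=PARCELS):
    """Owner lookup, approximated here by a substring match."""
    return {pid for pid, p in parcels.items() if name in p["owner"]}


def touches(a, b):
    """True when two (minx, miny, maxx, maxy) boxes overlap or share an edge or corner."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def adjacent_to_owner(name, parcels=PARCELS):
    """Parcels not owned by `name` whose bbox touches one that is."""
    owned = owned_by(name, parcels)
    neighbours = set()
    for pid in owned:
        for other, p in parcels.items():
            if other not in owned and touches(parcels[pid]["bbox"], p["bbox"]):
                neighbours.add(other)
    return {
        "parcels": [
            {
                "id": n,
                "uri": f"http://example.org/parcels/{n}",
                "owner": parcels[n]["owner"],
                "bbox": list(parcels[n]["bbox"]),
            }
            for n in sorted(neighbours)
        ]
    }
```

With this sample data, adjacent_to_owner("Simpson") returns a collection containing only parcel 2, since the two bounding boxes meet at a corner.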
The specific query parameters or URL templates to use are an implementation detail that I won't get into here (OpenSearch seems promising).
The gist of all this is that a RESTful feature query returns key, indexed data about feature resources along with URIs for the resources themselves, in the same way that a Google search returns data from its index, with links, instead of dumping the entire Web into your browser.
Comments
Re: RESTful Feature APIs
Author: Paul Ramsey
What happens when I have 2000000 parcels? Unlike web pages, "databasey" resources don't get automatically scaled to be of reasonable size. Do I end up with a system where I only want people to talk to the data via the query mechanism, because anything else would be too clumsy? At that point, who cares that I have resources?
Re: RESTful Feature APIs
Author: Sean
Paul, I don't understand what you mean by automatic scaling, and I don't understand quite where your concern about large quantities comes from. Are you talking about querying against millions of parcels, or about rendering millions of parcels into an image?
Re: RESTful Feature APIs
Author: Paul Ramsey
I mean, when I have 2M parcels, doing GET /parcels no longer returns me something remotely useful -- it will be too big, too slow, or both. I am forced to use the query API to do any useful action with the data.
Re: RESTful Feature APIs
Author: Christopher Schmidt
Paul: I don't see a URL of /parcels/ on its own anywhere in this post. There's no need to get a list of everything: when you want a larger-than-one subset of the data, you do a query via the query mechanism. The point is that each parcel has *a* URL: so when I query ?bbox=0,0,10,10, I get a list of parcels back, which I can always address in the future to get all the information about *a* feature back. So the answer to your question is probably "yes": You always find the list of parcels you're interested in via the query mechanism, but once you have it, you can put it anywhere else you want. At least, that's what I understand. I don't have any GIS data to speak of. :)
Re: RESTful Feature APIs
Author: Jason Birch
I actually had the same question, and picked it up from this quote: "The 'parcels' feature type is itself a resource. A useful representation of this resource would be a collection that includes URIs for, and data about, individual parcel resources" As much as I agree that it would ultimately be most useful if the parcels resource returned a complete list (it kinda reduces the discoverability of the resources if you don't) I can't see this working for me even with a small volume of parcels (35,000). In my playing around, I think what I'm going to do is return an HTML formatted page, with OpenSearch links, and also with form elements describing all of the search APIs. The only restriction with the OpenSearch links that I can see is that they assume you apply a different URI for each content-type that you return rather than using an appropriate "Accept" header. The only workaround that I can see is hacking an "&force_type=application/x-json" (or whatever the correct content type is) parameter at the end of the string. This seems a bit RESTless though... I guess if this is combined with proper header sniffing for intelligent clients, it's acceptable though? I think that with this strategy it would be relatively easy to provide JSON, GeoRSS, and HTML (microformat too) versions of the individual resources. I think I'll also include alternate links in the HTML representations, pointing to the JSON and GeoRSS representations, and maybe also to an image/png representation for a quickie map of the parcel. Hmm. For the JSON representation and search results, what kind of representation would work best for GeoJSON? EWKT?
Jason
Re: RESTful Feature APIs
Author: Sean
You're right that the HTML representation of a feature type shouldn't be a list of thousands or millions of links, but if you want to get in Google's spatial index of the Web you will need a representation that does list everything. In my case, a KML variant serves this purpose, and the default HTML is simply a list of links to subsets of the full listing. The JSON content type is application/json, and consensus is building around geometry coordinates expressed as arrays (or arrays of arrays) of numeric values instead of WKT. I don't recommend getting too wound up in content negotiation until we have user agents that actually show a preference. Google Earth, for example, doesn't give KML a higher q value than it gives HTML.
Paging anyone?
Author: Mark Fredrickson
Why not just page the results? This is the technique that keeps HTML pages to a manageable size. Provide a URI for the next page of data as part of the collection, and client systems should be able to pull down the next page without too many problems. That's the great thing about URIs in a RESTful implementation - they can be used in countless ways because they're well understood. Of course, this is in the abstract, but I think it's a good place to start looking.
Re: RESTful Feature APIs
Author: Mark Fredrickson
As an example of paging, take a look at the ATOM spec (the canonical REST reference implementation): http://bitworking.org/projects/atom/draft-ietf-atompub-protocol-14.html#partial-lists
Re: RESTful Feature APIs
Author: Sean
Yeah, that's what I've been looking at too.