Does Pleiades have an API?

This is becoming a frequently asked question, and as I work on the definitive answer for the Pleiades FAQ, I'll think out loud about it here in my blog. Does Pleiades have an API? In truth, it has a number of APIs, some good and some bad. Does it have an HTTP + JSON API like all the cool kids do? No. Well, yes, sort of.

Before I get into tl;dr territory, I'll write down one of the guiding principles of the Pleiades project:

Data is usually better than an API.

It's not that we're uncomfortable with interfaces in Pleiades. Our application is based on Zope and Plone, so you know it has all kinds of interfaces under the hood. I'm even a bit of a geek about designing nice APIs (see also Shapely, Fiona, etc). It's just that data is better ... usually.

By "data" above, I mean a document or file or sequence of bytes containing related information, in bulk. The entire text of a book, for example, is better to have than an API for fetching the N-th sentence on page M. All the coordinates of a simple feature linestring (as GeoJSON, say) are better to have than an API for getting the N-th coordinate value of the M-th vertex of a line object. Given all the data, we're not bound to a particular way of indexing and searching it and can use the tools of our choice. APIs are typically chatty, slow, and pointlessly different from others in the same line of business. Subbu Allamaraju goes deep into the trouble of working with inconsistent systems in "APIs are a Pain", and with more hard-earned wisdom than I have, so I won't pile on here. Data is better ... usually.
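To make the linestring example concrete, here is a minimal Python sketch. The coordinates are made up for illustration and are not Pleiades data, and the function names are my own. With the whole geometry in hand, the N-th coordinate value of the M-th vertex is plain indexing, and so is anything else we care to compute.

```python
import json

# A made-up GeoJSON LineString (not Pleiades data), as it might arrive in bulk.
doc = '{"type": "LineString", "coordinates": [[12.5, 41.9], [13.0, 41.6], [13.4, 41.2]]}'

line = json.loads(doc)
coords = line["coordinates"]


def coordinate_value(coords, m, n):
    """Return the n-th coordinate value of the m-th vertex (both 0-based)."""
    return coords[m][n]


def bounds(coords):
    """Compute (minx, miny, maxx, maxy) ourselves; no bounds API required."""
    xs = [x for x, y in coords]
    ys = [y for x, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))
```

Neither function needs a round trip to a server, and nothing stops us from handing the same coordinates to another tool of our choice instead.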

An API, and here I mean "web API", can be better in the following and probably not exhaustive list of situations:

  • Sheer mass of data making dissemination practically impossible

  • Rapidly changing data making dumps and downloads out of date

  • Desire to control access to individual data records

  • Desire to monetize data (ads, for example)

  • Desire to impose a certain point of view

  • Desire to track use

Tracking use lets us tweak the experience of users. "People who viewed record M might also be interested in record N" and the like. It doesn't have to be nefarious tracking, just nudging users into useful and mutually profitable patterns.

Only one of these situations is very relevant to Pleiades and so we're not designing APIs to sort them all out like other enterprises must. The RDF and KML serializations of the entire 34,000-place Pleiades dataset are not large by modern standards and don't change very rapidly. An application (like the Pelagios Graph Explorer or GapVis) that fetched and cached them once a day could stay quite up to date. The number of Pleiades contributors is growing, but they are primarily enriching existing places; I don't expect Pleiades to ever become so large that those files couldn't be transferred in less than a minute on a good internet connection.

We control access to data that's in development, yes, but the locations, names and places that pass through review into a published state are completely open access and not private to any individual user or group of users. In only one part of Pleiades are we concerned about controlling a narrative through an API: the slideshow that plays on the Pleiades home page uses an API that stumbles through the most recently modified places and progressively mixes in more randomly selected ones.
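The fetch-and-cache-once-a-day pattern mentioned above might look something like the following sketch. It is not code from the Pelagios Graph Explorer or GapVis; the function names and the idea of passing in a `fetch` callable are my own assumptions, chosen so that the download step can be swapped for whatever HTTP client you prefer.

```python
import os
import time

ONE_DAY = 24 * 60 * 60  # seconds


def is_stale(path, max_age=ONE_DAY, now=None):
    """True if the cached file is missing or older than max_age seconds."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > max_age


def refresh(path, fetch, max_age=ONE_DAY, now=None):
    """Rewrite the cached file using fetch() only when it is stale.

    fetch is any zero-argument callable returning bytes (hypothetical:
    for example, a small wrapper that downloads one of the dumps).
    Returns True if the cache was refreshed, False if it was still fresh.
    """
    if not is_stale(path, max_age, now):
        return False
    data = fetch()
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, path)  # swap into place so readers never see a partial file
    return True
```

Run from a daily cron job or at application start, `refresh` touches the network at most once per day and everything else reads the local copy.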

Instead of fancy APIs, then, we have boring CSV, KML, and RDF downloads. The shapefile format, by the way, is inadequate for our purposes. Information will be lost in making a shapefile from the Pleiades model (any number of locations and names per place) and we're going to let people decide for themselves what to give up if they want this. The downloads are updated daily.
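Here is a rough sketch of what "deciding what to give up" means when squeezing a place into flat, one-geometry-per-record rows of the kind a shapefile wants. The `place` dictionary and its field names are hypothetical and much simpler than the real Pleiades model; the two functions show two of the many possible compromises.

```python
# A hypothetical, simplified place: any number of names and locations.
place = {
    "id": "example-1",
    "names": ["Name A", "Name B"],
    "locations": [
        {"title": "Location 1", "point": (13.0, 41.5)},
        {"title": "Location 2", "point": (13.1, 41.6)},
    ],
}


def flatten_first(place):
    """One flat record per place: keep only the first name and location."""
    x, y = place["locations"][0]["point"]
    return {"id": place["id"], "name": place["names"][0], "x": x, "y": y}


def flatten_per_location(place):
    """One flat record per location: every location survives, but the id
    is repeated and the names are squashed into a single text field."""
    rows = []
    for loc in place["locations"]:
        x, y = loc["point"]
        rows.append({
            "id": place["id"],
            "names": "; ".join(place["names"]),
            "location": loc["title"],
            "x": x,
            "y": y,
        })
    return rows
```

The first choice silently drops "Name B" and "Location 2"; the second keeps them but duplicates the place across rows. Neither is obviously right, which is why that choice is left to whoever wants the shapefile.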

Pleiades also has JSON, KML, and RDF data for any particular place: data that is current and linked from every page (http://pleiades.stoa.org/places/422987, for example) with HTML <link> and <a> elements. It's not an API ... or is it? The map on the page about Norba gets its overlay features from those very same JSON and KML resources. Looking at it in this way, you could say we do have an API here: the web is the API. When I finally finish the Pleiades implementation of OpenSearch (with Geo extension by Andrew Turner), I can replace Plone's crufty search API with even more consistency and interoperability from The Web as API.
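To illustrate "the web is the API", here is a small sketch of how a client could discover a place's alternate representations by reading the page's <link> elements, using only Python's standard library. The HTML fragment is hypothetical: the `rel="alternate"` convention, the media types, and the hrefs are assumptions for the sake of the example, not a copy of the real page markup.

```python
from html.parser import HTMLParser


class AlternateLinks(HTMLParser):
    """Collect (type, href) pairs from <link rel="alternate"> elements."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").split()
        if "alternate" in rels and a.get("href"):
            self.found.append((a.get("type"), a["href"]))


# Hypothetical page fragment; the hrefs and media types are assumptions.
html = """
<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/json" href="/places/422987/json">
<link rel="alternate" type="application/vnd.google-earth.kml+xml" href="/places/422987/kml">
</head><body></body></html>
"""

parser = AlternateLinks()
parser.feed(html)
alternates = dict(parser.found)  # media type -> href
```

A client that follows links like this needs no Pleiades-specific documentation beyond the media types it understands, which is the whole appeal.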

Pleiades doesn't need the same kind of API that Twitter or Facebook have (obviously) or that OpenStreetMap has. We simply don't have anywhere near that much data, that much churn or (in the Twitter/Facebook case) that much need to control what you access.

Comments

Re: Does Pleiades have an API?

Author: josh livni

Another reason for an API would be a desire to allow adding new data or modifying a subset of the data, using different tools than your default web ui, no?

Re: Does Pleiades have an API?

Author: Sean

Maybe ... edits change everything (so to speak), so I'll have to mull that over. There are certainly other ways to incorporate changes that don't involve web APIs: diff and patch, for example, or git.