The right profile

What's the difference between GeoJSON and plain old RFC 4627 JSON? It seems like a naïve question, but it's actually rather important. GeoJSON data is JSON data, following all the rules of RFC 4627, but has additional constraints. The most significant of these are:

  • A GeoJSON object must have a 'type' member.

  • The value of that member must be one of the following: "Point", "LineString", "Polygon", "MultiPoint", "MultiLineString", "MultiPolygon", "GeometryCollection", "Feature", or "FeatureCollection".

  • The items of a FeatureCollection will be arrayed in its 'features' member.

  • A Feature's representation in mathematical space is expressed in its 'geometry' member.

  • A geometry's coordinate values are arrayed in X (or easting), Y (or northing), Z order.
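The constraints above are simple enough to check mechanically. What follows is a minimal sketch in Python, not a full validator: the function name looks_like_geojson is made up for illustration, and it inspects only the 'type', 'features', and 'geometry' members named in the list.

```python
# The nine type names listed above.
GEOJSON_TYPES = {
    "Point", "LineString", "Polygon",
    "MultiPoint", "MultiLineString", "MultiPolygon",
    "GeometryCollection", "Feature", "FeatureCollection",
}


def looks_like_geojson(obj):
    """Shallow check of an already-parsed JSON value (hypothetical helper)."""
    if not isinstance(obj, dict):
        return False
    kind = obj.get("type")
    if kind not in GEOJSON_TYPES:
        return False
    if kind == "FeatureCollection":
        # Items of a collection are arrayed in its 'features' member.
        return isinstance(obj.get("features"), list)
    if kind == "Feature":
        # A feature carries its shape in a 'geometry' member.
        return "geometry" in obj
    return True
```

Nothing here requires a special parser: the input is whatever an ordinary JSON parser returns, which is the point made further on about application/json.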

Say you're writing an HTTP client that requests JSON data from a server and processes it. How does it "know" how to find the features in the data or the geometries of those features? If there's exactly one server involved and it provides only GeoJSON data, you can have a coupled, but reliable client-server system where the way the data is processed depends on the address of the server. But if there are multiple JSON APIs involved, the brittleness of this kind of system quickly increases. Just having a JSON schema isn't a silver bullet because schemata aren't standardized in RFC 4627; the question then becomes how does the client "know" where to find the schema?

I've written before that GeoJSON might need its own media type to be a good citizen of Formatland, something like application/vnd.geojson+json. GitHub (for example) does this. Lately, I'm persuaded by people like Mark Nottingham who argue that a flatter space of fewer media types is better. GeoJSON's extra constraints are only about the structure and interpretation of the data; they don't affect parsing of the data at all. An application/json parser need not skip a beat on GeoJSON and a generic application/json application can do plenty of shallow processing on it.

The profile concept from the 'profile' Link Relation Type I-D seems to suit GeoJSON well and I'm rooting strongly for this draft. If and when it is finalized, we can declare GeoJSON to be just a profile of application/json and clients can use the profile declaration to make more sense of data instead of out-of-band information or sniffing and guessing.
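To make the idea concrete, here is a rough sketch of how a client might pick a profile declaration out of an HTTP Link header. Everything specific in it is an assumption for illustration: the function name profile_uris, the example.com profile URI in the usage note, and the simplified header parsing, which does not handle commas or semicolons inside quoted parameter values.

```python
import re

# One link-value: <target> followed by zero or more ";param" pieces.
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)')
# The rel parameter, quoted or not, possibly holding several relation types.
REL_PARAM = re.compile(r'rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)


def profile_uris(link_header):
    """Return the targets of links whose rel includes 'profile' (simplified)."""
    uris = []
    for match in LINK_VALUE.finditer(link_header):
        target, params = match.group(1), match.group(2)
        rel = REL_PARAM.search(params)
        if rel and "profile" in rel.group(1).lower().split():
            uris.append(target)
    return uris
```

Given a header value such as '<http://example.com/profiles/geojson>; rel="profile"', the function returns a one-item list holding that URI, and a client could compare it against the profiles it knows how to process before deciding to look for 'features' or 'geometry' members.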

Profiles could also be a means for making sense of the various profiles of GeoJSON already found in the wild today. I've come up with a short list of five; apologies for any omissions.

Microsoft's OData removes one constraint from GeoJSON and adds two more:

Any GeoJSON value that is used in OData SHOULD order the keys with type first, then coordinates, then any other keys. This improves streaming parser performance when parsing values on open types or in other cases where metadata is not present.

The GeoJSON [GeoJSON] standard requires that LineString contains a minimum number of Positions in its coordinates collection. This prevents serializing certain valid geospatial values. Therefore, in the GeoJSON requirement “For type ‘LineString’, the ‘coordinates’ member must be an array of two or more positions” is replaced with the requirement “For type ‘LineString’, the ‘coordinates’ member must be an array of positions” when used in OData.

All other arrays in GeoJSON are allowed to be empty, so no change is necessary. GeoJSON does require that any LinearRing contain a minimum of four positions. That requirement still holds that LinearRings can show up only in other arrays and that those arrays can be empty.

GeoJSON allows multiple types of CRS. In OData, only one of those types is allowed. In GeoJSON in OData, a CRS MUST be a Named CRS. In addition, OGC CRS URNs are not supported. The CRS identifier MUST be an EPSG SRID legacy identifier.
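The OData rules quoted above translate fairly directly into checks on a parsed geometry. This is only a sketch under stated assumptions: the function name odata_profile_problems is made up, an "EPSG SRID legacy identifier" is assumed to be a string of the form "EPSG:4326", and the key-order check relies on the JSON parser preserving member order (Python's json module does, for dicts in Python 3.7 and later).

```python
import re

# Assumption: an EPSG legacy identifier looks like "EPSG:4326".
EPSG_LEGACY = re.compile(r"^EPSG:\d+$")


def odata_profile_problems(geometry):
    """List ways a parsed geometry departs from the OData rules (hypothetical helper)."""
    problems = []
    keys = list(geometry)  # dicts keep the member order of the parsed JSON text
    if keys[:1] != ["type"]:
        problems.append("SHOULD: 'type' is not the first key")
    elif "coordinates" in geometry and keys[1] != "coordinates":
        problems.append("SHOULD: 'coordinates' does not follow 'type'")
    crs = geometry.get("crs")
    if crs is not None:
        if crs.get("type") != "name":
            problems.append("MUST: CRS is not a Named CRS")
        else:
            name = str(crs.get("properties", {}).get("name", ""))
            if not EPSG_LEGACY.match(name):
                problems.append("MUST: CRS name is not an EPSG legacy identifier")
    return problems
```

Note that an empty LineString produces no complaint here, which matches the relaxed requirement: the sketch never looks at how many positions 'coordinates' holds.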

Madrona's KML-ish nested feature collections are another profile.

Leaflet examples, like that for L.GeoJSON, hint at a profile where features have 'title' and 'description' properties.

Pleiades has its own profile of GeoJSON in which we add a representative point member to features, an anchor for labels and popups. The Pleiades profile also has 'title' and 'description' properties.

The 'geo' member in Twitter's 1.1 API uses a classic profile of GeoJSON, the "surely you mean latitude, longitude order!" profile.
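A client that knows it is dealing with a latitude-first profile can normalize the data before handing it to ordinary GeoJSON code. A small sketch, assuming a Point whose first two coordinate values are latitude then longitude; the function name to_xy_order is hypothetical.

```python
def to_xy_order(point):
    """Return a copy of a latitude-first Point with coordinates in X, Y order."""
    coords = list(point["coordinates"])
    lat, lon = coords[0], coords[1]
    # Swap the first two values; leave any third (Z) value where it is.
    return {**point, "coordinates": [lon, lat] + coords[2:]}
```

The catch is that nothing in the data itself says which order was used, so the swap is only safe when the profile is declared or otherwise known out of band.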

In anticipation of the 'profile' Link Relation Type I-D getting past the IETF and IANA hurdles, I've created a strawman proposal for how GeoJSON might use it: https://gist.github.com/3853624. Please check it out, comment, or fork it if you've got your own profile of GeoJSON and are interested in expressing it using profile links.