ORE 1.0

What Pete Johnston said:

As it happened, I was talking about ORE in a presentation last week (more on that in a follow-up post) and I expressed the opinion then that, leaving aside for a moment the core ORE model of Aggregations and Aggregated Resources, I think one of the significant contributions of ORE may turn out to be its emphasis on what I think of as a "resource-centric" approach and (at least some of) the conventions of the Semantic Web and "Linked Data" communities. In particular, I think this is a potentially important change for the "Open Archives"/"eprint repository" community, where to a large extent - not entirely, but to a large extent - repository developments on the Web have been conditioned by the more "service-oriented" framework of the OAI-PMH protocol and an emphasis on XML and XML Schema. It's also probably fair to say that I don't think the ORE project really started from this perspective, but rather things evolved and shifted - perhaps not always in a straight line! - in this direction as the work proceeded.

The discussion around ORE opened minds all around: I was clued in to Linked Data and had my interest in RDF rekindled; the ORE authors came around to embracing the practices of the Atom community.

Long, lonely tail

I recognize the incongruity of my linking to A-lister Nick Carr's entry about the dwindling long tail:

Chris Anderson's "long tail" remains an elegant and instructive theory, but it already feels dated, a description of the web as we once imagined it to be rather than as it is. The long tail is still there, of course, but far from wagging the web-dog, it's taken on the look of a vestigial organ. Chop it off, and most people would hardly notice the difference. On the web as off it, things gravitate toward large objects. The center holds.

This reminds me of the geospatial/geoweb community's fascination with "top 25" lists and its preference for popular blogs over idiosyncratic ones.

Comments

Re: Long, lonely tail

Author: Allan Doyle

We love your idiosyncratic blog...

Re: Long, lonely tail

Author: Sean

Allan, thank you, but I was referring to under-appreciated C-listers. I get more than my fair share of readers.

The hypertext constraint

Roy Fielding:

There are probably other rules that I am forgetting, but the above are the rules related to the hypertext constraint that are most often violated within so-called REST APIs. Please try to adhere to them or choose some other buzzword for your API.

A must read for all "REST API" designers.

Second-guessing Project Bamboo

How can we advance arts and humanities research through the development of shared technology services?

This is the question that the Bamboo Project seeks to answer. Last week, colleagues and folks I met at THATCamp were at the second Bamboo workshop, and I've been following their comments on Twitter. Count me as a skeptic. Keep in mind that I'm also a bit of a carpetbagger: a programmer with a math and physics background working in the humanities, not a humanities researcher. Unlike most digital humanities researchers, however, I have had experience working with broken legacy architectures.

The creation of computer network architectures is fraught with social and technical difficulties and architects are almost certain to get it wrong the first time (Gopher, for example). They might even get it wrong the second time around (WS-*, for example). The GIS community's standard service architecture (OGC W*S) is terribly wrong for the Web, and I fear a repeat in the digital humanities. It's not that the designers lack for domain expertise or goodwill, just that it's a perilous enterprise. The architectures we design today will likely be wrong for, or feel wrong for, or smell wrong to, the digital humanities programmers and integrators of the future. There's a lot of risk for Bamboo here, especially if the disdain for the "Wild West" Web evident in the proposal leads to needless reinvention.

More important than service infrastructure for the advancement of these fields is education and skills. Let's graduate students and future scholars with competency in computing, including:

  • XML
  • High level programming languages (Python, Ruby, etc)
  • Web architecture, HTTP, HTML
  • Formal logic
  • RDF (and RDFa)

They'll be perfectly equipped to design exactly the services they need based on proven architectures like the Web.

Update (2008-10-20): via Tom Elliot, I realize I've rediscovered some of the skill set proposed by Lisa Spiro.

Update (2008-10-21): Hugh Cayless reports on the second workshop.

Beers and Python GIS in Praha

One of the highlights of this trip was the chance to meet fellow Python and GIS programmer, and blogger, Jáchym Čepický.

http://farm4.static.flickr.com/3235/2885863662_3b4aba9659_d.jpg

I'm not sure how many readers of Planet OSGeo understand Czech and know that Jáchym went through a scary episode in life this past summer. He gave me a sobering recap and then we had some less sobering toasts to his good health at Pivovarský klub. In a city served almost exclusively by single-beer pubs, Pivovarský klub is an oasis for the beer enthusiast. I had several beers from the tap that I never saw anywhere else -- beers far more fresh and flavorful than the ubiquitous Gambrinus.

Comments

Re: Beers and Python GIS in Praha

Author: jachym

Hi Sean, I hope all your journeys are going well. Btw, see my "transformations" from this summer http://les-ejk.cz/2008/07/promeny/ :-) See you somewhere sometime again Jachym P.S. I personally think Gambrinus is a good beer. It is good just for normal drinking - do not forget, we (Czechs) consume the biggest amount of beer per head in the world. But there are others, more tasty ;-)

Re: Beers and Python GIS in Praha

Author: Kristian

Whoa, you don't need to read Czech to understand that :( Hope it was "just" a scare. I heard Jáchym's talk on Python WPS at FOSSGIS in April; it was very interesting and he seems like a very nice guy - speaks way better German than me as well :) Now you're mentioning beer, have you seen this? ;)

Friends don't let friends use EndNote, part 2

I've only seen mention of this via Twitter, and I haven't discussed it with the developers or seen any statement from them (on advice of lawyers, no doubt): Reuters is suing GMU for no real reason other than Zotero's competition with, and besting of, EndNote. Is this the kind of company you want to support, Steve? One that seems ready to sue its users for trying to free their data from its proprietary format?

More from Bruce D'Arcus, and also Hugh Cayless.

Comments

Re: Friends don't let friends use EndNote, part 2

Author: Steven Citron-Pousty

That is really shitty - I can still like the software without loving the company - especially since the company is now Reuters. I thought they were owned by the people that produced Science Citation Index. Has Reuters bought them as well? I despised that company as much as Springer-Verlag. Such a frickin rip-off, but I loved their content. Anyway, all I really need is a way to export my EndNote library and I would be off them. I really do like my older versions of EndNote - sniff sniff...

OpenLayers and 900913

Thanks to some hand holding from Chris and Josh, Pleiades now has Spherical Mercator maps using the Google physical geography layer as a stand-in for our ideal ancient world base map. See http://pleiades.stoa.org/places/639166. Coordinate transformation is provided by the code I blogged yesterday.

Comments

Re: OpenLayers and 900913

Author: Jason Birch

Doesn't seem to work in IE7 Sean...

Re: OpenLayers and 900913

Author: Sean

You're right, and not just because I hate IE.

Adding pyproj to a buildout

Pyproj is Jeffrey Whitaker's Python interface to PROJ.4. I'm no longer interested in other Python projection packages. Its most interesting feature is interoperability with packages (such as Shapely) that use the Numpy array interface. It depends on Cython, which makes it a bit tricky to include in a buildout: you must install Cython into your buildout's python, not as an egg, and make the pyproj egg only after this step is finished. Like this:

  [buildout]
  parts =
    cython-src
    cython-install
    pyproj

  [cython-src]
  recipe = hexagonit.recipe.download
  url = http://cython.org/Cython-0.9.8.1.1.tar.gz

  [cython-install]
  recipe = iw.recipe.cmd
  on_install = true
  cmds =
    cd ${buildout:directory}/parts/cython-src/Cython-0.9.8.1.1
    ${python:executable} setup.py install

  [pyproj]
  recipe = zc.recipe.egg:eggs
  index = http://atlantides.org/eggcarton/index
  eggs = pyproj

You might be able to pull pyproj off PyPI, but here I am using my own index. Once built, you can try it out using zopepy:

>>> from pyproj import Proj
>>> defn_900913 = """
... +proj=merc +a=6378137 +b=6378137
... +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0
... +units=m +nadgrids=@null +no_defs
... """
>>> proj_900913 = Proj(defn_900913)
>>> lonlat = (25.0, 25.0)
>>> proj_900913(*lonlat)
(2782987.269831839, 2875744.6243522423)

I like that projection definitions can be split across lines of the screen.
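As a cross-check on the numbers above, here is a minimal sketch of the spherical Mercator formulas in plain Python, assuming the sphere radius given by the +a and +b parameters in the definition. The function names forward and inverse are mine, not part of pyproj, and this is no substitute for PROJ.4.

```python
import math

# Sphere radius in meters, taken from +a=6378137 +b=6378137 above.
RADIUS = 6378137.0


def forward(lon, lat):
    """Project longitude/latitude (degrees) to spherical Mercator meters."""
    x = RADIUS * math.radians(lon)
    y = RADIUS * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y


def inverse(x, y):
    """Unproject spherical Mercator meters back to longitude/latitude (degrees)."""
    lon = math.degrees(x / RADIUS)
    lat = math.degrees(2.0 * math.atan(math.exp(y / RADIUS)) - math.pi / 2.0)
    return lon, lat


x, y = forward(25.0, 25.0)
lon, lat = inverse(x, y)
```

The projected pair should agree with the pyproj result to well under a centimeter, and the inverse should recover the input up to floating point rounding.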

Geojson and pyproj interop

I've just finished writing a module that supports projection of objects that provide the geometry part of the Lab's geo interface: proj.py. Maybe you'll find it useful. What it does is take a geojson geometry (or Shapely geometry) and return a new projected geojson geometry object:

>>> from pleiades.openlayers.proj import Transform, PROJ_900913
>>> transform = Transform(PROJ_900913)

>>> # Point, forward
>>> from geojson import Point
>>> point = Point(coordinates=(25.0, 25.0))
>>> fwd = transform(point)
>>> fwd
Point(coordinates=(2782987.269831839, 2875744.6243522423))

>>> # Point, inverse
>>> inv = transform(fwd, inverse=True)
>>> inv
Point(coordinates=(24.999999999999996, 24.999999999999996))

>>> # Line, forward
>>> from geojson import LineString
>>> line = LineString(coordinates=((25.0, 25.0), (30.0, 30.0)))
>>> fwd = transform(line)
>>> fwd
LineString(coordinates=((2782987.269831839, 2875744.6243522423), ...))

>>> # Line, inverse
>>> inv = transform(fwd, inverse=True)
>>> inv
LineString(coordinates=((24.999999999999996, 24.999999999999996), ...))

There's also an object hook if you'd like the transform to yield instances of your own classes.
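To show the idea without the Pleiades package, here is a rough sketch of how such a transform might walk a geo interface mapping. It is not the code in proj.py: the names mercator, transform_coords, transform and MyGeometry are hypothetical, and the spherical Mercator formula stands in for a pyproj call.

```python
import math

RADIUS = 6378137.0  # spherical Mercator radius in meters


def mercator(lon, lat):
    """Stand-in projection function: degrees in, meters out."""
    x = RADIUS * math.radians(lon)
    y = RADIUS * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y


def transform_coords(coords, func):
    """Apply func to every (x, y) pair in a possibly nested coordinate sequence."""
    if isinstance(coords[0], (int, float)):
        return tuple(func(*coords[:2]))
    return tuple(transform_coords(part, func) for part in coords)


def transform(geom, func, object_hook=dict):
    """Return a new geometry; object_hook builds the result from a mapping."""
    # Accept either an object providing __geo_interface__ or a plain mapping.
    mapping = getattr(geom, "__geo_interface__", geom)
    projected = {
        "type": mapping["type"],
        "coordinates": transform_coords(mapping["coordinates"], func),
    }
    return object_hook(projected)


class MyGeometry:
    """Hypothetical user class, built by passing it as the object hook."""

    def __init__(self, mapping):
        self.type = mapping["type"]
        self.coordinates = mapping["coordinates"]


point = {"type": "Point", "coordinates": (25.0, 25.0)}
line = {"type": "LineString", "coordinates": ((25.0, 25.0), (30.0, 30.0))}

fwd_point = transform(point, mercator)
fwd_line = transform(line, mercator)
custom = transform(point, mercator, object_hook=MyGeometry)
```

The input mappings are left untouched, and the hook sees a plain dict, so any callable that accepts one will do.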

kml:description considered harmful

KML lacks two important elements. It has neither 1) a simple and universal element equivalent to Dublin Core description (such as {http://purl.org/dc/elements/1.1/}description), nor 2) an excellent element for encapsulating or linking to rich content accompanying placemarks. Instead of the first, we have kml:Snippet. Snippet! It's somewhere in between a title and a description and that's no place to be. Instead of the second, we have kml:description, which supports an under-specified subset of HTML and other media including video.

The kml:description element is not about metadata anymore, it's about adding rich content -- charts, photos, video -- to the popup windows of geographic browsers. Under-specification of the content of kml:description is a problem. Which HTML elements are valid? Why shouldn't all be valid? Having to choose between escaping HTML and using CDATA blobs is another problem. XHTML in a description wouldn't break the KML document; why shouldn't we be able to use XHTML without CDATA? Lack of support for dynamic content is yet another problem. Why can't we specify rich content by a URL, to be fetched only as needed and refreshed only if modified? Limitations of KML's description element and its ties to the original implementation hold us back. We need something better.

The Atom community wrestled with the same issues and came up with the atom:content element. An atom:content element may contain text, HTML, XHTML, XML, encoded binary media, or provide a URL to another document or media resource. When prompted, Atom processors (such as a news reader) display this content to a user. All HTML or XHTML elements are valid. CDATA blobs are unnecessary and not allowed. Atom's content element comes with clear processing instructions: HTML must be escaped, and should be valid within a <div>. The immediate child element of XHTML content must be a <div>. While guided by implementations, none of this depends on unspecified behavior of a particular Atom processor.

The "src" attribute of the atom:content element provides developers with more options. It allows you to decouple semi-static data from dynamic data. The locations and identifiers of physical instruments deployed in the field (river gauges, ASOS, you name it) can be decoupled from their nearly real-time observations and measurements. Serve a representation of the deployed instruments as a static document updated only as are the physical objects. From each instrument entry, link to a specialized web resource that produces tables or graphs on demand.
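Here is a sketch of what that decoupling could look like to a processor, assuming a KML document that carries atom:content inside a Placemark. KML 2.2 defines no such child element, so the document below is hypothetical, as are the example.com URL, the gauge name, and the function name rich_content_links.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical document: a static placemark for an instrument, with its
# observations kept out of line and referenced by atom:content's src.
DOC = """<kml xmlns="http://www.opengis.net/kml/2.2"
     xmlns:atom="http://www.w3.org/2005/Atom">
  <Placemark>
    <name>River gauge 42</name>
    <atom:content type="text/html"
                  src="http://example.com/gauges/42/observations"/>
    <Point><coordinates>-105.08,40.58</coordinates></Point>
  </Placemark>
</kml>
"""


def rich_content_links(root):
    """Return (name, src, type) for each placemark with out-of-line content."""
    links = []
    for placemark in root.iter("{%s}Placemark" % KML_NS):
        content = placemark.find("{%s}content" % ATOM_NS)
        if content is not None and content.get("src"):
            name = placemark.findtext("{%s}name" % KML_NS)
            links.append((name, content.get("src"), content.get("type")))
    return links


links = rich_content_links(ET.fromstring(DOC))
```

A browser could draw the placemark from the static document right away and dereference the src URL only when a user opens the popup.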

Atom processors can also take advantage of the "src" attribute to improve performance perceived by a user. Fetch the feed, show users titles and summaries first, get rich content as the user asks for it -- or get the rich content in the background, or during lulls in user activity. Processors should also be able to exercise conditional GET to avoid reloading unchanged rich content.
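A conditional GET for such rich content might look roughly like this with Python's standard library. The function names are mine, and the sketch assumes the server sends ETag or Last-Modified validators with its responses.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from validators saved from an earlier response."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified), or None if the content is unchanged."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError.
        if err.code == 304:
            return None
        raise


# Headers only; nothing is fetched until fetch_if_changed is called.
headers = conditional_headers(etag='"abc123"')
```

A processor would store the validators alongside its cached copy and keep showing that copy whenever fetch_if_changed returns None.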

The atom:content element appears to support all of the requirements of, and overcome all of the faults of, kml:description. It fully embraces and accommodates rich content and provides new design options via links to dynamic or massive content. KML is already open to Atom elements (version 2.2 already includes atom:author, atom:link, and atom:name) and so atom:content is a more natural replacement for kml:description than RSS 1.0's content element or RSS 2.0's enclosure element. I'm not in a position to offer any advice to Google, but considering its investment in Atom, acting on the commonalities between KML and Atom seems useful and even profitable.

I've heard from KML folks that nothing pushes the spec forward more than working code. I'm not interested in writing a new geographic browser, but I can start serving KML enhanced with atom:content (which meets needs of a user that just cropped up on Monday), support it in my own KML processing packages, and perhaps even exercise my withering C++ skills on a libkml patch. After that? Let's swap in atom:summary for kml:Snippet. Snippet!

Comments

Re: kml:description considered harmful

Author: Matt Giger

I feel your pain, really I do. EarthBrowser follows the KML spec when it can, but in order for it to be useful, I've changed it where reasonable. For example, if the description element starts with 'http', it loads that page or image. Don't you love how Snippet is capitalized unlike all other nodes with no children? How about that "maxLines" attribute? I've heard that "working code" suggestion in response to my requests for improvements too; it's just an easy way to divert criticism. By handing KML over to the OGC, they got government agencies to adopt it, but they lost the ability to fix the broken parts. Sadly, I suspect that KML will die over time if it cannot be improved. I've been moving away from KML lately. It is very useful in many circumstances; however, it is too rigid, bloated and confusing to author with. I wrote on my blog a long time ago about how the structure of KML shows the innards of Google Earth, and it isn't very impressive. Looking at the new Google Earth browser plugin API, you can see that its internal structure remains static and ossified. GEarth, and by extension KML, is in desperate need of a re-write. I hope the OGC will grant Google permission to add some new features to its KML based data browser.

Re: kml:description considered harmful

Author: Bryan Lawrence

Oh please, please, someone from Google read this and do the right thing (TM)!

Re: kml:description considered harmful

Author: Sean

Thanks for the comments. Matt, is this something your EarthBrowser could support? I've read some statements by Google Earth folks that don't rule out development of non-standard features related to KML, but you've had more interaction with them than I. If you're pessimistic, maybe I should dial down my optimism.

Re: kml:description considered harmful

Author: mpd

OGC does not need to "grant" anyone permission to make changes to any standard. The OGC change request procedure is open to members and non-members alike, and so Sean can go ahead and submit this request. The OGC is then obliged to act on it. Google uses the same procedure, so you are now equals, although they do have the source code for Google Earth, of course.

Re: kml:description considered harmful

Author: Sean

Wow, that change request procedure is worse than a lashing. Is it so to discourage trivial requests? It might be less work to patch libkml. I downloaded the form and the empty (!) instructions, got started on it, but editing the KML Word file (!) will have to wait a week or so.

Re: kml:description considered harmful

Author: mpd

I didn't say that it was a good process, just an open one.