For 2011

A few of my goals, not resolutions, for 2011:

  • Improve my Vim skills.

  • Brush up on JavaScript programming. Don't get left behind on the server side.

  • Make some headway on a framework for analysis and inference of fuzzy and relative locations. Apply to Pleiades in particular.

  • Gain more experience with NLTK and text corpora and computational linguistics in general. Apply to Pleiades in particular.

  • Write more articles, particularly about the previous two goals.

  • Spend more time outside. I'm happier outside.

  • Introduce my kids to the Colorado Plateau.

  • Resume making beer at home. I've made almost 50 batches of beer, but none since 2004.

  • Maintain my French skills, mainly through music, movies, novels, bande dessinée; letters to friends, and to our French bank; and learning about tax refunds.

  • Dabble in charcuterie. Colorado is flowing over with good beer, but not so much with cured meat unless you count beef jerky.

  • Get more involved in the Fort Collins community.

Comments

Re: For 2011

Author: Chris

Great goals Sean - I particularly like the vim, javascript, and Colorado Plateau ones!

Have a good year, I'm sure we'll drink some beer at some point this year.

Chris

Re: For 2011

Author: Bill

Great goals. As for the last one, you'll find that kids have a way of leading you into that. Enjoy your upcoming year!

Bill

Re: For 2011

Author: Norman

As for charcuterie, if you need your fix of saucisson, then Longmont Cheese Importers has the lot. I get my fix of English things from there (e.g. Plum Pudding) when we need to :) NLTK looks interesting.

Re: For 2011

Author: Gaurav

Hey

I keep visiting your blog as my research is close to what you are doing... I was happy to see that there is natural language processing (NLTK) involved in your work... I am also working with NLP tools, and there are a few good ones available as well: GATE and Stanford CoreNLP can also be used...

Best wishes and Good luck for your goals!

Gaurav

Re: For 2011

Author: Adrià

As for the BD side, I'm sure you may already know them, but let me recommend Christophe Blain (last one is brilliant: http://goo.gl/5uhoO) and Joann Sfar (http://goo.gl/X07r9). I always enjoy their work.

Re: For 2011

Author: Sean

Thanks for the recommendations, Adrià. I recently read Sfar's "Chasseur-Cueilleur" and have been meaning to start on another series.

Versioning in Pleiades scripts

In preparing for a few rounds of large Pleiades coordinate updates in the next few weeks, I've been rediscovering zopectl's run command. It lets you run a script in the context of your database:

$ zopectl run script.py

much like psql (for example) lets you run a script on your Postgres database:

$ psql -f script.sql

The main differences are that with zopectl your script is Python, not SQL, and that the data is an arbitrarily large and nested mapping bound to the name app instead of tables. Imagine your dataset is a giant JSON structure: that covers much of what developing with the ZODB (Zope object database) is like. The ZODB is stored as an append-only B-Tree and enjoys many of the benefits explained in the CouchDB Guide. The essential similarities and differences between ZODB and CouchDB were summarized two years ago by Chris McDonough here.
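To make the "giant JSON structure" analogy concrete, here is a minimal sketch that walks a nested mapping the way a script might walk the object tree under app. It uses plain dicts as a stand-in, not a real ZODB, and the keys ('places', 'locations', and so on) and the walk() helper are made up for illustration.

```python
# A stand-in for the object tree that zopectl binds to the name `app`.
# Plain dicts only; the keys and values here are hypothetical.
app = {
    "plone": {
        "places": {
            "1": {"title": "Place one", "locations": {"a": {"coords": [27.3, 37.9]}}},
            "2": {"title": "Place two", "locations": {}},
        },
    },
}


def walk(node, path=()):
    """Yield (path, value) for every leaf in a nested mapping."""
    for key, value in node.items():
        if isinstance(value, dict):
            # Containers are traversed, much like folders in the object tree.
            yield from walk(value, path + (key,))
        else:
            yield path + (key,), value


leaves = dict(walk(app))
site = app["plone"]  # the same lookup style a zopectl script would use
```

In a real zopectl script the containers are persistent objects rather than dicts, but the habit of reaching into the tree by key carries over.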

A key feature of Pleiades is resource versioning. There's a repository tool in which we track changes to places, names, and locations; it even provides diffs between versions. I found out before the holiday that a Plone developer (Pleiades uses Plone 3.2, and your mileage may vary) has to take a few extra steps to set up supporting utilities for the repository tool in the case of an offline script.

from Products.CMFCore.utils import getToolByName
from Products.CMFUid.interfaces import (
    IUniqueIdGenerator, IUniqueIdAnnotationManagement, IUniqueIdHandler)

from zope.component import provideUtility

def setup_cmfuid(site):
    provideUtility(
        getToolByName(site, 'portal_uidgenerator'), IUniqueIdGenerator)
    provideUtility(
        getToolByName(site, 'portal_uidannotation'),
        IUniqueIdAnnotationManagement)
    provideUtility(
        getToolByName(site, 'portal_uidhandler'), IUniqueIdHandler)

Now I call this function in my script and the repository tool is ready for use. The login() function makes sure that all changes are made as the specified user (me, in this case) and bulk_update_locations() updates the coordinates from a number of CSV records within a single transaction, annotating each version change with a message.

from pleiades.bulkup import login, setup_cmfuid

site = app['plone']
setup_cmfuid(site)
login(site, user)
bulk_update_locations(site, reader, columns, message)
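The shape of a CSV-driven bulk update that either applies completely or not at all can be sketched without Plone. This is not the Pleiades bulk_update_locations(); the bulk_update() function, the dict-based store, and the column names id, lon, and lat are all assumptions made for illustration.

```python
import copy
import csv
import io


def bulk_update(store, reader, message):
    """Apply every CSV row to a copy of `store`, all or nothing.

    Hypothetical stand-in: `store` maps an id to a dict with 'coords' and
    a 'history' list of version messages.
    """
    working = copy.deepcopy(store)
    for row in reader:
        record = working[row["id"]]  # an unknown id raises KeyError and aborts the batch
        record["coords"] = (float(row["lon"]), float(row["lat"]))
        record["history"].append(message)
    # Only reached if every row applied cleanly: "commit" by returning the copy.
    return working


store = {"1": {"coords": None, "history": []}, "2": {"coords": None, "history": []}}
data = io.StringIO("id,lon,lat\n1,27.34,37.94\n2,23.72,37.97\n")
updated = bulk_update(store, csv.DictReader(data), "Coordinates from CSV batch")
```

Working on a copy and handing it back only on success is a crude substitute for a real transaction, but it shows why one bad record should leave the original data untouched.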

Mikko Ohtamaa's Command-line interaction and scripting helped me learn how to set the effective user (or login), but didn't explain why the 3 Plone tools in my case weren't being automatically registered.

In which I try to help more and complain less

Update (2011-01-12): Better: http://sgillies.net/blog/1065/thats-more-like-it

A reader suggested to me in comments that I should approach the words on the FOSS4G 2011 site (which I quoted previously) from a different angle and try to appreciate that they're aimed at folks outside the FOSS4G community: speaking to them using their own language including, apparently, their own unflattering stereotypes of open source users in an anything-goes effort to lure them in to the event. I suppose this makes some twisted sense if you ignore an eventuality: that this corporate audience will at some point discover themselves fully surrounded by these very same wild-eyed open source hippy freetard philosophers. But never mind that for now, I'm fully on board with the FOSS4G rhetoric and have some other stereotypes that the marketers are free to use or repurpose if the original

Many early adopters of FOSS solutions chose them based on "philosophical" reasons, but ...

starts to wear thin. How about this one, which uses unnecessary quotes to maintain the right style?

Many early adopters of FOSS solutions were into GIS as a "fun" hobby, but ...

That will have strong appeal for certified professionals.

Many early adopters of FOSS solutions felt "everything" should be free, but ...

The conference is coming back to the USA, and nothing is more American than using Communists as bogeymen.

Many early adopters of FOSS ranted about the "difference" between freedom and free beer, but ...

Always with the damn philosophizing, those freetards.

Many early adopters of FOSS were from "academic" backgrounds, but ...

I almost forgot the Ivory Tower! Nobody is less pragmatic than an academic researcher, right? Hope this helps, and Merry Christmas.

Copywriting run amuck?

Bullshit.

Recent years have seen substantial changes in the geospatial industry. One of those changes has been the growth in maturity and adoption of free and open source solutions. Many early adopters of FOSS solutions chose them based on "philosophical" reasons, but increasingly large enterprises and government organizations are choosing these solutions for pragmatic business reasons. In many cases organizations are using a mixture of open and closed source solutions.

Emphasis above is mine. What the hell is up with the scare quotes and the impractical, ideology-driven, early adopter strawman? If you were there in the early FOSS4G days, you'll remember that it was in fact about pragmatic solutions: proprietary software often lacked features that we needed to do our jobs or was riddled with bugs that we suspected wouldn't be fixed any time soon. The need to get shit done was the driving force behind GDAL, MapServer, PostGIS, and friends. Driving a stake through the heart of ArcIMS was just gravy.

Update (2011-01-12): Better: http://sgillies.net/blog/1065/thats-more-like-it

Comments

Re: Copywriting run amuck?

Author: Karsten Vennemann

Seems that the quote above is straight from the foss4g 2011 home page...

Certainly a driving force behind the OS GIS solutions was getting things done, but functionality comes, matures, and is added on over time, as we all know ;). In the small GIS consulting business that I run, I have seen a huge and growing client demand for OS web-based solutions unfold over the last 4 years. OS web-based solutions were already competitive over the 5 years that I have been using them. With OS desktop GIS, however, that increasingly became true over the last 2 years, when e.g. QGIS and gvSIG became so powerful that they can actually be used to do most of what I was previously doing in ArcGIS (namely fine cartography and labeling).

Also, it's not all about the technological solution (and what it can do) but about information about the solution, about spreading the word, and about the process by which this information finally arrives (trickles through) at decision makers' desks (a long way). That is, in other words (due to successful foss4g), 'marketing' (yuk?).

On the foss4g web site I see this quote also as a description of a perception, and I don't see anything wrong with that. I think the home page is also largely about spreading the word to non-typical OS GIS users (yes, not only the geeky OS developer types that come to foss4g anyway, thank god!) to get them to come to FOSS4G. Personally, I would love to see more regular GIS users (vs developer types) attend foss4g, and especially lots of newcomers (yes, ESRI-type GIS users) join this event!

Note: I am not involved in creation of the home page at all ;)

Re: Copywriting run amuck?

Author: Sean

Karsten, I am not disputing the undeniable "growth in maturity and adoption". I'm disputing the characterization of early adopters -- the people who put PostGIS (for example) into production long before it hit 1.0, who found and saw closed all the early show-stopping bugs, who supported each other before there were books or adequate documentation, who raised the damn FOSS4G barn -- as unpragmatic has-beens. The mainstream owes the early adopters respect, if not thanks.

Re: Copywriting run amuck?

Author: bk

Tuff in the corners, sweet hands in front of the net.

Gillies is the Chardonnay of Men..

Re: Copywriting run amuck?

Author: Sean

Sadly, I am not, to my knowledge, related to Clark Gillies.

Re: Copywriting run amuck?

Author: Paul Ramsey

Agreed. Unless "unafraid of change" or "open to new approaches" count as "philosophical" reasons. Sure, it takes a different mindset than your bog standard IT manager might have, but ideology doesn't enter into it.

Re: Copywriting run amuck?

Author: Daniel Morissette

This sentence on the FOSS4G 2011 page also hurt my eye. I consider myself one of those early adopters but as a consultant my motivations have always been and still are mostly pragmatic and not philosophical.

I don't think this description matches many FOSS4G user organizations either (not our own clients anyway, not even those from the early days). All our clients care about is that the tool gets the job done. Very few care about the "philosophical" reasons and it's actually the contrary: if you want to scare a potential client away then start giving them a "philosophical" lecture.

XML vs the "GeoWeb"

James Clark, Technical Lead of the XML working group and author of expat, recently wrote an interesting post about XML, JSON, and web development titled "XML vs the Web":

So what's the way forward? I think the Web community has spoken, and it's clear that what it wants is HTML5, JavaScript and JSON. XML isn't going away but I see it being less and less a Web technology; it won't be something that you send over the wire on the public Web, but just one of many technologies that are used on the server to manage and generate what you do send over the wire.

This isn't JSON triumphalism; Clark has well-expressed mixed feelings about the turn.

If our pre-WhereCamp5280 meetup is indicative of how cutting edge developers are evaluating and selecting technology, JSON is definitely the future of sending features to geospatial apps running in the desktop, laptop, or mobile browser. But there's more to the web than browsers. I get a little concerned sometimes that going all in for JSON and HTML5 tech like WebSockets takes us even further off the web as integration platform than we were with GML and W*S.

Comments

Re: XML vs the "GeoWeb"

Author: Nelson Minar

I think the key issue here is mixed content. If you just want to pass some data around, lists of numbers and the like, JSON is obviously a better choice. I think most geo applications are more like data than documents. Certainly XML offers nothing to help work with raster and vector data. Put another way: the GeoWeb is about Javascript and HTTP, but very little about XML or HTML.

Re: XML vs the "GeoWeb"

Author: Peter Rushforth

JSON is a great data transmission format, but is it hypertext? At its core, the (geo)web is about hypertext. I'm not sure that mixed content is a big issue: HTML does a fine job at that. In a geoweb context, HTML does not (yet) measure up, and I haven't seen other formats stepping up to that challenge either, with the possible exception of Geo Atom. The only problem with Geo Atom is that the geo part is badly fragmented with de facto, de jure, and ad hoc encodings -sigh-.

In the end a lot of information exchange today has to be optimized for small devices and narrow bandwidth (maybe that will always be the case). Maybe XML isn't the best tech for that.

Yet, just as DC didn't transmit well, it is still important inside devices.

Post-GIS Day 2010

Tomorrow, November 18, is the date for the pre-WhereCamp5280 open source dynamic language geospatial programming hackfest/unworkshop. Please do sign in (or out if you must) here – we'll get to see more of Chris if he doesn't have to scramble to find a bigger room tomorrow morning. It's fun that this ended up being scheduled on the day after "GIS Day"; a bunch of us are definitely going to be working on what comes next after GIS.

For extra adventure, I'm going to attempt something I've never done before: travel from downtown Fort Collins to downtown Denver using only my legs and public transportation. In theory, Transfort/Flex and RTD connect in Longmont, and I've received some confirmation of this and other good tips from Front Range riders. We often went by car last year in France, but also did a lot of inter-city train trips and bus trips around the Montpellier Agglomeration. Other than riding to DIA, I've never done this kind of bus trip in Colorado. It will be interesting to compare the experiences.

Comments

Re: Post-GIS Day 2010

Author: Michael

Years ago I noticed that it would be possible to get from Vancouver to Seattle entirely on public transit. If I remember correctly, the trip required 3 or 4 transit agencies and about 6 hours, and one of the transfers required walking across the US border, but everything did connect up for less than $10 one way. Maybe I should actually try it one of these days.

Re: Post-GIS Day 2010

Author: Sean

The trip turned out to be pretty painless, if long. The Flex is a standard city bus and gets one to downtown Longmont for $1.25. There were 5 of us on the southbound morning bus. The RTD LX to Denver was a coach and full when we left Longmont. $4.50 for that ride. $5.75 and roughly 3 hours each way. Cheaper than solo driving and parking, but too time consuming for a one-day trip. I'd never do this to take my kids to the Natural History Museum, for example.

Explaining Pleiades

I tried a new approach in explaining Pleiades last week:

Are you familiar with "Google Places"?

Yeah.

Pleiades is like that for the ancient world.

Cool.

The analogy is only a little bit bogus, and I'm going to try to explain that now. Google's Places (and Facebook's and Foursquare's) are about linking the concept of being at a store, or venue, or landmark with advertisements, reviews, deals, reservations, media, nearby places, resources (including maps) that reference them – largely for commercial purposes. Resources that link to other resources. Pleiades places, on the other hand, are about linking the concept of studying an ancient city, road, or region with nearby places and references to supporting articles, books, maps, digital editions of inscriptions, and archaeological GIS. These are scholarly, not commercial nodes in the web.

That's not to say that there is no data in commercial location sites worthy of research or that there's nothing in them worth emulating. While I'm not personally interested in "playing" Foursquare, Gowalla, or the like, I think we'd be completely off track if Pleiades didn't help make something like a visualization of Alexander checking in, ousting mayors, and gaining badges across Asia trivially possible.

What's an Un-GIS?

In an abstract just submitted to Digital Humanities 2011, I labeled Pleiades an "Un-GIS". I feel it's important for users and watchers in the humanities, which is going gangbusters for GIS technology, to understand the differences between Pleiades and an ESRI geodatabase, an OGC-style feature/map service, or a conventional digital gazetteer. I don't think it's useful to try to precisely define "Un-GIS", but here are a few qualities that I think distinguish Pleiades from a typical geographic information system or spatial data infrastructure:

  • Aggregation of temporally varying features into conceptual places or spaces that reflect ancient practice or modern scholarly method.

  • Rich toponyms with ancient spellings, transcription details, temporal scope, and links to primary sources and scholarly literature.

  • Identification and representation of geographic features that have no known locations, or that can be located only vaguely, roughly, or in relationship to each other.

  • Embrace of the uneven distribution in quality and density of data that is inherent in ancient world studies.

  • Embrace of web architecture.

Pleiades distinguishes between place and space [1]. In writing "Ephesus demonstrates the potential complexity of ancient Mediterranean urban centers" (an example from the DigitalClassicist wiki), a scholar would not be referring to the coordinates of the footprint of Ephesus, but to a historical entity and also the body of work of which it is the subject.

Primacy of space might be the defining characteristic of GIS, but it's the names of places, not their coordinates, that occur in ancient texts and inscriptions. We model toponyms carefully so that we'll be able to serve researchers mining ancient texts for new insights into ancient geography. We're even going to keep track of what the Barrington Atlas calls "false toponyms": place names attested to in ancient or modern works that are now considered to be erroneous. These include names from Avienus' "Ora Maritima" that are regarded as not just wrong, but fictitious [2].

More common than these false toponyms are names for unlocated places such as Kritalla, the marshalling point of Xerxes' army [3]. More common yet are places with fuzzy or non-determinable boundaries like the territory of the Salluvii or the Aegean Sea. Pleiades can identify and represent places for which boundary lines would be misleading. The boundaries of the Roman province of Aegyptus shown on map sheet 100 of the Barrington Atlas, for example, are clearly noted by the editors as rough and approximate [4].

The compilers of GIS datasets (a population that once included me) usually aspire to uniform density and quality of data, and for good reason. The ancient world, however, doesn't give up its secrets like that. Data about it is spotty. Some places are truly lost, some are less accessible. Scholars, too, choose the places that interest them whether they fill in the gaps or not. Pleiades embraces the inherently unfinished nature of ancient studies; instead of waiting on precise coordinates from partners, we're rolling out places with approximate locations that will be refined, live, as we get better locations. I think we're lucky to be doing this now after Wikipedia and OpenStreetMap have completely changed the nature of content and geodata creation – not just in opening it up to non-experts, but in freeing data to be improved incrementally.

Throughout this post, I've provided links to Pleiades using the URIs of places. The representations at the other end themselves carry links to names and locations and neighboring places, and soon (we hope) to digital editions of ancient and modern works served from other domains. We take GIS seriously in Pleiades, but whenever there is a question of "what is the GIS approach" vs "what is the web approach" to a problem, we go with the latter.

Most of all, I offer "Un-GIS" as a starting point for interesting discussions at the conference about how GIS technology and methods do and don't directly apply to historical geography.

Comments

Re: What's an Un-GIS?

Author: Jeff Thurston

This is interesting.

I agree. GIS are either raster or vector model based and depend upon the geo-referenced locations of geometry or pixels. This is a large area for study that GIScience has talked about, particularly human geographers who attempt to convey the full range of human behavior that does not neatly fall to a vector.

Eskimos and natives fall into this group, often traveling long distances guided by inukshuks or landscape forms or even memories of past events (places where births, deaths, marriages happened). Farmers often know places upon their land by soil types. Foresters are guided by large trees or even places where wildlife congregate.

GIS related knowledge is like an iceberg and I would reckon that most of the wealth of what we do not know about (4/5ths) falls into the category you are defining - which is why it is so valuable.

Re: What's an Un-GIS?

Author: Gretchen

Interesting. I've always thought there was an absence of ability to deal with "fuzzy features" - as in features that are not exactly defined in geographic space but generally or relationally defined. Though at the same time I wondered if that would still qualify as GIS. Your un-GIS term captures that dilemma. An issue I've run into (and perhaps you've already touched on this) is dealing with ancient fonts - or rather, special characters in native written language.

Re: What's an Un-GIS?

Author: Sean

Thanks for the comments. What I'm writing about does verge on vernacular GIS, but it's mostly about being able to support careful, precise investigation of imprecise geographies. Contrary to a comment I received via Twitter, I think this approach differs from the mainly scale-free and highly personal "Neogeography", where a point is good enough for just about any event's location.

Gretchen, the situation for ancient map labeling is getting better all the time. A bigger problem for us is incomplete and varying form of place names; our ancestors were just as bad at spelling as we are and have sometimes been less than careful with their cultural heritage.

Re: What's an Un-GIS?

Author: Tim Hitchcock

This is a nice post, and gets at a real issue. But I wonder if you aren't downplaying the extent to which the inflexible nature of GIS is being used self-consciously by humanists precisely in order to co-ordinate abstract ideas and categories that otherwise resist definition. A point or a polygon (however fuzzy at the edges) allows you to relate two absurdly different things - 'ego' and 'roast pork' (perhaps through the nearest identifiable place, or the place of creation, or one of several other possibilities). The nice thing is that geo-referencing would give one answer to the relationship, which could then be compared to measures of the relationship drawn from other disciplines like quantitative linguistics (precise measures of distance, for instance, or the nature of textual context).

I believe a lot of humanists are turning to GIS in frustration at the working out of the linguistic turn; and while adding an Un to the front may highlight the problems and distance you from the physical geographers, it might also underplay the opportunities.

Un-GIS, naming, and power

Author: Shane Landrum

I'm no classicist, but I think some of the principles you're describing have much broader applications.

  • Aggregation of temporally varying features into conceptual places or spaces that reflect ancient practice or modern scholarly method.
  • Rich toponyms with ancient spellings, transcription details, temporal scope, and links to primary sources and scholarly literature.
  • Identification and representation of geographic features that have no known locations, or that can be located only vaguely, roughly, or in relationship to each other.

As an Americanist historian, these features are what I need to map, for example, a set of letters from a particular rural crossroads/hamlet/ghost-town which no longer exists. Some of my colleagues would use them to map conflicting place-naming systems and sovereignty claims between indigenous people and European settlers.

GIS lets us map things we can claim certain and precise knowledge about, but the farther away one gets from a kind of hard-positivist thinking about space, the fewer uses I can see for it. There are large, important historical subfields built around critical approaches to conflicting/silent/missing sources, and I'd really like to see mapping software which doesn't grate against those subfields' central insights.

RFC 5988

Web Linking is RFC 5988. Via Assaf Arkin's Rounded Corners 258 I came upon an example (for Firefox and Opera) that assigns a stylesheet using .htaccess. This could also be used to turn "dumb" image files into hypermedia; a map tile JPEG could link (in its headers) to metadata, to neighboring tiles, to data for the features depicted within.

Update (2010-11-04): more from Erik Wilde.
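As a rough sketch of the map tile idea, here is how a server might assemble a Link header value for one tile. The URL layout, the tile_links() and link_header() helpers, and the east/west relation URIs are hypothetical; "describedby" is a registered relation type, while the neighbor relations are written as URIs because RFC 5988 requires extension relation types to be URIs.

```python
def link_header(links):
    """Format (target, rel) pairs as an RFC 5988 Link header value."""
    return ", ".join('<%s>; rel="%s"' % (target, rel) for target, rel in links)


def tile_links(base, z, x, y):
    """Links for one tile: its metadata plus east and west neighbors.

    The URL layout and the extension relation URIs are hypothetical, and
    wrapping at the edges of the tile grid is ignored.
    """
    rels = base + "/rels/"
    return [
        ("%s/%d/%d/%d.json" % (base, z, x, y), "describedby"),
        ("%s/%d/%d/%d.jpg" % (base, z, x + 1, y), rels + "east"),
        ("%s/%d/%d/%d.jpg" % (base, z, x - 1, y), rels + "west"),
    ]


header = link_header(tile_links("http://example.com/tiles", 3, 4, 2))
```

The resulting string would be sent as the value of a Link response header alongside the JPEG, leaving the image bytes themselves untouched.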

An afternoon of hacking before WhereCamp5280

Geospatial programmers inclined toward interpreted languages and open source don't get together enough on the Front Range. Please help Chris Helms, Tyler Erickson, and me correct this unfortunate state of affairs on November 18 at the once Colorado Brewery, now Nationally Registered Historical Place known as CU Denver's Tivoli Student Center. WhereCamp once piggybacked on Where 2.0 and now we're hooky bobbing on WhereCamp5280. See the announcement by Chris on the FRUGOS list for details and spread the word.

Update (2010-10-26): Please sign up on the wiki if you're planning to attend.