How do I get my data into MapServer? Often, new MapServer users have a large legacy dataset
that is not in a supported format. This might be a CSV file, a MySQL
database, or a proprietary binary format. An overlooked solution to this
problem involves WFS (Web Feature Service), the closest thing
MapServer has to a generic data source. A Web Feature Service
responds to requests for features with a XML document that contains
features and attributes encoded using GML, the Geography Markup
Language. There are a number of full-featured WFS, commercial
and open source, but it is remarkably easy to roll your own that can serve as a lightweight MapServer data source.
Let's look at the simple case of publishing via WFS the data in a test CSV file. Granted, this is not the most compelling case, but is easy to
reproduce and a good starting point. Consider a file with a header
row and six point feature rows:
"FID:int","FNAME:string","SCORE:float","X:float","Y:float"
1,"Point A",0.2,-110.5,44.3
2,"Point B",0.4,-106.7,37.9
3,"Point C",0.15,-109.5,41.6
4,"Point D",0.6,-108.1,42.5
5,"Point E",0.8,-107.4,39.8
6,"Point F",0.55,-108.6,38.2
Each feature has an integer ID, a name, a float type score attribute,
and, as is often the case with legacy or non-GIS data, coordinates
as separate x and y values.
I will render this into XML using ElementTree, a flexible module
for programming hierarchical data structures. The reading of data,
and creation and arrangement of elements will be handled by a CSVFeatureCollection class. A toxml() method will return a WFS
feature collection encoded as an XML string, and be used as the
core of a response to a GetFeature request. To keep the example as simple as possible, I will ignore all WFS request parameters; this cheap-o WFS returns every feature from the data source. Because
MapServer is a simple and forgiving WFS client, I can also neglect
the GetCapabilities and DescribeFeatureType responses
required of a full-featured WFS.
Since the WFS response makes much use of qualified names for element tags, I
first define two functions for convenience:
# begin module csvsource.py
import csv
import elementtree
from elementtree.ElementTree import Element, QName, tostring
# Convenience wrappers for the Element factory
# --------------------------------------------
def WFSElement(tag):
e = Element(tag)
e.tag = QName('http://www.opengis.net/wfs', e.tag)
return e
def GMLElement(tag):
e = Element(tag)
e.tag = QName('http://www.opengis.net/gml', e.tag)
return e
Create the root element, defining the required WFS and GML
namespaces, and creating the element which describes the bounding
box for features in the response. One might use mapscript or the
OGR Python module to compute the proper extents. In the interest of simplicity, I'm taking a shortcut and hard-coding the bounding box:
class CSVFeatureCollection:
"""Express the 'points.csv' file as a WFS Feature Collection"""
def toxml(self):
"""Return an XML string, sans prolog"""
# root element
e_root = WFSElement('FeatureCollection')
e_root.attrib['xmlns'] = 'http://localhost/myns'
e_root.attrib[QName('http://localhost/myns', 'wfs')] \
= 'http://www.opengis.net/wfs'
e_root.attrib[QName('http://localhost/myns', 'gml')] \
= 'http://www.opengis.net/gml'
# bounds
e_bounds = GMLElement('boundedBy')
e_box = GMLElement('Box')
e_box.attrib['srsName'] = 'EPSG:4326'
e_coords = GMLElement('coordinates')
e_coords.text = '-120,30 -100,50'
e_box.append(e_coords)
e_bounds.append(e_box)
e_root.append(e_bounds)
Next, open the data file and iterate over its rows, producing and
attaching new elements, and then returning the XML string:
# continued ...
# open data file and iterate over rows
f = file('points.csv', 'r')
reader = csv.reader(f)
headers = reader.next()
self._fields = [(h.split(':')) for h in headers]
for row in reader:
if not row:
break
e_memb = GMLElement('featureMember')
e_type = Element('points')
# member bounds
e_bounds = GMLElement('boundedBy')
e_box = GMLElement('Box')
e_box.attrib['srsName'] = 'EPSG:4326'
e_coords = GMLElement('coordinates')
e_coords.text = '%s,%s %s,%s' % (row[3], row[4], row[3], row[4])
e_box.append(e_coords)
e_bounds.append(e_box)
e_type.append(e_bounds)
# member properties
e_ptprop = GMLElement('pointProperty')
e_point = GMLElement('Point')
e_point.attrib['srsName'] = 'EPSG:4326'
e_coords = GMLElement('coordinates')
e_coords.text = '%s,%s' % (row[3], row[4])
e_point.append(e_coords)
e_ptprop.append(e_point)
e_type.append(e_ptprop)
# other props
for i in range(3):
e = Element(self._fields[i][0])
e.text = row[i]
e_type.append(e)
e_memb.append(e_type)
e_root.append(e_memb)
f.close()
return tostring(e_root)
This module could be adapted to a number of different data sources.
Instead of reading from a CSV file, one might make a database query,
or fetch and parse RSS files.
The result of executing:
import csvsource
csvsource.CSVFeatureCollection().toxml()
is:
<?xml version="1.0" ?>
<ns0:FeatureCollection xmlns="http://localhost/myns"
ns1:gml="http://www.opengis.net/gml"
ns1:wfs="http://www.opengis.net/wfs"
xmlns:wfs="http://www.opengis.net/wfs"
xmlns:ns1="http://localhost/myns">
<ns2:boundedBy xmlns:gml="http://www.opengis.net/gml">
<ns2:Box srsName="EPSG:4326">
<ns2:coordinates>-120,30 -100,50</ns2:coordinates>
</ns2:Box>
</ns2:boundedBy>
<ns2:featureMember xmlns:gml="http://www.opengis.net/gml">
<points>
<ns2:boundedBy>
<ns2:Box srsName="EPSG:4326">
<ns2:coordinates>-110.5,44.3 -110.5,44.3</ns2:coordinates>
</ns2:Box>
</ns2:boundedBy>
<ns2:pointProperty>
<ns2:Point srsName="EPSG:4326">
<ns2:coordinates>-110.5,44.3</ns2:coordinates>
</ns2:Point>
</ns2:pointProperty>
<FID>1</FID>
<FNAME>Point A</FNAME>
<SCORE>0.2</SCORE>
</points>
</ns2:featureMember>
...
</ns0:FeatureCollection>
This is close, but it is not yet going to work with MapServer. The XML
Namespace specification states that applications should use the
element namespace URL not the tag prefix, but this detail is
handled improperly. MapServer looks for tags with 'wfs' and 'gml'
prefixes as it parses the response and does not recognize the prefixes
generated by our root element's writer.
The solution, easy to do with Python, is to monkey-patch the ElementTree module. I simply add some namespace and prefix
mappings to the module's well-known namespaces immediately
after it is imported at the top of csvstore.py:
import csv
import elementtree
from elementtree.ElementTree import Element, QName, tostring
# Monkey Patch adds to the default well known namespaces
elementtree.ElementTree._namespace_map.update({
'http://www.opengis.net/wfs': 'wfs',
'http://www.opengis.net/gml': 'gml'})
This gives us what MapServer expects:
<?xml version="1.0" ?>
<wfs:FeatureCollection xmlns="http://localhost/myns"
ns1:gml="http://www.opengis.net/gml"
ns1:wfs="http://www.opengis.net/wfs"
xmlns:wfs="http://www.opengis.net/wfs"
xmlns:ns1="http://localhost/myns">
<gml:boundedBy xmlns:gml="http://www.opengis.net/gml">
<gml:Box srsName="EPSG:4326">
<gml:coordinates>-120,30 -100,50</gml:coordinates>
</gml:Box>
</gml:boundedBy>
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
<points>
<gml:boundedBy>
<gml:Box srsName="EPSG:4326">
<gml:coordinates>-110.5,44.3 -110.5,44.3</gml:coordinates>
</gml:Box>
</gml:boundedBy>
<gml:pointProperty>
<gml:Point srsName="EPSG:4326">
<gml:coordinates>-110.5,44.3</gml:coordinates>
</gml:Point>
</gml:pointProperty>
<FID>1</FID>
<FNAME>Point A</FNAME>
<SCORE>0.2</SCORE>
</points>
</gml:featureMember>
...
</wfs:FeatureCollection>
All that's left is a means to serve this data. One might use a Python
CGI script and Apache, or use the code above in a Zope external
method that is served to the web via Python Script. Instead, for extra
flavor, I'm going to use twisted.web and write a Python resource
script to answer GetFeatures requests:
# begin features.rpy
from twisted.web import server
from twisted.web.resource import Resource
from twisted.internet import reactor, defer
import csvstore
class Features(Resource):
def get_data(self):
d = defer.Deferred()
reactor.callLater(0.25, d.callback,
csvstore.CSVFeatureCollection().toxml())
return d
def render_GET(self, request):
"""a default method called when the object is published"""
request.setHeader('Content-type', 'text/xml')
request.write('<?xml version="1.0" ?>')
d = self.get_data()
d.addCallback(lambda x:(request.write(x+'\n'), request.finish()))
d.addErrback(lambda x:(request.write(x), request.finish()))
return server.NOT_DONE_YET
render_POST = render_GET
resource = Features()
The entire WFS and data consists of three files: points.csv, csvstore.py, and features.rpy, in a directory such as '/var/www/cheapowfs'. It can be put online using Twisted's
mktap and twistd utilities:
mktap web -p 8080 --path /var/www/cheapowfs
twistd -f /var/www/cheapowfs/web.tap
All files: cheapowfs.tar.gz
This WFS can be accessed by MapServer using a layer definition
like the one below:
LAYER
NAME "cheapo_wfs_points"
TYPE POINT
STATUS DEFAULT
CONNECTIONTYPE WFS
CONNECTION "http://localhost:8080/features.rpy?"
METADATA
"wfs_service" "WFS"
"wfs_typename" "points"
"wfs_version" "1.0.0"
END
LABELITEM "FNAME"
PROJECTION "+init=epsg:4326" END
END
Hobu and I will be expanding on this example in our Python workshop
this summer, and I'll be back to it with a few more interesting examples
in the next few weeks.
Comments
Re: Discount WFS Source for MapServer
Author: Myunghwa Hwang
Sean, would you mind letting me know about writing a general python cgi script which can serve as features.rpy? This example is what I looked for, except features.rpy. from Myunghwa HwangRe: Discount WFS Source for MapServer
Author: Sean
The quick and dirty CGI would be something like this: ... print 'Content-type: text/html\n\n' print '' print csvstore.CSVFeatureCollection().toxml()