Mark Logic XML Server Adds Geo Support
Geospatial is seeing a bit of an invasion. A few weeks ago Netezza and Teradata, two data warehouse appliance companies, added geospatial support to their offerings. These sorts of appliances were new to me and, per our recent poll, new to many of you. Today another type of company joins the invasion: Mark Logic has added geospatial support to its XML Server (press release). So, just as last time I had to ask a seemingly naive question, "What’s a data warehouse appliance?", this time I have to ask, "What’s an XML Server?" An XML server is a specialized database for XML that’s "wired" like a search engine. That’s how Mark Logic’s Chief Technologist Chris Biow explained it to me yesterday. On top of the database is a layer that makes it easy to expose Web (and other) services.
In practice, you load your XML data directly. Non-XML data (PowerPoint, spreadsheets, PDFs, etc.) is loaded into the database via converters. Note that as of Microsoft Office 2007, all documents are stored in XML natively, so there’s no conversion needed for those. The database creates appropriate indices and you can query it simply and quickly. The "big deal" here is that you can query "pieces" of a document, such as a single graphic, rather than a whole PPT as one unit. That’s what XML is about, in part: breaking down data into smaller, tagged bits. Now, instead of good old SQL queries, with an XML database you use XQuery, the XML query language.
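To make the idea of querying "pieces" of a document concrete, here is a minimal sketch in Python. It is not XQuery and not Mark Logic's API; it uses the limited XPath support in Python's standard library, and the XML layout (presentation, slide, graphic) is invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML for a small presentation. The element and attribute
# names are made up for this example, not taken from any real format.
DOC = """
<presentation title="Q3 Review">
  <slide n="1">
    <title>Sales by Region</title>
    <graphic id="g1" kind="map">Regional sales map</graphic>
  </slide>
  <slide n="2">
    <title>Forecast</title>
    <graphic id="g2" kind="chart">Forecast chart</graphic>
    <note>Numbers are preliminary</note>
  </slide>
</presentation>
"""


def find_graphics(xml_text, kind=None):
    """Return (id, description) pairs for graphics, optionally filtered by kind."""
    root = ET.fromstring(xml_text.strip())
    # Select only <graphic> elements, wherever they sit in the document,
    # instead of retrieving the whole presentation as one unit.
    path = ".//graphic" if kind is None else f".//graphic[@kind='{kind}']"
    return [(g.get("id"), g.text) for g in root.findall(path)]


maps = find_graphics(DOC, kind="map")
all_graphics = find_graphics(DOC)
print(maps)
print(all_graphics)
```

The point is only that the unit of retrieval is the tagged element, not the file. An XML server would answer the same kind of question from its indices rather than by parsing each document at query time.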
So far, so good? Now, just as database vendors added spatial support to their relational databases, the folks at Mark Logic have added, in response to demand from existing customers, spatial support to their XML server. The good news? There’s lots of XML-formatted geodata available today: GML, KML, GeoRSS, etc. Mark Logic supports 20 geo XML formats and can ingest pretty much any geodata. Moreover, if your documents or their content are not geotagged yet, the company can provide tools to do simple tagging with a gazetteer, or work with a company like MetaCarta to do the tough work of teasing out the meaning of natural language descriptions of location ("five miles east of Paris").
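For readers who have not met the term, simple gazetteer tagging amounts to looking up known place names in text and attaching coordinates. The sketch below is a toy illustration in Python, not Mark Logic's or MetaCarta's tooling; the three-entry gazetteer and its approximate coordinates are assumptions made for the example.

```python
import re

# Hypothetical gazetteer: place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "Paris": (48.8566, 2.3522),
    "Denver": (39.7392, -104.9903),
    "New York": (40.7128, -74.0060),
}


def geotag(text, gazetteer=GAZETTEER):
    """Return (offset, name, lat, lon) for each gazetteer name found in text."""
    tags = []
    for name, (lat, lon) in gazetteer.items():
        # Match whole words only, so "Paris" does not match "Parisian".
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            tags.append((match.start(), name, lat, lon))
    return sorted(tags)  # order tags by position in the text


tags = geotag("The convoy stopped five miles east of Paris before flying to New York.")
for offset, name, lat, lon in tags:
    print(offset, name, lat, lon)
```

Note that this tags "Paris" itself and ignores the "five miles east of" qualifier. Resolving that kind of relative description is the harder natural language problem mentioned above.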
With geodata loaded, combined content and geospatial queries can be run: "Find all the businesses within a ten-mile buffer around this route," for example. And because the data is stored in XML and the database is wired for search, results can be quite speedy. One client had a relational spatial database solution with 24 schemas and it was "breaking." A query took 8-12 minutes. Once the data was loaded into Mark Logic, the time required dropped to 6 seconds. Now, since this is basically a database with an API, you can put whatever visualization front end you want on top. Mark Logic is partial to XML output, and so we saw results in Google Earth and Google Maps, but any client could be used.
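As a rough illustration of what a route buffer query computes, here is a brute-force sketch in Python. It is not how Mark Logic evaluates such a query (an indexed server would avoid scanning every record); the business names, coordinates, and route are invented, and the flat-earth projection is an approximation that is only reasonable over short distances.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def to_local_xy(lat, lon, lat0, lon0):
    """Project lat/lon to approximate planar miles around a reference point."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_MILES
    y = math.radians(lat - lat0) * EARTH_RADIUS_MILES
    return x, y


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(businesses, route, buffer_miles):
    """Names of businesses within buffer_miles of a route of two or more points."""
    lat0, lon0 = route[0]
    route_xy = [to_local_xy(lat, lon, lat0, lon0) for lat, lon in route]
    hits = []
    for name, lat, lon in businesses:
        p = to_local_xy(lat, lon, lat0, lon0)
        nearest = min(point_segment_distance(p, a, b)
                      for a, b in zip(route_xy, route_xy[1:]))
        if nearest <= buffer_miles:
            hits.append(name)
    return hits


# Hypothetical data: an east-west route and four made-up businesses.
route = [(40.0, -105.0), (40.0, -104.0)]
businesses = [
    ("Cafe", 40.05, -104.5),    # a few miles north of the route
    ("Depot", 40.5, -104.5),    # roughly 35 miles north
    ("Garage", 40.0, -103.9),   # a few miles past the eastern end
    ("Mill", 39.8, -105.3),     # well southwest of the start
]
hits = within_buffer(businesses, route, buffer_miles=10)
print(hits)
```

A production system would pair a test like this with the content side of the query (for example, restricting to documents tagged as businesses) and answer both from indices.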
Directions Media is working with Mark Logic on an upcoming Webinar, so please join us to get more information.
