Structured vs. Unstructured Data - Let’s be clear on definitions
We are going to be bandying about the words "structured" and "unstructured" data more often than in the past, and while at the MetaCarta Public User’s Group Meeting today I wondered how many readers are familiar with the terms. Many obviously are, especially regular readers of APB and of our more technical articles at Directions Magazine. But let’s be really clear and basic about the definitions:
Structured data: For example, New York is often represented on a map as a point feature and in a digital database as a latitude and longitude stored in a row and column. In other words, we assume that the representation is explicitly defined as a geographic primitive.
Unstructured data: "New York" is more often written about in newspapers, email, XML RSS feeds, Word documents, etc. How do we know if the written word is referring to the city in New York state or a street in Redlands, California?
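To make the contrast concrete, here is a minimal Python sketch. It is not how MetaCarta or any other vendor actually works; the `Place` record, the two-entry `GAZETTEER`, and the `candidates` helper are all hypothetical names invented for illustration, and the Redlands coordinates are approximate. The structured records carry explicit coordinates, while the sentence is just text in which a naive name match finds more than one possible place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    """A structured record: the geometry is stated explicitly."""
    name: str
    kind: str
    lat: float
    lon: float


# Hypothetical two-row gazetteer; coordinates are approximate and illustrative.
GAZETTEER = [
    Place("New York", "city", 40.7128, -74.0060),
    Place("New York", "street", 34.0556, -117.1825),  # Redlands, California
]


def candidates(text, gazetteer=GAZETTEER):
    """Return every gazetteer entry whose name appears in the text."""
    lowered = text.lower()
    return [place for place in gazetteer if place.name.lower() in lowered]


# Unstructured data: a sentence with no explicit coordinates.
sentence = "Traffic was heavy on New York this morning."

for place in candidates(sentence):
    print(place.kind, place.lat, place.lon)
```

The name match alone returns both the city and the street, so something else, such as the surrounding words, has to decide which one the writer meant.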
Resolving that kind of ambiguity is where MetaCarta’s technology, like that of other companies, comes in: it takes unstructured text and puts it into a different context. MetaCarta puts it on a map. So, when you see references to these terms, make sure you understand how they are being used. If you don’t like my definitions, at least see the Wikipedia definition of unstructured data [corrected link per comment]. These terms will become commonplace as the world of data warehouse appliances, business intelligence, and specialized servers creeps into the geospatial domain.
[Disclosure: MetaCarta paid for travel and expenses to their user group meeting]
