Developing, Documenting and Sharing Dr. Niman’s H1N1 Flu Database
A comment on my post earlier this week on Mapping Swine Flu pointed me to Rhiza Labs, a company working (gratis) with Dr. Niman (bio) on mapping the spread of the disease. I spoke with Michael Higgins, Rhiza Labs CTO, who gave me a better handle on the data set, how that data is shared, and the maps.
If you head to Rhiza’s FluTracker website (I noted the company when it launched its Insight product at Where 2.0 last year) you’ll see a different map using Dr. Niman’s data. Rhiza periodically downloads the data Dr. Niman and team are collecting and loads it into the Insight system. Higgins was just doing that when I interrupted him… oops. The Rhiza map cites sources: “The map was compiled using data from official sources, news reports and user-contributions” and uses a different categorization/symbology of the data: confirmed, suspected, negative and fatality. Higgins notes that the “user contributions,” as he understands it, come only from Dr. Niman.
Below the map are icons for an RSS feed of the data and a KMZ download of the data; Higgins noted that these sets are simple point location data with no attributes. Right next to them is a link to download the entire dataset in CSV format. More on the nature of the data in a second. Update 5/5: When I originally spoke to Higgins he noted the dataset was public domain. It now carries a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License, and you must contact Rhiza for commercial use.
If you click on the map on the main page you’ll get a larger interactive map (basically you are now inside Insight, the Rhiza system that made the map on the front page). Now things get interesting! If you click on a symbol you’ll get the record(s) associated with the point(s). Click on fatality/status and you can see the full record: description, lat/long, many other attributes and what I’m interested in, Source URL, the source of the data. Higgins explained that not all the points have sources, especially those collected early on. If you like, you can create a login (see top right of the page) and have access to comments and other capabilities.
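For readers who grab the CSV download, here is a minimal sketch of how one might check how many records fall into each status category and how many lack a source. I have not confirmed the actual CSV header, so the column names (`status`, `source_url`) and the sample rows are my own assumptions for illustration, not Rhiza's schema or real case data.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the real CSV header may differ.
STATUS_COL = "status"
SOURCE_COL = "source_url"


def summarize(csv_text):
    """Count records per status and how many have no source URL."""
    reader = csv.DictReader(io.StringIO(csv_text))
    by_status = Counter()
    missing_source = 0
    for row in reader:
        status = (row.get(STATUS_COL) or "").strip().lower()
        by_status[status] += 1
        if not (row.get(SOURCE_COL) or "").strip():
            missing_source += 1
    return by_status, missing_source


# Made-up sample rows standing in for the downloaded file.
sample = (
    "description,lat,lon,status,source_url\n"
    "Case A,40.44,-79.99,confirmed,http://example.org/report\n"
    "Case B,19.43,-99.13,suspected,\n"
    "Case C,32.71,-117.16,fatality,http://example.com/news\n"
)

counts, missing = summarize(sample)
print(dict(counts), "records without a source:", missing)
```

With the real file you would read its contents from disk and adjust the two column constants to match whatever header the download actually uses.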
Higgins has some other plans as time (the company is scrambling to do this work alongside actual paying work) and data permit, including producing some animations of the cases over time and creating queries that show data only from “authoritative” sources. Further, the company is creating snapshots so that it will be possible to go back to different points in time even as data is updated and corrected.
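Until such queries exist, a rough version of an "authoritative sources only" filter could be run against the Source URL field yourself. The sketch below is only my guess at one way to do it: the list of domains, the `source_url` field name and the sample records are all assumptions of mine, and Rhiza may define "authoritative" quite differently.

```python
from urllib.parse import urlparse

# Illustrative allowlist only; what counts as authoritative is a judgment call.
AUTHORITATIVE_DOMAINS = {"cdc.gov", "who.int"}


def is_authoritative(url, domains=AUTHORITATIVE_DOMAINS):
    """True if the URL's host is one of the domains or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


# Made-up records using a hypothetical "source_url" field.
records = [
    {"description": "Case A", "source_url": "http://www.cdc.gov/h1n1flu/"},
    {"description": "Case B", "source_url": "http://news.example.com/story"},
    {"description": "Case C", "source_url": ""},
]

authoritative = [r for r in records if is_authoritative(r["source_url"])]
print([r["description"] for r in authoritative])
```

Matching on the parsed hostname rather than searching the whole URL string avoids counting a news story that merely mentions an agency's domain somewhere in its path.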
