So just how “open” is open data?

A recent article by David Eaves in Slate entitled “Lies, Damned Lies and Open Data” nicely summarizes one of the emerging challenges facing advocates of “open data.” (For those unfamiliar with the term, the open data movement urges government to make information available through “open data portals that share . . . information like budgets, product recalls, factory pollution levels, and crime data.”)

Opening data to the public in this way is an inherently democratizing process. Making raw data fully accessible allows anyone to gain and publish creative insights that would be impossible otherwise. A couple of recent examples of data visualizations that I have created using public data are here and here.

So open data is a good thing, a solution to the pressing need for valid information to guide problem-solving and decision-making.

However, the problem with most solutions is that they create a new set of problems. Or, better said, the emergence of open data doesn’t so much create problems as uncover a set of problems that has always been present but is now becoming more apparent. Specifically: who gets to decide what to measure in the first place, and how are the data to be interpreted?

Eaves offers as an example of this problem a law recently passed in North Carolina that requires that only “historical data” be used to predict future sea levels. This law, adopted under pressure from the state’s powerful coastal property developers, precludes scientists from using the scientific method to reach scientific conclusions about a scientific problem. The result is an estimated 12-inch increase in sea level by 2099. This legally constrained estimate is approximately 70 percent below the 39 inches projected by scientists using more sophisticated and generally accepted climate modeling data and techniques (a gap of 27 inches, or about 69 percent of 39).

The point of this example is not to argue for or against climate change models, but rather to point out the danger of legislating ideologically (or financially) motivated constraints upon the process of discovery. Science simply cannot function if data are squeezed through conscious or unconscious agendas – and that’s the danger that the open data movement is starting to bring into the open.

I’ll let Eaves have the last word: “Open data does not represent an endgame, but another step in what will likely be a never-ending struggle for rational debate and evidence based public policy.”