On the value of open data

Wednesday, October 26, 2016

by Kent Aitken


I spent the first week of October not going to the International Open Data Conference in Madrid. We lucked out hosting it in Ottawa last year - it's the annual meet-up, stock-take, and direction-setting exercise for a potent concept in modern governance (not a new concept, but modern from an implementation point of view): namely, that governments should release the research, data, and briefing documents that underpin public decisions. When we refer specifically to open data, we mean the datasets - rows and columns of values - that governments collect, create, and draw on, released in machine-readable formats. That is, a stack of paper doesn't work; people outside government get to work with, crunch, and analyze the data too.

Alex Howard, a stalwart of the open data scene who works with the Sunlight Foundation, laid out a fantastic overview of both the conference and the state of the open data "movement." I have one point of disagreement, though. Howard writes:

"the deadline for more evidence is getting close. Politicians will always question transparency, which puts a premium on demonstrating why it matters in terms that the public understands and can apply in their lives." 

I instead see two possible routes for the future of open data. One is the value route, where governments firmly decide that it's worth the cost and effort (demonstrated through research, case studies, surveys, and notoriously hard-to-pin-down usage data). The second is that governments start to talk about it like Access to Information (ATI) laws or, in Canada, Official Languages. The framing then becomes one of obligation, expectation, and legal duty - not a cost to be debated. We don't weigh the pros and cons of releasing things in both official languages; we do it because it's just what we do.

And yes, there's a cost to putting open data on that same plane of existence: it involves re-tooling decades of information management systems. In the meantime, however, we have the most expensive model of all: internal information architectures, open data registries, and Access to Information flows running in parallel. We'll always have all three, but right now many documents exist in each system, whereas the long-term goal for open data is to minimize that duplication and triplication.

Right now the lifecycle for records includes internal use followed by eventual release, sometimes being requested via ATI in between. Once documents and data start going straight to the open-to-the-public storage solution, a couple things will happen:
  1. The costs will go down
  2. Analysts in other parts of that same government will be able to find and use that data and information sooner (including knowing that it exists in the first place)
  3. When people in government want to work with those outside it, they can simply link to the already-open data they're working with (this is one of the ways open data overlaps with open dialogue and citizen engagement)