Wednesday, September 11, 2013

What is an Open Data Standard?

As I mentioned in my previous post, there is far less disagreement -- in my experience -- in Utah legislative circles over "transparency" and "openness" than sometimes seems the case.  Often what sounds like disagreement results from talking 'around' each other (in this case data geeks, activists, and lawmakers) when discussing these ideas.

One of the first things I learned in the process that led to SB283 and the Transparency Advisory Board's new tasks was that when I said "Open Data Standards" I got blank stares, but when I said "format standards and consistent practices," I got nods.  So what is open data?

As The Open Data Handbook defines it:
Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.
Added to that definition is the idea that open data can be intermixed with other data and other systems, which maximizes its usefulness for discovering better understanding, services, and even products.

There are easy go-to examples of the benefit of open data or "format" standards.  One of my favorites involves a city, right in your back yard, where PDFs were being printed for long-term storage.  When a request for any of these PDFs was made, the printouts were scanned and emailed or delivered on a disc.  Great example of where a "format" standard would save some time and money, right?  And probably just the tip of the iceberg statewide.  But this type of story alone (and there are many) doesn't fully capture the importance of Open Data Standard policy.

During the legislative session, Sen. Henderson spoke about "one stop shopping" for public access to public data in her Senate floor speech before the first vote.  The concept is that better format standards from all levels of government (eventually... our conversations stayed limited to state agencies, to keep the scope in check) produce better and more easily "intermixed" data, and make it possible to offer that data for reuse and redistribution via a single online portal.  Additionally, data retrieved from this portal could then be "intermixed" and reused in countless ways by the end user.

So far then, we've covered formatting standards (efficiency, consistency, longevity, and "intermix" effect), and building a portal (ease of access, "one stop shopping").  And the best part?  I learned during the GRAMA Work Group study that due to some foresight, planning, and even luck, Utah is in a perfect position with already existing technology at most state government levels to put this concept into action now.
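For the data geeks, here is a minimal sketch of what the "intermix" effect looks like in practice.  It assumes two agencies publish plain CSV files that share a consistent `agency_id` column; the sample data, the column names, and the `read_rows` and `intermix` helpers are all hypothetical and made up for illustration, not taken from any Utah system.

```python
import csv
import io

# Hypothetical sample data: none of these files, columns, or figures
# come from a real agency. They only illustrate a shared format.
BUDGET_CSV = """agency_id,agency_name,budget_usd
101,Parks,2500000
102,Transportation,9100000
"""

REQUESTS_CSV = """agency_id,records_requests
101,40
102,130
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def intermix(left, right, key="agency_id"):
    """Join two datasets on a shared key column.

    Rows in `left` with no matching key in `right` are skipped.
    """
    right_by_key = {row[key]: row for row in right}
    combined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            combined.append({**row, **match})
    return combined


combined = intermix(read_rows(BUDGET_CSV), read_rows(REQUESTS_CSV))
for row in combined:
    print(row["agency_name"], row["budget_usd"], row["records_requests"])
```

The join only works because both files use the same key column and the same machine-readable format.  If one of them were a scanned PDF, this handful of lines would turn into hours of retyping.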

But there's still one final piece to include for fully understanding Open Data Standards and their role in managing and accessing public data in Utah.  Okay, honestly, there are dozens more pieces to the puzzle.  The Sunlight Foundation's Open Data Guidelines publication (just updated, but we drew heavily on version 1.0 in writing SB283) lists many of them: safeguarding private data, provisions for contractors and quasi-government agencies, and publishing in bulk when possible, just to name a few.  But there is one specific step Utah could take upon recommendation from the TAB and legislative approval: publishing code.  From those same Open Data Guidelines:
Not only the data, but the code used to create government websites, portals, tools, and other online resources can provide further benefits, as valuable open data itself. Governments should employ open source solutions whenever possible to enable sharing and make the most out of these benefits. The Consumer Finance Protection Bureau (CFPB) began publishing open code on the social code site GitHub in 2012, citing that doing so helped them fulfill the mission of their agency and facilitated their technical work. (More information is available in the announcement blogpost on the CFPB’s website.)
Removing the "gatekeepers" from the code behind tools and online resources is where easily accessible, consistently formatted public data can really take off.  It's a very limited example, but recently someone at a Utah company, in their spare time, used API code that the Sunlight Foundation shares on its website to pull data and build a highly customizable legislation tracker that could even be set up to send you reminders on your smartphone.  Now imagine if someone with very limited coding skills, working in real estate, manufacturing, the NSA building in Bluffdale... okay bad example, let's just say any industry or organization in Utah, could access public data along with the code used to manipulate and present it, and turn it into whatever they want or need.  The possibilities are endless, and little explored to date.

The New York Times called this discovery of uses for public data in both government and private markets the "Moneyball" revolution: 
The story is similar in fields as varied as science and sports, advertising and public health — a drift toward data-driven discovery and decision-making.
Formatting consistency (efficiency, intermix), one stop shopping (single portal), and access to data and code for tools (innovation, multi-use), while not a full picture, are a great starting point for understanding what an Open Data Standard is, and why it's important.

Next up, what the TAB shouldn't do.

Recommended reads:
- The full Open Data Handbook.
- Sunlight's How to Implement Open Data Policy (with references to SB283 and Utah's TAB!)

1 comment:

  1. The tools seem endless,
    Smartphone reminders of when a property in your search parameters appear in a search area, or the example of legislative changes resulting in notification on the smart phone.
    Is this what you're discussing in the possible uses of open data standard?