Wednesday, September 11, 2013

What is an Open Data Standard?

As I mentioned in my previous post, there is far less disagreement -- in my experience -- in Utah legislative circles over "transparency" and "openness" than sometimes seems the case.  Often what sounds like disagreement results from talking 'around' each other (in this case data geeks, activists, and lawmakers) when discussing these ideas.

One of the first things I learned in the process that led to SB283 and the Transparency Advisory Board's new tasks was that when I said "Open Data Standards" I got blank stares, but when I said "format standards and consistent practices," I got nods.  So what is open data?

As The Open Data Handbook defines it:
Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.
That definition is usually extended with the idea that open data can be intermixed with other data and other systems, maximizing its usefulness for building better understanding, services, and even products.

There are easy go-to examples of the benefit of open data or "format" standards.  One of my favorites to tell is a city, right in your back yard, where PDFs were being printed for long-term storage.  When a request for any of these PDFs was made, the printed files were scanned and emailed or delivered on a disc.  Great example of where a "format" standard would save some time and money, right?  And probably just the tip of the iceberg statewide.  But this type of story alone (and there are many) doesn't fully capture the importance of Open Data Standard policy.

During the legislative session, Sen. Henderson spoke about "one stop shopping" for public access to public data in her Senate floor speech before the first vote.  The concept is better format standards from all levels of government (eventually... our conversations stayed limited to state agencies, to keep the scope in check) producing better and more easily "intermixed" data, along with the ability to make that data accessible for reuse and redistribution via a single online portal.  Data retrieved from this portal could then be "intermixed" and reused in countless ways by the end user.
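For the data geeks, here is a rough sketch of what "intermixing" can look like once two agencies publish in a consistent format.  Everything in it is made up for illustration: the agency names, the dollar figures, and the shared agency_id column are my assumptions, not anything pulled from a real Utah portal.  The point is only that when both files share a format and a common key, combining them takes a few lines.

```python
import csv
import io

# Hypothetical sample data: both "agencies" publish CSV with the same
# agency_id column. None of these names or figures come from a real portal.
BUDGET_CSV = """agency_id,agency_name,budget_usd
A1,Parks,1200000
A2,Transit,3400000
"""

CONTRACTS_CSV = """agency_id,vendor,amount_usd
A1,Acme Paving,250000
A2,Rail Works,900000
A2,Signal Co,100000
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def contract_share_by_agency(budget_text, contracts_text):
    """Join the two datasets on agency_id and return, per agency name,
    contract spending as a fraction of that agency's budget."""
    # Sum contract amounts per agency_id.
    totals = {}
    for row in read_rows(contracts_text):
        agency = row["agency_id"]
        totals[agency] = totals.get(agency, 0) + int(row["amount_usd"])

    # Match each budget row to its contract total via the shared key.
    report = {}
    for row in read_rows(budget_text):
        spent = totals.get(row["agency_id"], 0)
        report[row["agency_name"]] = spent / int(row["budget_usd"])
    return report


if __name__ == "__main__":
    shares = contract_share_by_agency(BUDGET_CSV, CONTRACTS_CSV)
    for name, share in shares.items():
        print(f"{name}: {share:.1%} of budget under contract")
```

If one of those agencies instead handed over a scanned PDF, the same question would start with someone retyping numbers by hand.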

So far then, we've covered formatting standards (efficiency, consistency, longevity, and "intermix" effect), and building a portal (ease of access, "one stop shopping").  And the best part?  I learned during the GRAMA Work Group study that due to some foresight, planning, and even luck, Utah is in a perfect position with already existing technology at most state government levels to put this concept into action now.

But there's still one final piece to include for fully understanding Open Data Standards and their role in managing and accessing public data in Utah.  Okay, honestly, there are dozens more pieces to the puzzle.  A few from the Sunlight Foundation's Open Data Guidelines publication (just updated, but we drew heavily on version 1.0 in writing SB283): safeguarding private data, provisions for contractors and quasi-government agencies, and publishing in bulk when possible.  But there is one specific step Utah could take upon recommendation from the TAB and legislative approval: publishing code.  From those same Open Data Guidelines:
Not only the data, but the code used to create government websites, portals, tools, and other online resources can provide further benefits, as valuable open data itself. Governments should employ open source solutions whenever possible to enable sharing and make the most out of these benefits. The Consumer Finance Protection Bureau (CFPB) began publishing open code on the social code site GitHub in 2012, citing that doing so helped them fulfill the mission of their agency and facilitated their technical work. (More information is available in the announcement blogpost on the CFPB’s website.)
Removing the "gatekeepers" from the code behind tools and online resources is where easily accessible, consistently formatted public data can really take off.  It's a very limited example, but recently someone at a Utah company, in their spare time, used Sunlight Foundation API code shared on Sunlight's webpage to pull data and build a highly customizable legislation tracker that could even be manipulated to send you reminders on your smartphone.  Now imagine if someone with very limited coding skills, working in real estate, manufacturing, the NSA building in Bluffdale... okay bad example, let's just say any industry or organization in Utah, could access public data and the code for manipulating and presenting it, and turn it into whatever they want or need?  The possibilities are endless, and little explored to date.

The New York Times called this discovery of uses for public data in both government and private markets the "Moneyball" revolution: 
The story is similar in fields as varied as science and sports, advertising and public health — a drift toward data-driven discovery and decision-making.
Formatting consistency (efficiency, intermix), one stop shopping (single portal), and access to data and code for tools (innovation, multi-use), while not a full picture, are a great starting point for understanding what an Open Data Standard is, and why it's important.

Next up, what the TAB shouldn't do.

Recommended reads:
- The full Open Data Handbook.
- Sunlight's How to Implement Open Data Policy (with references to SB283 and Utah's TAB!)

Tuesday, September 10, 2013


Dusting this thing off.

I meant to start writing again this time last year as work began on what would become SB283.  Then I meant to write about the process as SB283 was drafted and passed.  Then I meant to write about the Transparency Advisory Board's new tasks under SB283 and what Open Data (and open data) mean.  Then I meant to write about SB283, Sunlight Foundation's Transparency Camp '13 in DC, Neal Stephenson's "information as power" novels, and what Open Data (and open data) mean.

And somehow it is now September.  You know how it goes.  We can't all be Holly Richardson, who's raised 2,314 children, cans everything that grows, serves on the State Records Committee, runs campaigns, and still finds time to write on her blog.  I'd hate her if she weren't a good person on top of it all.

I've written about it before here, but tracing the process in reverse, from my recent appointment to the TAB, back to passage of SB283, before that the 2011 GRAMA Work Group, and before that the nefarious HB477, is an amazing trip that -- forgive my sappiness -- really reminds you that good things can come from bad, and that overall, Utah lawmakers, legislative staff, and activists alike have common goals.  When it comes to transparent government, differences and disagreements more often come from talking around each other than from actually disagreeing.

That's not to say there aren't some who don't care, or even prefer closed doors.  It's not to say there isn't a time to shout "What do you have to hide?!"  Shouting can be useful and fun.  I'm a fan.  But from the stories we heard during the Governor's GRAMA work group to the warm response I've more often than not received from lawmakers to my questions, confusion, and even naivete, it seems like better conversations can and do lead to better things happening.  And as I've also written here before, more members of our legislature are open to that better conversation than not.  None of this would be going forward if Sen. Niederhauser hadn't entertained my half-cocked ideas, or if Sen. Henderson hadn't bravely put her name (and patience with me) on this.

Sometimes, believe it or not, our electeds, cities, and agencies don't want to bury information or access.  They just don't understand what you mean with your fancy JSON files this and your high-falutin' open source that.

I think that's where SB283 and the coming work of the TAB on tackling Open Data Standards come in.  Utah is already ahead of the curve on technology use and records law.  The board has a lot of ground to cover in an already short period of time.  It probably won't go all of the places I want it to go.  And as Jesse Harris, Phil Windley, Sen. Henderson, Holly Richardson, Patricia "Walking Institution of Knowledge" at the state archivists' office (whose testimony at the GRAMA work group hearings really opened my eyes) and everyone else involved with getting this process off the ground will probably tell you, I struggle with that whole pragmatic thing.  But this board will go some amazing places, and if you take a close look at the final 1/3 of SB283 (the "shall" part), this is just the start of a really important discussion.

I have a lot of things I plan to write about.  Open Data vs. open data.  What the board shouldn't try to do.  What the board is doing (of course).  How this one time I called Sen. Bramble mid-session with a question about my notes from the GRAMA work group and -- get this -- he still hasn't called me back.  Like he was busy at the time or something.  I know, right?!

And one last very important thing for me to get down personally, ahead of what will be my first TAB meeting as an official board member: The Sunlight Foundation.  L(e), Zubedah (The Secretary), "StereoGab," Rebecca with the Cool Last Name, and anyone else near Dupont Circle maybe using a stack of boxes as a desk (by choice) as I type this, this has been a crash course education for me, and you all are great teachers.  The Sunlight Foundation is an understated and irreplaceable resource for cities, states, and even countries working toward healthy government and informed citizenry.  Fun fact: an unexpected meet up and conversation with L(e) thousands of miles from Utah on the Rhode Island waterfront was the first time I'd heard the words "open data standard" and realized how well the very concept answered the questions left in my head after the GRAMA work group wrapped.  How random is that?

I encourage everyone to follow and support their work.  Start with their blog and extensive tools pages.

I never meant to be a "transparency activist."  I was intent on, and happy with, being a loudmouth.  I'm most qualified for the latter, and I honestly have no idea what I'm doing.  But I am really looking forward to writing about and participating in the TAB and the (hopefully) ongoing Open Data Standards discussion.