Hack, Mash, and Peer: Crowdsourcing Government Transparency

This paper appears in the Columbia Science & Technology Law Review Vol. IX, published May 2008.

Summary

The federal government makes an overwhelming amount of data publicly available each year. Laws ranging from the Administrative Procedure Act to the Paperwork Reduction Act require these disclosures in the name of transparency and accountability. However, the data is often only nominally available to the public. First, much of it is not available online or even in electronic format. Second, the data that can be found online is often not in an easily accessible or searchable format. If government information were made public online and in standard open formats, the online masses could be leveraged to help ensure the transparency and accountability that are the reason for making information public in the first place.

When the government makes data available in a structured format, it opens the door to innovative and enlightening remixes of information known as mashups. Mashups, in turn, are tools that journalists, bloggers, and citizens (the Internet's intelligent crowds) can use to better scrutinize government's activities. When government does not make data available online, or makes it available but not in a structured format, third parties take it upon themselves to fill the void by implementing ingenious "hacks" to free the data.
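To make the idea of a mashup concrete, the following is a minimal sketch in Python, not an example taken from the paper. It assumes two small structured datasets that share a common identifier. The field names (`docket_id`, `agency`, `comments_received`) and every record are invented for illustration and do not describe any real government data source.

```python
import csv
import io

# Hypothetical structured data: all field names and records are invented
# for illustration and do not come from any real government source.
DOCKETS_CSV = """docket_id,agency,title
D-001,EPA,Air quality standards
D-002,FCC,Spectrum allocation
D-003,EPA,Drinking water rule
"""

COMMENTS_CSV = """docket_id,comments_received
D-001,120
D-002,45
D-003,300
"""


def load(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def mashup(dockets, comments):
    """Join the two datasets on docket_id and total the comments per agency."""
    counts = {row["docket_id"]: int(row["comments_received"]) for row in comments}
    totals = {}
    for docket in dockets:
        agency = docket["agency"]
        totals[agency] = totals.get(agency, 0) + counts.get(docket["docket_id"], 0)
    return totals


totals = mashup(load(DOCKETS_CSV), load(COMMENTS_CSV))
print(totals)
```

Because both files are structured and share a key, combining them takes a few lines. The same join against scanned documents or free-form web pages would first require the kind of "hack" described above to extract the fields.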

Our Findings

  • Government information that is nominally publicly available is in fact difficult to access either because it is not online or, if it is online, because it is not available in useful and flexible formats.  
  • Independent third parties have improvised where government has failed and made public information available online in flexible formats.  
  • The paper offers recommendations for government to improve its online offerings. It also makes the case that until government does improve, private parties can fill the breach.

By the Numbers

  • Twenty-seven federal agencies have migrated their dockets to Regulations.gov, which according to OMB accounts for 82 percent of all federal regulations. While this progress toward centralization has been hailed as a success, it may in fact be a disaster. Though efficient in theory, consolidation may be a step backward if the centralized database does more to obscure data than to make it easily accessible. A few days after Regulations.gov won an award from Government Computer News, the Congressional Research Service (CRS) issued a report outlining serious questions regarding the site, including "the general navigability of the website, the consistency and completeness of the data, [and] whether the system allows users to adequately search existing dockets." The report catalogues several attempts by CRS to find information using the site's navigation or search functions that were not simply unsuccessful, but thoroughly confusing as well.

Recommendations 

  • To the greatest extent feasible, government data should be made public online.
  • Information should not just be made available online; online resources must also be useful. This means putting data online in structured, open, and searchable formats.
  • Ideally, government would provide the necessary informational building blocks. After all, it is the source of the data and it could ensure its completeness and accuracy. Government has the power to enact reforms to make the data it produces easily open to the public. 
  • If government does not do this, however, the private sector should fill the breach.