
A Failure of Access, a Shortcoming of Technology
by Clay Northouse, 5/28/2008
Access to government data and other information often falls behind expectations because the government fails to use advanced technologies to meet the needs of modern-day society. In "Hack, Mash, & Peer," Jerry Brito, Senior Research Fellow of the Mercatus Center at George Mason University, discusses the shortcomings of government access and the technological solutions that could create broad access to government records.
The analysis, published May 14 in the Columbia Science and Technology Law Review, shows that many government data sources are essentially inaccessible to the general public. For instance, the government only permits the financial disclosures of members of Congress to be viewed in paper format at the House or Senate offices in Washington, DC. Even though disclosure of the records is required by law, and even though those records are stored in a searchable electronic database, the government denies the general public easy online access to that information.
The government makes other data available online in centralized locations but publishes it in cumbersome formats that make the information difficult to search and find. "While efficient in theory," states Brito, "consolidation may be a step backward if the centralized database does more to obscure data than to make it easily accessible."
Filling the access gap, private sector third parties have stepped in with "ingenious hacks" to provide the functionality the government has failed to achieve. The Center for Responsive Politics (CRP) runs the OpenSecrets.org website, which provides the general public with easy online access to many useful government data sources, including campaign finance information, lobbying information, and congressional travel. CRP took upon itself the labor-intensive effort of digitizing the paper records of congressional members' financial disclosures and posting the data in a searchable database. Another example is the FedSpending.org website, developed by OMB Watch, which provides access to data on federal contract spending and financial assistance. The GovTrack.us website, developed by a linguistics graduate student, provides access to legislative information by scraping and collecting information from government web pages.
Often these "hacks" present the government data in a structured and open format that allows others to combine various data sources in "mashups" that serve as novel tools for reviewing information. For instance, the MAPLight.org website pulls together data on voting records and campaign finance information to generate unique insights into the interaction of money and politics. The website shows when and how much money was contributed to campaigns by those supporting and those opposing legislation, and then how votes on that legislation turned out.
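To make the idea of a mashup concrete, the following is a minimal sketch of how two structured data sources might be joined. It is not how MAPLight.org or any other site is actually built; the records, field names, and the `mash` function are all hypothetical and invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample records, invented for illustration only.
contributions = [
    {"legislator": "Rep. A", "bill": "H.R. 1", "side": "support", "amount": 5000},
    {"legislator": "Rep. A", "bill": "H.R. 1", "side": "oppose", "amount": 1000},
    {"legislator": "Rep. B", "bill": "H.R. 1", "side": "oppose", "amount": 7000},
]
votes = [
    {"legislator": "Rep. A", "bill": "H.R. 1", "vote": "yea"},
    {"legislator": "Rep. B", "bill": "H.R. 1", "vote": "nay"},
]


def mash(contributions, votes):
    """Join each vote with contribution totals from supporters and opponents."""
    # Sum contributions per (legislator, bill), split by the donor's position.
    totals = defaultdict(lambda: {"support": 0, "oppose": 0})
    for c in contributions:
        totals[(c["legislator"], c["bill"])][c["side"]] += c["amount"]

    combined = []
    for v in votes:
        key = (v["legislator"], v["bill"])
        # Legislators with no recorded contributions get zero totals.
        money = totals.get(key, {"support": 0, "oppose": 0})
        combined.append({
            **v,
            "from_supporters": money["support"],
            "from_opponents": money["oppose"],
        })
    return combined


for row in mash(contributions, votes):
    print(row)
```

The join works only because both sources share common keys (here, a legislator and a bill). That is the practical reason structured, consistently labeled data matters for anyone trying to combine sources.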
Third-party groups seeking to solve the problem of large amounts of information provided in cumbersome formats recently developed the "peer production" or "crowdsourcing" approach. In crowdsourcing, massive numbers of documents or other pieces of information are reviewed en masse by a community of online users. The paper details an example in which over 3,000 pages of documents related to the firing of eight U.S. Attorneys were reviewed overnight by readers of the TPMMuckraker.com blog. The blog posted a request for help reviewing the materials and provided readers with a system for posting comments on the pages they had read. In approximately seven hours, the site's visitors had read and commented on almost all of the pages. Crowdsourcing leverages the cooperative effort of large numbers of people to accomplish huge information tasks in an extremely short amount of time. The explosion of blogs also creates an online environment rich with opportunities to pursue crowdsourcing projects.
Rather than just relying on third parties to hack, mash, and peer government data, Brito recommends that the government encourage the process itself by making data available online in "structured, open, and searchable formats." Structured means that the information should be provided in a way that can be read by feed readers and search engines. Open means that the data should be provided in a nonproprietary fashion so that it can be combined with other sources to create different types of products, such as overlaying housing data on maps or providing information about toxics in a searchable format on a local community website. Finally, the data should allow for full-text searches.
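As a rough illustration of these three properties, the sketch below publishes a few records as JSON (a structured, nonproprietary format) and runs a simple full-text search across every field. The records, field names, and function names are hypothetical; the paper does not prescribe a particular format or search method.

```python
import json

# Hypothetical records, invented for illustration only.
records = [
    {"id": 1, "title": "Financial disclosure, Member X",
     "text": "Holdings include municipal bonds."},
    {"id": 2, "title": "Contract award notice",
     "text": "Award for road repair services."},
]


def to_open_format(records):
    """Serialize records as JSON, a structured and nonproprietary format."""
    return json.dumps(records, indent=2, sort_keys=True)


def full_text_search(records, query):
    """Return ids of records containing the query in any field, ignoring case."""
    q = query.lower()
    return [
        r["id"]
        for r in records
        if any(q in str(value).lower() for value in r.values())
    ]


print(to_open_format(records))
print(full_text_search(records, "bonds"))
```

Because the output is plain JSON, any third party could load it with standard tools and combine it with other sources, without needing special software from the publishing agency.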
To accomplish such access to government information, "Hack, Mash, & Peer" recommends that legislation specifically require such disclosure methods. However, if Congress fails to act, agencies should take it upon themselves to provide government information in robust and useable formats.
