Free your data and the rest will follow


Alice questions the feasibility of my utopian legal research scheme and wonders whether it would end up making research harder by killing off West and Lexis. I highly doubt that courts publishing their cases in a standard electronic format would kill off West or Lexis, or even have a significant effect on their revenues. In fact, it might make it even cheaper for those services to add new data (although probably not, since I assume they already get material in some sort of electronic format).
Even if data is available free, there is still room for premium research services.

People are willing to pay for data that is otherwise free if it is well organized and easy to search. For example, eMarketer (my former employer) does this for internet stats. West and Lexis add significant value to the raw data (the value of human editing and of classifying cases into the relevant materials). What I’m suggesting is a standard for distributing the electronic equivalent of official court reporters (which, according to a law librarian, “you’re never going to actually use”).
Open legal data is not going to replace the proprietary databases, and probably won’t even make legal research all that much cheaper. What it does do is allow law libraries, schools and firms to create their own unique tools based on freely available data. A firm could create an electronic database of case law focusing on a certain area and index it using a much more detailed taxonomy than West’s. A tool could spider through cases and pull out the links between them from citations (à la Blogdex). A firm’s internal research system could store its own metadata alongside the primary materials. All of this would give entities the opportunity to have research tools that are better suited to their individual needs. This is not creating a single monolithic scheme, but opening the door to many specialized and innovative search schemes. For example, look at the Google API: opening up legal data in the same way could offer lawyers a larger number of unique, specialized tools.
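To make the citation-spidering idea a little more concrete, here is a minimal sketch of how such a tool might pull links between cases out of plain opinion text. Everything in it is an assumption for illustration: the regular expression only recognizes "volume U.S. page" citations, the function names are hypothetical, and the sample opinions are made up. A real tool would need to handle many more reporter formats and messier text.

```python
import re
from collections import defaultdict

# Assumed pattern: matches only "volume U.S. page" citations, e.g. "537 U.S. 186".
# Other reporters (F.3d, S. Ct., state reporters) are deliberately ignored here.
CITATION_RE = re.compile(r"\b(\d{1,3})\s+U\.\s?S\.\s+(\d{1,4})\b")


def extract_citations(text):
    """Return normalized citations found in text, in order, without duplicates."""
    seen = []
    for volume, page in CITATION_RE.findall(text):
        cite = f"{volume} U.S. {page}"
        if cite not in seen:
            seen.append(cite)
    return seen


def build_citation_graph(opinions):
    """Map each opinion's own citation to the citations it refers to."""
    graph = {}
    for cite, text in opinions.items():
        # Skip self-references so an opinion never links to itself.
        graph[cite] = [c for c in extract_citations(text) if c != cite]
    return graph


def cited_by(graph):
    """Invert the graph: for each cited case, list the opinions that cite it."""
    inverse = defaultdict(list)
    for source, targets in graph.items():
        for target in targets:
            inverse[target].append(source)
    return dict(inverse)


# Made-up sample opinions, keyed by their own (fictional) citation.
opinions = {
    "200 U.S. 1": "We follow the rule stated in 100 U.S. 50 and 150 U.S. 75.",
    "150 U.S. 75": "See 100 U.S. 50.",
    "100 U.S. 50": "This case presents a question of first impression.",
}

graph = build_citation_graph(opinions)
print(graph)
print(cited_by(graph))
```

The inverted graph is the interesting part: it answers "who cites this case?", which is the kind of link data Blogdex gathered for weblogs. None of this is possible without the opinion text being available in an open format to begin with.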
The biggest obstacle to this being useful anytime soon is not convincing judges of its relevance (electronic publishing will likely be cheaper than publishing on paper), but getting the 200+ years of historical data that’s in books and proprietary databases into an open, standard format.
EDIT: Donna Wentworth asks:

Let’s just say that someone had a copy of the Eldred Supreme Court transcripts, culled from the generous-yet-decidedly-proprietary databanks of Lexis-Nexis. Could that someone then go ahead and publish the transcripts on her weblog?

On the same theme, wouldn’t that be easier if there were public access to public court records?

Andrew Raff @andrewraff