Court decision metadata and DBpedia

An unplanned sequel.

When I wrote my last two blog entries (not counting the announcement about my new developerWorks article), "Modeling your data with DBpedia vocabularies" and "Big legal publishers and semantic web technology", I had no idea that I would soon stumble across a nice collection of US Supreme Court case metadata in DBpedia. After writing about modeling with DBpedia vocabularies, it occurred to me that if Wikipedia has pages with infoboxes for individual professional wrestlers and Battlestar Galactica episodes, it probably has them for important Supreme Court cases as well. I checked for Roe v. Wade (popular in legal publishing because, along with being a famous case, its title is short and easy to spell) and there it was at http://en.wikipedia.org/wiki/Roe_v._Wade. Even better for the semweb geek, its DBpedia page at http://dbpedia.org/page/Roe_v._Wade showed properties for most of the key bits of information you want for a court decision: the date, the reporter volume and page, names of concurring judges, names of dissenting judges, laws applied, and more.

Wikipedia and DBpedia even include my favorite case, Campbell v. Acuff-Rose Music, Inc., in which Appendix B of the Supreme Court decision includes the following lyrics from the 2 Live Crew song that Roy Orbison's publisher sued "Luther Campbell aka Luke Skywalker" (as he's known in the case's dbpprop:fullname) over: "Big hairy woman all that hair it ain't legit/'Cause you look like 'Cousin It'". (I like my landmark Supreme Court IP law decisions to include Addams Family references.)

Wikipedia currently has pages for 198 Supreme Court decisions, according to their Category:United States Supreme Court cases page. After going to the DBpedia equivalent of that page, I realized that I could retrieve a list of them all with a simple SPARQL query on DBpedia's query form:

SELECT DISTINCT ?s WHERE {
  ?s 
  <http://www.w3.org/2004/02/skos/core#subject>
  <http://dbpedia.org/resource/Category:United_States_Supreme_Court_cases>
}

Even better, I noticed at the bottom of the Wikipedia page for Campbell v. Acuff-Rose that it belongs to the Wikipedia category US copyright case law, a pretty important bit of categorization metadata. Sure, you can look at that category's page to see the list of cases, but you can also retrieve the list with a slight modification to the SPARQL query above:

SELECT DISTINCT ?s WHERE {
  ?s 
  <http://www.w3.org/2004/02/skos/core#subject>
  <http://dbpedia.org/resource/Category:United_States_copyright_case_law>
}
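Since the two queries differ only in the category name, the pattern is easy to parameterize. Here is a minimal Python sketch of that idea, not a definitive client: the function names `category_query` and `request_url` are made up for illustration, and the endpoint address http://dbpedia.org/sparql is an assumption (the post itself only mentions DBpedia's query form). The predicate is the one used in the queries above; other DBpedia releases may use a different one.

```python
from urllib.parse import urlencode

# Assumption: DBpedia's public SPARQL endpoint; the post only mentions a query form.
ENDPOINT = "http://dbpedia.org/sparql"
# Predicate and category namespace copied from the two queries above.
SUBJECT = "http://www.w3.org/2004/02/skos/core#subject"
CATEGORY_BASE = "http://dbpedia.org/resource/Category:"


def category_query(category_name):
    """Return a SPARQL query listing every resource in one Wikipedia category."""
    return (
        "SELECT DISTINCT ?s WHERE {\n"
        f"  ?s <{SUBJECT}> <{CATEGORY_BASE}{category_name}>\n"
        "}"
    )


def request_url(query, endpoint=ENDPOINT):
    """Build a GET URL; 'query' is the parameter name defined by the SPARQL protocol."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # The two categories queried in this post.
    for name in ("United_States_Supreme_Court_cases",
                 "United_States_copyright_case_law"):
        print(request_url(category_query(name)))
```

The sketch only builds the query text and the URL; actually fetching the URL and choosing a result format are left out because they depend on how the endpoint is configured.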

The most interesting part of the metadata included with the cases is the connections between them. For example, the DBpedia page for Brown v. Board of Education shows that it "is dbpprop:overruled of" Plessy v. Ferguson. Going the other way, the DBpedia page for Plessy v. Ferguson shows a dbpprop:overruled value of Brown v. Board of Education, meaning that Plessy was overruled by Brown.
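These links can be pulled out with a query as well. The following Python sketch is a rough illustration under stated assumptions: it assumes that the dbpprop: prefix expands to http://dbpedia.org/property/, that the resource URIs for the two cases follow the same pattern as Roe_v._Wade, and that results come back in the W3C SPARQL JSON results format. The names `OVERRULED_QUERY` and `overruling_index` are hypothetical, and `SAMPLE` is a hand-written stand-in containing only the Plessy/Brown link described above, not real endpoint output.

```python
# Assumption: dbpprop: expands to http://dbpedia.org/property/.
OVERRULED = "http://dbpedia.org/property/overruled"

# Each match pairs a case with the case that overruled it.
OVERRULED_QUERY = (
    "SELECT ?case ?overruledBy WHERE {\n"
    f"  ?case <{OVERRULED}> ?overruledBy\n"
    "}"
)


def overruling_index(results):
    """Map each overruled case URI to the set of case URIs that overruled it.

    `results` is a dict in the W3C SPARQL JSON results format.
    """
    index = {}
    for row in results["results"]["bindings"]:
        case = row["case"]["value"]
        index.setdefault(case, set()).add(row["overruledBy"]["value"])
    return index


# Hand-written sample with the one link described above; URIs are assumed.
SAMPLE = {
    "head": {"vars": ["case", "overruledBy"]},
    "results": {"bindings": [{
        "case": {"type": "uri",
                 "value": "http://dbpedia.org/resource/Plessy_v._Ferguson"},
        "overruledBy": {"type": "uri",
                        "value": "http://dbpedia.org/resource/Brown_v._Board_of_Education"},
    }]},
}

if __name__ == "__main__":
    index = overruling_index(SAMPLE)
    print(index.get("http://dbpedia.org/resource/Plessy_v._Ferguson"))
```

A case that does not appear as a key in the index simply has no overruling link recorded in the data, which is not the same thing as never having been overruled.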

There are not enough of these links to threaten a commercial cite-checking service such as LexisNexis's Shepard's product. A lawyer checking whether a potentially citable case has been overruled is a classic example of when search recall trumps precision, because missing just one search result can be disastrous for the lawyer. Still, the current amount of SPARQL-addressable fielded metadata about US caselaw on Wikipedia (and hence on DBpedia) is a big step beyond the amount of law metadata on the web that was available when I wrote about this in early 2006. It will be great to see this collection grow and to see more applications take advantage of it.