SPARQL and live relational data

A little demo.
Chirac and Sarkozy

In the first project I did with SPARQL, D2RQ, and MySQL I used D2RQ to pull all the relational data into a disk file and then queried that after adding some OWL-based metadata. D2RQ does let you execute SPARQL queries against a live relational database, instead of dumping data to a file and querying that, so I wanted to see the effects for myself. This would work better as a live demo, but you could think of it as a script for one.

First, because MySQL is a multi-user database, imagine that several users are simultaneously using the same copy of the "world" database that I described in an earlier entry. This will make my fake demo look more dramatic. (For additional drama, imagine bullets whizzing by my head as I type the various queries and commands.) I'll start with a SPARQL query asking about the head of state for France:

SELECT ?headOfState WHERE {
?s vocab:country_Name "France";
   vocab:country_HeadOfState ?headOfState.
}

With the version of the world database currently available from MySQL, that query returns "Jacques Chirac". In fact, the database lists him as the head of state for several countries; this query

SELECT ?name WHERE {
?s vocab:country_Name ?name;
   vocab:country_HeadOfState "Jacques Chirac".
}

returns this list:

"French Guiana"
"French Polynesia"
"Saint Pierre and Miquelon"
"New Caledonia"
"Wallis and Futuna"
"French Southern territories"

Now imagine that someone else using the same database updates it with the following query at the MySQL command line:

mysql> UPDATE country
    -> SET HeadOfState="Nicolas Sarkozy"
    -> WHERE HeadOfState="Jacques Chirac";
Query OK, 11 rows affected (0.08 sec)
Rows matched: 11  Changed: 11  Warnings: 0

(I look forward to making a similar update for the United States entry in January.) When I rerun my original SPARQL query about the vocab:country_HeadOfState value for the subject that has a country name of "France", I get the updated answer: "Nicolas Sarkozy".
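For anyone who wants to see the dump-versus-live difference without setting up D2RQ, here is a minimal sketch in Python. It is an analogy, not the actual demo: it assumes Python's built-in sqlite3 module as a stand-in for MySQL, a two-row country table instead of the full world database, and a plain dictionary as the "dumped" copy. The function name query_live is made up for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the MySQL "world"
# database so that the sketch runs without a server.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE country (Name TEXT, HeadOfState TEXT)")
live.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("France", "Jacques Chirac"), ("French Guiana", "Jacques Chirac")],
)
live.commit()


def query_live(conn, country_name):
    """Ask the database itself, the way a live mapping would."""
    row = conn.execute(
        "SELECT HeadOfState FROM country WHERE Name = ?", (country_name,)
    ).fetchone()
    return row[0] if row else None


# The "dump" approach: copy everything out once, then query the copy.
dump = dict(live.execute("SELECT Name, HeadOfState FROM country").fetchall())

# Someone else updates the database after the dump was taken.
live.execute(
    "UPDATE country SET HeadOfState = ? WHERE HeadOfState = ?",
    ("Nicolas Sarkozy", "Jacques Chirac"),
)
live.commit()

dump_answer = dump["France"]              # answered from the stale copy
live_answer = query_live(live, "France")  # answered from the database
print("dump:", dump_answer)
print("live:", live_answer)
```

The dumped copy keeps answering with whatever was true when the dump was made, while the live query picks up the other user's update immediately.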

When an interface such as D2RQ provides access to a relational database, SPARQL provides an excellent tool for looking at the data. Of course, if you can access that database using an SQL command line, you have even more options, but how many publicly accessible relational databases let you issue SQL commands against them? More and more offer SPARQL access, so SPARQL will be an increasingly valuable tool for getting at increasing amounts of data. (Not that SPARQL's future is limited to read-only access: an UPDATE language for SPARQL is in the works.)
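To give a rough idea of what "SPARQL access" looks like from a program, here is a sketch in Python that sends the France query to an endpoint over HTTP, using only the standard library. The endpoint URL and the vocab namespace below are assumptions about a local D2R Server setup, not values taken from this demo; substitute whatever your own server reports. The sketch also assumes the endpoint will return results as JSON when asked through the Accept header.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: both values are placeholders for a local setup.
ENDPOINT = "http://localhost:2020/sparql"
VOCAB = "http://localhost:2020/vocab/resource/"

QUERY = f"""
PREFIX vocab: <{VOCAB}>
SELECT ?headOfState WHERE {{
  ?s vocab:country_Name "France" ;
     vocab:country_HeadOfState ?headOfState .
}}
"""


def build_request(endpoint, query):
    """Build an HTTP GET request that carries the query in a 'query' parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint, query):
    """Send the query and return the ?headOfState values from the JSON results."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        results = json.load(response)
    return [b["headOfState"]["value"] for b in results["results"]["bindings"]]


# With a server running at ENDPOINT, this would print the current answer:
#     print(run_query(ENDPOINT, QUERY))
```

Nothing here is specific to D2RQ. The same few lines should work against any endpoint that speaks the SPARQL protocol and can return JSON results, which is the point of the paragraph above.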


I have a question about the statement "but how many publicly accessible relational databases let you issue SQL commands against them? More and more offer SPARQL access, so SPARQL will be an increasingly valuable tool for getting at increasing amounts of data."

When I search for information on a website (not a search engine), let's say Geonames, and I look for "New York", isn't the search a query against a database? Plenty of websites, I think, provide querying against publicly accessible relational databases. The complexity of learning and writing SQL is hidden from the end user.

In the case of Geonames, it's a MySQL-based store. That makes it easy for a naive user to search for information in Geonames, because there is no need to learn SQL or any other query language.

My questions:

(1) Isn't the pain of learning and writing SPARQL one of the biggest hindrances to SPARQL becoming, as you put it, "an increasingly valuable tool for getting at increasing amounts of data"?

(2) Or, because of this, will it remain a tool in the hands of people in the SW community?

As you quoted, I did say "let you issue SQL commands against them," not "query the databases," so I wouldn't count form-driven queries against MySQL backends as relational queries of public data. You don't have the flexibility to make up your own queries. As a matter of fact, I'm sure we'll see more forms triggering SPARQL queries on the back end over time, so comparing SPARQL queries to form-driven queries of relational databases is not an apples-to-apples comparison.
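To make the distinction concrete, here is a sketch in Python of what a form triggering a SPARQL query on the back end might look like. The function names are made up for illustration, and the query assumes that the endpoint already declares the vocab: prefix, as the queries in the post do. The user supplies one value, and the shape of the query is fixed by whoever wrote the form.

```python
def sparql_string_literal(text):
    """Quote text as a SPARQL string literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def head_of_state_query(country_name):
    """Build the one fixed query that a hypothetical search form would run."""
    return (
        "SELECT ?headOfState WHERE {\n"
        "  ?s vocab:country_Name " + sparql_string_literal(country_name) + " ;\n"
        "     vocab:country_HeadOfState ?headOfState .\n"
        "}"
    )


# The form field value is the only thing the user controls.
query = head_of_state_query("France")
print(query)
```

Someone using the form can ask about any country, but cannot ask which countries share a head of state, because that is a different query. With direct SPARQL access they can.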

We could call SPARQL a tool in the hands of the SW community, but we could also call SQL a tool in the hands of the relational community. The biggest difference to me is that if you know SQL well, your options for writing an app that combines data from multiple public SQL databases are very limited. You query your personal data or your employer's. The increasing amount of SPARQL-accessible data is what's opening up the possibilities.