Appreciating SPARQL property paths more

More and more useful.

I have played with SPARQL 1.1's new property path features and described them in my book, and I've felt for a while that I understood them, but two recent occasions have helped me appreciate them even more.

First, while preparing for the talk I'm giving at the Semantic Technology & Business Conference on "Enhancing Searches with Semantic Technology," my demo app needed to find a SKOS concept that has either a skos:prefLabel or a skos:hiddenLabel value of a particular string. At first I thought I'd need a UNION query, like this,

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?c
WHERE {
 ?c a skos:Concept .
 {?c skos:prefLabel "motrin"@en }
 UNION
 {?c skos:hiddenLabel "motrin"@en }
}

but then I realized that the alternative path operator could make it much terser: just two triple patterns in the query, with the second one's predicate expression essentially saying "a predicate of skos:prefLabel or of skos:hiddenLabel":

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?c
WHERE {
 ?c a skos:Concept .
 ?c skos:prefLabel|skos:hiddenLabel "motrin"@en .
}
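
The | operator is not limited to two predicates. As a minimal sketch, assuming the search should also match skos:altLabel values (something the demo app described above did not require), a third alternative could be added to the same triple pattern:

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?c
WHERE {
 ?c a skos:Concept .
 # Assumption: skos:altLabel is added here only for illustration.
 ?c skos:prefLabel|skos:altLabel|skos:hiddenLabel "motrin"@en .
}
```

The equivalent UNION query would need a third branch, so the savings grow with each predicate added.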

The second occasion for appreciating property paths more was reading Paul Groth's recent blog posting "5 heuristics for writing better SPARQL queries," which recommended that we "use property paths to replace connected triple patterns where the object of one triple pattern is the subject of another."

I'd seen examples of the XPath-like property paths, like the foaf:knows/foaf:name one in the SPARQL 1.1 Query Recommendation, but I hadn't realized their value for replacing triple patterns where the object of one triple pattern is the subject of another that has a different predicate, and I've written a lot of those. For example, to find the four-step connection between d:a and d:e in the following,

@prefix d:  <http://learningsparql.com/ns/data#> .
@prefix dm: <http://learningsparql.com/ns/demo#> .

d:a dm:prop1 d:b . 
d:b dm:prop2 d:c . 
d:c dm:prop3 d:d . 
d:d dm:prop4 d:e . 

I would have written a SPARQL graph pattern that looked pretty much like the four triples that you see there, but with variables substituted for d:b, d:c, and d:d. Paul's blog entry made me realize that I could simply write this:

PREFIX dm: <http://learningsparql.com/ns/demo#>
SELECT ?s ?o
WHERE
{ ?s dm:prop1/dm:prop2/dm:prop3/dm:prop4 ?o }
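
For comparison, here is a sketch of the longer graph pattern that the sequence path replaces, with one triple pattern per step. The variable names ?b, ?c, and ?d are arbitrary placeholders for the intermediate nodes d:b, d:c, and d:d:

```sparql
PREFIX dm: <http://learningsparql.com/ns/demo#>
SELECT ?s ?o
WHERE {
 # Each pattern's object is the next pattern's subject.
 ?s dm:prop1 ?b .
 ?b dm:prop2 ?c .
 ?c dm:prop3 ?d .
 ?d dm:prop4 ?o .
}
```

Against the four triples above, both versions should bind ?s to d:a and ?o to d:e; the path version simply leaves the intermediate variables implicit.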

What makes this interesting is that I had been thinking of property paths as something that could slow down queries, and Paul's experience was that the property path version was more efficient. Of course, I was generalizing too much: the property path * and + operators, while very handy, essentially say "and then keep looking for more," which can really increase the search space and execution time. I suppose I was also still hearing the ringing in my ears of the alarm sounded by the paper Counting Beyond a Yottabyte, or how SPARQL 1.1 Property Paths will Prevent Adoption of the Standard (pdf), but that too was focusing on a subset of property path options unrelated to the path format that Paul was discussing. (After the release of that paper and before SPARQL 1.1's ascent to Recommendation status, the SPARQL Working Group did make adjustments to certain property path features to address the paper's concerns.)
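
To illustrate the kind of open-ended path that can widen the search space, here is a sketch using the + operator. It assumes a SKOS dataset in which concepts are linked to more general concepts with skos:broader, which is not part of the sample data shown earlier; it asks for every concept reachable by following one or more skos:broader links from the concept labeled "motrin":

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?ancestor
WHERE {
 ?c skos:prefLabel "motrin"@en .
 # Assumption: the data has a skos:broader hierarchy to walk.
 # + means "one or more steps", so the engine keeps following links.
 ?c skos:broader+ ?ancestor .
}
```

Changing + to * would also return the starting concept itself, because * matches paths of length zero. Unlike the fixed-length dm:prop1/dm:prop2/dm:prop3/dm:prop4 path, the number of steps here depends on the depth of the hierarchy in the data.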

In my formerly extensive use of XSLT, I never got to the point where I couldn't picture being limited to XSLT 1.0, even though 2.0 became a Recommendation in 2007. (I know that Jeni Tennison got to that point about 2007, if not earlier.) Now that it's been almost four weeks since the SPARQL 1.1 specs became Recommendations, I already have a difficult time being limited to SPARQL 1.0, which is still the case with some endpoints; there's just so much great stuff in 1.1.

