9 March 2010

The meaning of "semantics"

No pun intended.


Dave McComb's book Semantics in Business Systems recommended John Saeed's Semantics as an "excellent introductory book on semantics in everyday life", so I found a cheap used copy and have been working my way through it. I'm sure that it's been used for both graduate and undergraduate courses, and it's not too difficult to follow so far. I especially like this part, which Saeed said he adapted from the work of Charles Morris:

syntax: the formal relation of signs to each other;

semantics: the relations of signs to the objects to which the signs are applicable;

pragmatics: the relation of signs to interpreters.

He goes on to say that "the whole science of language, consisting of the three parts mentioned, is called semiotic", but I was more interested in the way he put semantics in the larger context.

Printed and dictionary.com definitions of "semantics" typically come in pairs, with the first usually saying "the study of meaning" and the second more in line with Saeed's definition. The latter is sometimes identified as being specific to the fields of linguistics or semiotics.

I think that the linguistics/semiotics definition serves the semantic web better, because describing semantics as the relations of signs to the things they signify (and moving some of the "meaning" parts that take place in people's heads to the "pragmatics" category) helps us to focus on what the semantic web is best at: providing an infrastructure to identify which signs (IDs in the form of URIs) refer to which objects (resources) so that people can use this infrastructure to create applications that work across the web.

Interpretation of the "meaning" of the signified resources is not necessarily a goal of these applications. While OWL can encode properties of concepts to let us do more reasoning with those concepts, attacking the feasibility of getting computers to Understand Meaning is a straw man argument that I'm tired of hearing from people who insist that the semantic web is an impractical idea. Standards and best practices that let applications track the relationship of identifiers to resources on a World Wide Web scale—who can argue with that?

1 March 2010

Is SPIN the Schematron of RDF?

Representing business rules using an implemented standard, then flagging violations in a machine-readable way.

Christian Fürber and Martin Hepp (the latter being the source of the increasingly popular GoodRelations ontology) have published a paper titled "Using SPARQL and SPIN for Data Quality Management on the Semantic Web" (pdf) for the 2010 Business Information Systems conference in Berlin. TopQuadrant's Holger Knublauch designed SPIN, or the SPARQL Inferencing Notation, as a SPARQL-based way to express constraints and inferencing rules on sets of triples, and Fürber and Hepp have taken a careful, structured look at how to apply it to business data.

I knew that "data quality" was a specific discipline within IT, but I hadn't looked at it very closely. Their paper gives a nice overview of this area before moving on to describing their work. It also describes the value that a systematic approach to data quality can bring to semantic web applications, but I don't think anyone needs any convincing there; it's often the first issue people bring up when they hear about the very idea of Linked Data on the web.

Or, to put it more bluntly, many complain about the potentially low quality of public semantic web data, but Fürber and Hepp are doing something about it. SPIN may have the potential to do for RDF data what Schematron has done for XML for years now: provide a technique, based entirely on an existing, well-implemented W3C standard, for describing business rules about data and then validating data against those rules. (I see that William Vambenepe had some thoughts on the comparison early last year.)
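To make the "describe rules, then validate data against them" idea concrete, here is a minimal sketch in Python. It is not SPIN and it is not Fürber and Hepp's method: SPIN expresses its constraints in SPARQL, while plain Python functions stand in for the queries here. The triples, the ex: names, and the rule functions are all invented for illustration.

```python
import json

# Hypothetical sample data: (subject, predicate, object) tuples standing in
# for RDF triples. None of these names come from a real vocabulary.
TRIPLES = [
    ("ex:offer1", "rdf:type", "ex:Offer"),
    ("ex:offer1", "ex:price", 19.99),
    ("ex:offer2", "rdf:type", "ex:Offer"),
    ("ex:offer2", "ex:price", -5.0),
    ("ex:offer3", "rdf:type", "ex:Offer"),
]


def instances_of(triples, cls):
    """Return the subjects typed as cls, in the order they appear."""
    return [s for s, p, o in triples if p == "rdf:type" and o == cls]


def values(triples, subject, predicate):
    """Return every object value for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def missing_price(triples, subject):
    """Rule: every offer needs at least one price."""
    if not values(triples, subject, "ex:price"):
        return "ex:price is missing"
    return None


def negative_price(triples, subject):
    """Rule: a price must not be negative."""
    for value in values(triples, subject, "ex:price"):
        if value < 0:
            return f"ex:price {value} is negative"
    return None


# Rules are attached to a class, so they run against each of its instances.
RULES = {"ex:Offer": [missing_price, negative_price]}


def validate(triples, rules):
    """Run every rule against every instance and collect violation records."""
    violations = []
    for cls, checks in rules.items():
        for subject in instances_of(triples, cls):
            for check in checks:
                message = check(triples, subject)
                if message is not None:
                    violations.append(
                        {"root": subject, "rule": check.__name__, "message": message}
                    )
    return violations


# Violations come out as data (JSON here), not as free-form log text.
print(json.dumps(validate(TRIPLES, RULES), indent=2))
```

With the sample data, ex:offer2 is flagged for a negative price and ex:offer3 for a missing one. The point of the sketch is the shape: rules live apart from the data, and the violations come back as records that another program can consume.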

I'm looking forward to Fürber and Hepp's future work described in their paper and to seeing how others apply it in their applications.

21 January 2010

Using the ARQ SPARQL processor from the command line

With the Jena extensions.

I recently described how to execute federated SPARQL queries using Jena extensions that we'll hopefully see added to the SPARQL 1.1 standard. I showed a sample query and suggested that you try it at the sparql.org RDF Query Demo page.

For local, command-line use of SPARQL, I've used the Jena ARQ query engine for years, but my sample federated query didn't work with it. Now I know why: the sparql.bat file that comes with the distribution invokes the processor in a strictly standards-compliant mode, without the extensions enabled. I thought I'd have to write and compile some Java code to use the extensions, but my co-worker Jeremy Carroll pointed out that the sparql.bat file in ARQ's bat subdirectory calls the arq.sparql class, like this,

java -cp %CP% arq.sparql %*

and that calling the arq.arq class instead enables the extensions. Then I noticed the arq.bat file in the same directory as sparql.bat, which does exactly this. There are more batch files in there, and a web search on their names led me to the "ARQ - Command Line Applications" documentation page, which will be handy.

Using arq.bat instead of sparql.bat, the sample federated query works as written (tested with ARQ 2.8.2), and so do LET assignment and extension functions, making it possible to use ARQ in real semantic web application development with no need to do Java coding around the Jena API.
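If you'd rather script this than rely on the batch files, the difference between the two entry points comes down to one argument. The following is a minimal Python sketch, not something that ships with ARQ: the function name build_arq_command, the classpath, and the file names are hypothetical. The arq.sparql and arq.arq class names are the ones the batch files call, and --query and --data are the options I'd expect from the command line documentation page, so check them against your ARQ version.

```python
def build_arq_command(classpath, query_file, data_file=None, extensions=True):
    """Build the argument list for running a query with ARQ.

    extensions=True selects the arq.arq main class (ARQ extensions enabled);
    extensions=False selects arq.sparql (strict standard SPARQL).
    """
    main_class = "arq.arq" if extensions else "arq.sparql"
    command = ["java", "-cp", classpath, main_class, f"--query={query_file}"]
    if data_file is not None:
        # Optional local data file; a federated query may not need one.
        command.append(f"--data={data_file}")
    return command


# Hypothetical classpath and query file name; adjust for the local install.
print(" ".join(build_arq_command("lib/*", "federated.rq")))
print(" ".join(build_arq_command("lib/*", "federated.rq", extensions=False)))
```

The two printed command lines differ only in the main class name, which is the whole difference between sparql.bat and arq.bat. To actually run a query, pass the returned list to subprocess.run on a machine with Java and the ARQ jars on the classpath.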

(Thanks again, Jeremy!)
