<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>bobdc.blog</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/" />
    <link rel="self" type="application/atom+xml" href="http://www.snee.com/bobdc.blog/atom.xml" />
    <id>tag:www.snee.com,2008-10-31:/bobdc.blog/2</id>
    <updated>2010-03-09T23:50:12Z</updated>
    <subtitle>Bob DuCharme&apos;s weblog, mostly on technology for representing and linking information.</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 4.32-en</generator>

<entry>
    <title>The meaning of &quot;semantics&quot;</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2010/03/the-meaning-of-semantics.html" />
    <id>tag:www.snee.com,2010:/bobdc.blog//2.595</id>

    <published>2010-03-09T23:48:11Z</published>
    <updated>2010-03-09T23:50:12Z</updated>

    <summary>No pun intended.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
    <category term="semanticweb" label="semanticweb" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        No pun intended.
        <![CDATA[    <div id="id103296">

<a id="id103298" href="http://www.flickr.com/photos/julianbleecker/269633724/"><img id="id103303" src="http://farm1.static.flickr.com/79/269633724_459cd88a11.jpg" border="0" align="right" hspace="30px" vspace="30px" alt="Configured Scenario Semantics" width="280"/></a>

<p id="id103324">Dave McComb's book <a id="id103327" href="http://www.amazon.com/exec/obidos/ISBN=1558609172/bobducharmeA/ ">Semantics in Business Systems</a> recommended John Saeed's <a id="id103334" href="http://www.amazon.com/exec/obidos/ISBN=1405156392/bobducharmeA/ ">Semantics</a> as an "excellent introductory book on semantics in everyday life", so I found a cheap used copy and have been working my way through it. I'm sure that it's been used for both graduate and undergraduate courses, and it's not too difficult to follow so far. I especially like this part, which Saeed said he adapted from the work of Charles Morris:</p>
<blockquote id="id103349">
<p id="id103352">syntax: the formal relation of signs to each other;</p>
<p id="id103356">semantics: the relations of signs to the objects to which the signs are applicable;</p>
<p id="id103361">pragmatics: the relation of signs to interpreters.</p>
</blockquote>
<p id="id103367">He goes on to say that "the whole science of language, consisting of the three parts mentioned, is called semiotic", but I was more interested in the way he put semantics in the larger context.</p>

<p id="id103375">Printed and <a id="id103377" href="http://dictionary.reference.com/browse/semantics">dictionary.com</a> definitions of "semantics" typically come in pairs, with the first usually saying "the study of meaning" and the second more in line with Saeed's definition. The latter is sometimes identified as being specific to the fields of  linguistics or semiotics.</p>
<p id="id103390">I think that the linguistics/semiotics definition serves the semantic web better, because describing semantics as the relations of signs to the things they signify (and moving some of the "meaning" parts that take place in people's heads to the "pragmatics" category) helps us to focus on what the semantic web is best at: providing an infrastructure to identify which signs (IDs in the form of URIs) refer to which objects (resources) so that people can use this infrastructure to create applications that work across the web.</p>
<p id="id103403">Interpretation of the "meaning" of the signified resources is not necessarily a goal of these applications. While <a id="id103407" href="http://www.snee.com/bobdc.blog/2008/05/adding-semantics-to-make-data.html">OWL</a> can encode properties of concepts to let us do more reasoning with those concepts, attacking the feasibility of getting computers to Understand Meaning is a straw man argument that I'm tired of hearing from people who insist that the semantic web is an impractical idea. Standards and best practices that let applications track the relationship of identifiers to resources on a World Wide Web scale&#8212;who can argue with that?</p>

<p id="id103436"><div id="id103441" about="http://www.flickr.com/photos/julianbleecker/269633724/" style="font-size:8pt">(photo: <a id="id103450" rel="cc:attributionURL" href="http://www.flickr.com/photos/julianbleecker/">http://www.flickr.com/photos/julianbleecker/</a> / <a id="id103460" rel="license" href="http://creativecommons.org/licenses/by-nc-nd/2.0/">CC BY-NC-ND 2.0</a>)</div></p>



    </div>]]>
    </content>
</entry>

<entry>
    <title>Is SPIN the Schematron of RDF?</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2010/03/is-spin-the-schematron-of-rdf.html" />
    <id>tag:www.snee.com,2010:/bobdc.blog//2.591</id>

    <published>2010-03-01T23:56:45Z</published>
    <updated>2010-03-02T00:07:18Z</updated>

    <summary>Representing business rules using an implemented standard, then flagging violations in a machine-readable way.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
    <category term="rdf" label="rdf" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="schematron" label="schematron" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="spin" label="spin" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Representing business rules using an implemented standard, then flagging violations in a machine-readable way.
        <![CDATA[    <div id="id103297">

<blockquote id="id103300" class="pullquote" style="width: 190px; font: bold 1.333em/1.125em &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 1.5em 0 1.5em 1.5em !important; padding: 0.6em 5px !important; background: none !important; border: 3px double #ddd; border-width: 3px 0; text-align: center; float: right; "><strong id="id103311">Many complain about the potentially low quality of public semantic web data, but Fürber and Hepp are doing something about it.</strong></blockquote>

<p id="id103319"> Christian Fürber and Martin Hepp (the latter being the source of the increasingly popular <a id="id103325" href="http://www.heppnetz.de/projects/goodrelations/">GoodRelations</a> ontology) have published a paper titled "Using SPARQL and SPIN for Data Quality 
Management on the Semantic Web" (<a id="id103336" href="http://www.heppnetz.de/files/fuerber-hepp-sparql-spin-dqm.pdf">pdf</a>) for the 2010 <a id="id103343" href="http://bis.kie.ae.poznan.pl/13th_bis/">Business Information Systems</a> conference in Berlin. TopQuadrant's Holger Knublauch designed SPIN, or the <a id="id103352" href="http://www.spinrdf.org/">SPARQL Inferencing Notation</a>, as a SPARQL-based way to express constraints and inferencing rules on sets of triples, and Fürber and Hepp have taken a careful, structured look at how to apply it to business data.</p>
<p id="id103367">I knew that "data quality" was a specific discipline within IT, but I hadn't looked at it very closely. Their paper gives a nice overview of this area before moving on to describing their work. It also describes the value that a systematic approach to data quality can bring to semantic web applications, but I don't think anyone needs any convincing there; it's often the first issue people bring up when they hear about the very idea of Linked Data on the web.</p>
<p id="id103378">Or, to put it more bluntly, many complain about the potentially low quality of public semantic web data, but Fürber and Hepp are doing something about it. SPIN may have the potential to do for RDF data what <a id="id103388" href="http://xml.ascc.net/resource/schematron/schematron.html">Schematron</a> has done for XML for years now: providing a technique, based entirely on an existing, well-implemented W3C standard, for describing business rules about data and then validating data against those rules. (I see that William Vambenepe <a id="id103399" href="http://stage.vambenepe.com/archives/496">had some thoughts</a> on the comparison early last year.)</p>
<p id="id103408">I'm looking forward to Fürber and Hepp's future work described in their paper and to seeing how others apply it in their applications.</p>


    </div>
]]>
    </content>
</entry>

<entry>
    <title>Using the ARQ SPARQL processor from the command line</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2010/01/using-the-arq-sparql-processor.html" />
    <id>tag:www.snee.com,2010:/bobdc.blog//2.582</id>

    <published>2010-01-21T15:38:55Z</published>
    <updated>2010-01-21T15:40:54Z</updated>

    <summary>With the Jena extensions.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="SPARQL" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="sparql" label="sparql" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        With the Jena extensions.
        <![CDATA[
    <div id="id103311">

<p id="id103314">I recently described how to execute <a id="id103317" href="http://www.snee.com/bobdc.blog/2010/01/federated-sparql-queries.html">Federated SPARQL queries</a> that use Jena extensions that we'll hopefully see added to the SPARQL 1.1 standard. I showed a sample query and suggested that you try it at the <a id="id103326" href="http://www.sparql.org/query.html">sparql.org RDF Query Demo page</a>.</p>
<p id="id103335">For local, command-line use of SPARQL, I've used the Jena <a id="id103339" href="http://jena.sourceforge.net/ARQ/">ARQ</a> query engine for years, but my sample federated query didn't work with it, and now I know why: the sparql.bat file that comes with the distribution invokes the processor in a strictly standards-compliant mode without the extensions enabled. I thought I'd have to write and compile some Java code to use the extensions, but my co-worker Jeremy Carroll pointed out that the sparql.bat file in ARQ's bat subdirectory calls the arq.sparql library, like this,</p>
<pre id="id103354">
java -cp %CP% arq.sparql %*
</pre>
<p id="id103359">and that calling the arq.arq library instead enables the extensions. Then, I noticed the arq.bat file in the same directory as sparql.bat, and this is exactly what it does. There are more batch files in there, and a web search on their names led me to an <a id="id103366" href="http://jena.sourceforge.net/ARQ/cmds.html">ARQ - Command Line Applications</a> documentation page, which will be handy. </p>

<p id="id103376">Using arq.bat instead of sparql.bat, the sample federated query works as written (tested with ARQ 2.8.2), and so do LET assignment and <a id="id103378" href="http://jena.sourceforge.net/ARQ/library-function.html">extension functions</a>, making it possible to use ARQ in real semantic web application development with no need to do Java coding around the Jena API.</p>
<p id="id103388">(Thanks again, Jeremy!)</p>
</div>]]>
    </content>
</entry>

<entry>
    <title>Live stock ticker data in RDF</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2010/01/live-stock-ticker-data-in-rdf.html" />
    <id>tag:www.snee.com,2010:/bobdc.blog//2.580</id>

    <published>2010-01-12T16:19:22Z</published>
    <updated>2010-01-12T16:21:53Z</updated>

    <summary>Well, on a 20-minute delay.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="RDF" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="rdfwebservice" label="rdf webservice" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Well, on a 20-minute delay.
        <![CDATA[    <div id="id103294">

<p id="id103297">I've played with finance.yahoo.com's feed of CSV stock ticker data <a id="id103299" href="http://www.snee.com/bobdc.blog/2008/10/using-the-twitter-api-to-alert.html">before</a> and recently had an idea that was so simple that I'm surprised that no one's done it before: why not write a script that passes along a request for this data but converts the result to RDF before returning it? So I did.</p>

<blockquote id="id103312" class="pullquote" style="width: 190px; font: bold 1.333em/1.125em &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 1.5em 0 1.5em 1.5em !important; padding: 0.6em 5px !important; background: none !important; border: 3px double #ddd; border-width: 3px 0; text-align: center; float: right; "><strong id="id103323">I suppose it might count as a semantic web service.</strong></blockquote>

<p id="id103328">A URL like <a id="id103330" href="http://www.rdfdata.org/cgi/stockquotes.cgi?symbols=BUD,IBM,SNE">http://www.rdfdata.org/cgi/stockquotes.cgi?symbols=BUD,IBM,SNE</a> asks for recent ticker information about the stock symbols listed in the comma-separated value list. The stockquotes.cgi script adds the parameters to the appropriate stub to create a URL like <a id="id103342" href="http://download.finance.yahoo.com/d/quotes.csv?f=sl1d1t1ohgv&amp;e=.csv&amp;s=BUD,IBM,SNE">http://download.finance.yahoo.com/d/quotes.csv?f=sl1d1t1ohgv&amp;e=.csv&amp;s=BUD,IBM,SNE</a>, uses this URL to retrieve the CSV results, converts them to RDF/XML, and sends that back to the original requester with a MIME type of application/rdf+xml. The whole script, with white space and comments, wasn't even 100 lines. You can click the first link in this paragraph to see an example of it in action. </p>
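<p>The conversion step is simple enough to sketch in a few lines of Python. This is not the stockquotes.cgi script itself, just a minimal illustration under some assumptions: the http://example.org/stockquotes# namespace and the property names are made up for the example, the column order is my reading of the f=sl1d1t1ohgv codes (symbol, last price, date, time, open, high, low, volume), and the sample row holds invented values rather than a real quote.</p>

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace and property names, made up for this sketch.
SQ_NS = "http://example.org/stockquotes#"
# Assumed column order for the f=sl1d1t1ohgv parameter.
FIELDS = ["symbol", "lastPrice", "date", "time", "open", "high", "low", "volume"]


def csv_to_rdfxml(csv_text):
    """Convert CSV quote rows to an RDF/XML string, one resource per row."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("sq", SQ_NS)
    root = ET.Element("{%s}RDF" % RDF_NS)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        quote = ET.SubElement(root, "{%s}Quote" % SQ_NS)
        # Identify each quote by its ticker symbol inside the made-up namespace.
        quote.set("{%s}about" % RDF_NS, SQ_NS + row[0])
        for name, value in zip(FIELDS, row):
            ET.SubElement(quote, "{%s}%s" % (SQ_NS, name)).text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample row in the assumed column order, not real market data.
sample = '"IBM",130.85,"1/12/2010","11:01am",129.03,131.33,129.00,4930201\n'
print(csv_to_rdfxml(sample))
```

<p>A real service would also fetch the CSV over HTTP and send the result back with the application/rdf+xml MIME type; both are left out here to keep the sketch self-contained.</p>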

<p id="id103359">I haven't done anything with the rdfdata.org domain name in a while, so I thought that would be a nice place for this. I've already used this little web service in a work-related demo that combines and cross-references RDF data from multiple sources, because after all, that's one of the things that RDF is so good at. </p>
<p id="id103368">Is this a "semantic web service"? All it does is convert the data returned by a Yahoo feed into a different syntax and pass it along. I did throw together a little ontology to name the properties, but it doesn't add a lot of semantics. On the other hand, my script's output syntax is based on a semantic web standard, and it makes the data easier to use in semantic web applications, so I suppose it might count as a semantic web service. </p>
<p id="id103379">I hope this is useful to others, and I hope that more people look for opportunities to convert live feeds of useful data in simple formats into live feeds of RDF. </p>

    </div>
]]>
    </content>
</entry>

<entry>
    <title>Federated SPARQL queries</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2010/01/federated-sparql-queries.html" />
    <id>tag:www.snee.com,2010:/bobdc.blog//2.577</id>

    <published>2010-01-04T18:07:44Z</published>
    <updated>2010-01-04T18:15:36Z</updated>

    <summary>Using a Jena extension.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="SPARQL" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="sparql" label="sparql" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Using a Jena extension.
        <![CDATA[    <div id="id103296">




<p id="id103313">Much of the promise of RDF and Linked Data is the ease of pulling data from multiple sources and combining it. I recently discovered the SERVICE extension that Jena adds to SPARQL, letting you send subqueries off to multiple SPARQL endpoints and then combine the results. Because a given SPARQL endpoint may be an interface to a triplestore or  a  relational data store or something else, the ability to query several endpoints with one query is very nice.</p>

<blockquote id="id103299" class="pullquote" style="width: 190px; font: bold 1.333em/1.125em &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 1.5em 0 1.5em 1.5em !important; padding: 0.6em 5px !important; background: none !important; border: 3px double #ddd; border-width: 3px 0; text-align: center; float: right; "><strong id="id103309">The ability to query several endpoints with one query is very nice.</strong></blockquote>

<p id="id103325">The Jena project's <a id="id103328" href="http://jena.sourceforge.net/ARQ/service.html">ARQ - Basic Federated SPARQL Query</a> describes the use of this keyword. Before I start quoting from that page, I wanted to jump right in with an example that worked for me to pull birthday and spouse information about Arnold Schwarzenegger from <a id="id103339" href="http://dbpedia.org">DBpedia</a> and a list of his movies and their release dates from <a id="id103347" href="http://www.linkedmdb.org/">Linked Movie Database</a> in one query:</p>
<pre id="id103356">
PREFIX imdb: &lt;http://data.linkedmdb.org/resource/movie/&gt;
PREFIX dcterms: &lt;http://purl.org/dc/terms/&gt;
PREFIX dbpo: &lt;http://dbpedia.org/ontology/&gt;
PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;

SELECT ?birthDate ?spouseName ?movieTitle ?movieDate {
  { SERVICE &lt;http://dbpedia.org/sparql&gt;
    { SELECT ?birthDate ?spouseName WHERE {
        ?actor rdfs:label "Arnold Schwarzenegger"@en ;
               dbpo:birthDate ?birthDate ;
               dbpo:spouse ?spouseURI .
        ?spouseURI rdfs:label ?spouseName .
        FILTER ( lang(?spouseName) = "en" )
      }
    }
  }
  { SERVICE &lt;http://data.linkedmdb.org/sparql&gt;
    { SELECT ?actor ?movieTitle ?movieDate WHERE {
      ?actor imdb:actor_name "Arnold Schwarzenegger".
      ?movie imdb:actor ?actor ;
             dcterms:title ?movieTitle ;
             dcterms:date ?movieDate .
      }
    }
  }
}
</pre>

<p id="id103386">You can run this query yourself at the <a id="id103389" href="http://www.sparql.org/query.html">sparql.org RDF Query Demo page</a>.</p>
<p id="id103398">Before you start modeling your own queries on this, it's worth reading the Jena documentation page mentioned above, especially the "Performance Considerations" part: </p>
<blockquote id="id103401">
This feature is a basic building block to allow remote access in the middle of a query, not a general solution to the issues in distributed query evaluation. The algebra operation is executed without regard to how selective the pattern is. So the order of the query will affect the speed of execution. Because it involves HTTP operations, asking the query in the right order matters a lot. Don't ask for the whole of a bookstore just to find book whose title comes from a local RDF file - ask the bookshop a query with the title already bound from earlier in the query.
</blockquote>
<p id="id103424">As an example, both subqueries above specifically ask for information about Schwarzenegger instead of trying to scan the complete databases looking for matches.</p>
<p id="id103430">Two parts of this trick are non-standard SPARQL, but may become part of SPARQL 1.1: <a id="id103434" href="http://www.slideshare.net/LeeFeigenbaum/sparql2-status/8">subqueries</a> and the <a id="id103442" href="http://www.slideshare.net/LeeFeigenbaum/sparql2-status/15">SERVICE keyword</a>. As the latter Lee Feigenbaum slide points out, the SPARQL Working Group is using ARQ's SERVICE keyword as a starting point in thinking about how a query can target multiple endpoints.  </p>

<p id="id103454">My query above of the two different SPARQL endpoints also works from within TopQuadrant's TopBraid Suite of products, so I'm sure I'll be using this on work-related projects more and more.</p>



    </div>
]]>
    </content>
</entry>

<entry>
    <title>RDFS: The primary document</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/11/rdfs-the-primary-document.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.566</id>

    <published>2009-11-29T17:10:41Z</published>
    <updated>2009-11-29T17:12:14Z</updated>

    <summary>Shorter and more interesting than I remember.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="RDF/OWL" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="rdfs" label="rdfs" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Shorter and more interesting than I remember.
        <![CDATA[    <div id="id103294">

<p id="id103297">About three years ago I <a id="id103299" href="http://www.snee.com/bobdc.blog/2006/09/rdfs-without-rdfowl.html">wondered</a> if RDF Schema had become merely a layer of OWL or if anyone used RDFS by itself without OWL. My theory was that because tools such as TopBraid Composer, Protege, and SWOOP that let you design RDFS vocabularies also let you assign OWL properties to your classes, people used those because they were there, and we ended up with few pure RDFS vocabularies. </p>

<blockquote id="id103315" class="pullquote" style="width: 190px; font: bold 1.333em/1.125em &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 1.5em 0 1.5em 1.5em !important; padding: 0.6em 5px !important; background: none !important; border: 3px double #ddd; border-width: 3px 0; text-align: center; float: right; "><strong id="id103325">I heartily recommend that you read the first 11 or 18 pages of the RDFS spec and skim the rest.</strong></blockquote>

<p id="id103331">Lately, though, it seems that a lot of people who had been using the terms vocabulary/taxonomy/ontology interchangeably have started to understand better when OWL is too much. As they review the issues surrounding the choice between OWL 1 Lite, DL, and Full, around OWL 2 EL, QL, and RL, and the implications of open vs. closed world assumptions, more attitudes can be summarized as "sounds interesting, but pretty complicated; maybe later." This makes good sense for people whose main interest is defining a standardized vocabulary. </p>
<p id="id103344">SKOS looks pretty good to more and more of them, but here I want to focus on RDFS. As I thought more about it recently, I realized that I had never read the <a id="id103349" href="http://www.w3.org/TR/2004/REC-rdf-schema-20040210/">RDF Schema Recommendation</a>, so about five years late I sat down to do so. It's nice to remember, when you're wondering about the true meaning of some term or the relationship between some concepts, that a spec is available where you can just read the official explanation of what's what. (Of course, <a id="id103361" href="http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/">some specs</a> are less enlightening than others when you're confused about what they describe.)</p>
<p id="id103371">I found the RDFS Recommendation to be an interesting mix of simple things that are commonly used and complex things that are rarely used. When I printed it out, it was 27 pages, but the summaries and references start on page 18, and the appropriately titled <a id="id103378" href="http://www.w3.org/TR/2004/REC-rdf-schema-20040210/#ch_othervocab">Other Vocabulary</a> section on pages 12 through 17 describes the rarely used features. Let's look at some interesting parts that lead up to that. From the Abstract:</p>
<blockquote id="id103388">
This specification describes how to use RDF to describe RDF vocabularies.
</blockquote>
<p id="id103394">Maybe that's obvious to some, but it's reassuring when confusion over vocabularies, taxonomies, and ontologies comes up. From the introduction:</p>
<blockquote id="id103398">
The Resource Description Framework (RDF) is a general-purpose language for representing information in the Web.
</blockquote>
<p id="id103405">As opposed to being a data model. (It's certainly not a syntax!)</p>
<p id="id103410">Why do we need this schema language? </p>
<blockquote id="id103415">
<p id="id103417">RDF properties may be thought of as attributes of resources and in this sense correspond to traditional attribute-value pairs. RDF properties also represent relationships between resources.</p>
<p id="id103424">RDF however, provides no mechanisms for describing these properties, nor does it provide any mechanisms for describing the relationships between these properties and other resources. That is the role of the RDF vocabulary description language, RDF Schema. RDF Schema defines classes and properties that may be used to describe classes, properties and other resources.</p>
</blockquote>
<p id="id103441">The following is interesting for two reasons: first, because it describes a member of a class as an "instance," reminding me that "individual" is definitely an OWL term that has no particular role in RDFS. (A little later the document tells us that "the members of a class are known as <em id="id103449">instances</em> [their emphasis] of the class".) It's also interesting as a nice summary of an issue that often confuses people with an object-oriented background.</p>
<blockquote id="id103457">
<p id="id103459">The RDF vocabulary description language class and property system is similar to the type systems of object-oriented programming languages such as Java. RDF differs from many such systems in that instead of defining a class in terms of the properties its instances may have, the RDF vocabulary description language describes properties in terms of the classes of resource to which they apply. This is the role of the domain and range mechanisms described in this specification. For example, we could define the <code id="id103470">eg:author</code>
property to have a domain of <code id="id103475">eg:Document</code> and a range of
<code id="id103480">eg:Person</code>, whereas a classical object oriented system might
typically define a class <code id="id103486">eg:Book</code> with an attribute called
<code id="id103491">eg:author</code> of type <code id="id103496">eg:Person</code>. Using the RDF approach, it is easy for others to subsequently define additional properties with a domain of eg:<code id="id103502">Document</code> or a range of <code id="id103506">eg:Person</code>.
This can be done without the need to re-define the original description of
these classes. One benefit of the RDF property-centric approach is that it
allows anyone to extend the description of existing resources, one of the
architectural principles of the Web.</p>
</blockquote>
<p id="id103520">The role and relationship of the <code id="id103524">rdfs:domain</code> and <code id="id103528">rdfs:range</code> properties have confused me and <a id="id103533" href="http://twitter.com/JeniT/status/5272938272">many others</a>. The spec's description of their use is rather technical (nothing wrong with that; it's a spec) but there's this nice passage after that: </p>
<blockquote id="id103543">
<p id="id103546">...an RDF vocabulary might describe limitations on the types of values that are appropriate for some property, or on the classes to which it makes sense to ascribe such properties.</p>
<p id="id103553">The RDF Vocabulary Description language provides a mechanism for describing this information, but does not say whether or how an application should use it...</p>
<p id="id103559">For example, data checking tools might use this to help discover errors in some data set, an interactive editor might suggest appropriate values, and a reasoning application might use it to infer additional information from instance data.</p>
<p id="id103567">RDF vocabularies can describe relationships between vocabulary items from multiple independently developed vocabularies. Since URI-References are used to identify classes and properties in the Web, it is possible to create new properties that have a <code id="id103574">domain</code> or <code id="id103578">range</code> whose value is a class defined in another namespace.</p>
</blockquote>

<p id="id103585">I think that makes some basic issues clearer. </p>
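<p>To make the "reasoning application" case concrete, here is a small Python sketch of what such an application might do with <code>rdfs:domain</code> and <code>rdfs:range</code>: when a property has a domain, the subject of any triple that uses it can be inferred to be an instance of that class, and likewise for the range and the object. Triples are plain tuples of prefixed names. The eg:author, eg:Document, and eg:Person names come from the spec's example; eg:hamlet and eg:shakespeare are invented for illustration.</p>

```python
RDF_TYPE = "rdf:type"

# Schema triples taken from the spec's eg:author example.
schema = [
    ("eg:author", "rdfs:domain", "eg:Document"),
    ("eg:author", "rdfs:range", "eg:Person"),
]

# Hypothetical instance data, invented for this sketch.
data = [("eg:hamlet", "eg:author", "eg:shakespeare")]


def infer_types(schema, data):
    """Return the rdf:type triples implied by rdfs:domain and rdfs:range."""
    domains = {}
    ranges = {}
    for s, p, o in schema:
        if p == "rdfs:domain":
            domains.setdefault(s, set()).add(o)
        elif p == "rdfs:range":
            ranges.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in data:
        # The subject is an instance of every class in the property's domain.
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
        # The object is an instance of every class in the property's range.
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))
    return inferred


for triple in sorted(infer_types(schema, data)):
    print(triple)
```

<p>Note that nothing in the sketch rejects a triple. Used this way, domain and range add information about the resources involved instead of restricting what you can say, which is the part that tends to surprise people coming from object-oriented languages.</p>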

<p id="id103590">I have mixed feelings about the "Other vocabulary" section on features that, from what I've seen, never got much traction: container classes and properties, RDF collections, and reification. On the one hand, usage of these can appear so complex that I think it scared a lot of people away from RDF in the early days, obscuring the simplicity of the triple as the fundamental concept of RDF. On the other hand, as I read about these options now, they looked like they could be fun to play with, in a geeky sort of way. (I also realize that the whole concept of reification&#8212;the ability to refer to triples as resources themselves so that properties can be assigned to them&#8212;is an important bit of RDF foundational architecture for other good RDF-related ideas to build on.)</p>
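<p>For anyone who hasn't seen it written out, reification describes one triple with four more: a resource typed as <code>rdf:Statement</code>, plus <code>rdf:subject</code>, <code>rdf:predicate</code>, and <code>rdf:object</code> properties pointing at the parts of the original. A minimal Python sketch, again using tuples for triples; the eg:stmt1 identifier, the eg:assertedBy property, and the instance names are all invented for the example.</p>

```python
def reify(triple, statement_id):
    """Return the four reification triples that describe the given triple."""
    s, p, o = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", s),
        (statement_id, "rdf:predicate", p),
        (statement_id, "rdf:object", o),
    ]


# Hypothetical triple and statement identifier.
original = ("eg:hamlet", "eg:author", "eg:shakespeare")
triples = reify(original, "eg:stmt1")
# Once the statement has a name, properties can be assigned to it.
triples.append(("eg:stmt1", "eg:assertedBy", "eg:bob"))

for t in triples:
    print(t)
```

<p>Four extra triples for every statement you want to talk about is a fair illustration of why this looked intimidating next to the plain triple.</p>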

<p id="id103896">So, whether you're new to the whole idea of a standardized definition of a vocabulary or you've been using OWL and RDFS together for years, I heartily recommend that you read the first 11 or 18 pages of the RDFS spec and skim the rest, which includes some handy reference material.</p>

    </div>]]>
    </content>
</entry>

<entry>
    <title>Converting Word documents to DITA</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/11/converting-word-documents-to-d.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.565</id>

    <published>2009-11-20T14:37:31Z</published>
    <updated>2009-11-20T14:40:34Z</updated>

    <summary>Via OpenOffice and DocBook.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="DITA" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="worddocbookdita" label="word docbook dita" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Via OpenOffice and DocBook.
        <![CDATA[    <div id="id103296">

<p id="id103318">I recently had to convert a few Microsoft Word documents to DITA XML and thought it would be worth sharing my notes on the steps I took. To summarize, I opened each Word document with OpenOffice 3.1, saved it as a DocBook XML document, and then converted that to DITA with the XSLT stylesheet from a DITA plugin that I found. Images were a little more trouble, but at least I was able to eventually automate that part as well, dispelling my worries that I'd have to add all the image references to the DITA files by hand. </p>

<div id="id103330">
  <h2 id="id103333">Word to DocBook</h2>

<img id="id103300" src="http://www.snee.com/bobdc.blog/img/word2dita.jpg" border="0" align="right" hspace="30px" vspace="30px" alt="Word and DITA logos"/>

<p id="id103337">When you open a Word file with OpenOffice and do a Save As DocBook, it assumes that the document uses default Word styles, because that's how OpenOffice knows what's what in the document's structure. The conversion does an impressive job of adding wrappers in the appropriate places considering that it's using an XSLT 1.0 stylesheet. This kind of stylesheet would be <a id="id103346" href="http://www.xml.com/pub/a/2003/11/05/tr.html">much easier to write</a> with XSLT 2, but that reduces the choice of XSLT processors that you can use. It doesn't matter much from the user's perspective, because it's all under the covers anyway. The key thing is the convenience of creating the DocBook version from OpenOffice with a simple Save As.</p>

<p id="id103359">On the downside, some nested bulleted lists in the original content did not show up in the DocBook version. I found this out after converting the eventual DITA version of one of these documents to a PDF file with the DITA Open Toolkit and skimming through the original Word file and the new PDF to do a block-by-block comparison. (I strongly recommend this QA step if you're doing this conversion with important content.) Many bulleted lists got converted to numbered lists as well, although I'm not sure if this was the fault of the Word to DocBook conversion or of a later stage described below. Another small issue is that when the original had more than one space character in a row, all but one got converted to hard (non-breaking) spaces to maintain the spacing in XML. I just deleted all the hard spaces from the DITA version with a global replace, but you may want to keep them, depending on how the documents use them. </p>
<p id="id103378">Typical Word users add space between paragraphs by inserting an extra carriage return instead of adjusting the styles included with the document, so your output from this conversion step might have a lot of empty <code id="id103385">para</code> elements. You can delete these with a simple XSLT stylesheet or even a global replace in a text editor.</p>
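<p>The same cleanup can also be sketched in a few lines of Python. This is only an illustration, not part of the conversion described here: it assumes the DocBook output uses un-namespaced <code>para</code> elements (as DocBook 4.1.2 does), the function names <code>is_empty_para</code> and <code>drop_empty_paras</code> are made up for the example, and the sample document is built in code rather than read from a file.</p>
<pre>
```python
import xml.etree.ElementTree as ET


def is_empty_para(elem):
    """True for a para element with no children and no non-whitespace text."""
    return (
        elem.tag == "para"
        and len(elem) == 0
        and not (elem.text or "").strip()
    )


def drop_empty_paras(root):
    """Remove empty para elements anywhere below root, in place.

    Returns the number of elements removed. The tail text of a removed
    element (normally just whitespace between blocks) is dropped with it.
    """
    removed = 0
    # Snapshot every element first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if is_empty_para(child):
                parent.remove(child)
                removed += 1
    return removed


# Build a tiny stand-in for a converted document (hypothetical content).
article = ET.Element("article")
ET.SubElement(article, "para").text = "First paragraph."
ET.SubElement(article, "para")                 # empty: an extra carriage return
ET.SubElement(article, "para").text = "  "     # whitespace only
section = ET.SubElement(article, "sect1")
ET.SubElement(section, "para")                 # empty, nested one level down
ET.SubElement(section, "para").text = "Second paragraph."

removed_count = drop_empty_paras(article)
remaining = [p.text for p in article.iter("para")]
print(removed_count, remaining)
```
</pre>
<p>Whitespace-only paragraphs are treated as empty here, which is a design choice; if a document uses them deliberately, the test in <code>is_empty_para</code> would need to be stricter.</p>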
</div>

<div id="id103391">
  <h2 id="id103394">Adding the images</h2>

<p id="id103398">One annoying detail was that the DocBook files created by OpenOffice lack references to the images. When you save a Word file as an OpenOffice native odt (that is, zip) file, you can see that the content.xml file in there has simple, straightforward references to image files that are also in the zip file. The references look like this:</p>
<pre id="id103407">
&lt;draw:frame draw:style-name="fr1" draw:name="graphics63" 
  text:anchor-type="as-char" svg:width="6.8972in" svg:height="2.6264in" 
  draw:z-index="49"&gt;&lt;draw:image 
  <strong id="id103417">xlink:href="Pictures/10000000000003430000013EC16739CA.png"</strong>
  xlink:type="simple" xlink:show="embed" 
  xlink:actuate="onLoad"/&gt;&lt;/draw:frame&gt;
</pre>
<p id="id103424">(I had created the original images in the Word file by pasting them from somewhere else, so the conversion of each to a standalone png file was a nice bonus.) OpenOffice's Save as DocBook feature doesn't save these image references; the DocBook 4.1.2 version of the above that it creates looks like this:</p>

<pre id="id103433">
&lt;inlinegraphic fileref="embedded:graphics63" 
    width="6.8972inch" depth="2.6264inch"/&gt;
</pre>

<p id="id103439">(Note that DocBook 5 <a id="id103442" href="http://www.docbook.org/tdg/en/html/inlinegraphic.html">deprecates</a> the <code id="id103448">inlinegraphic</code> element.) After no luck tinkering with the sofftodocbookheadings.xsl stylesheet included with OpenOffice to create the DocBook file, I replaced its contents  with an identity transformation to see what it was using as input. It turned out that it wasn't using the original content.xml file mentioned above but some intermediary file that had replaced the <code id="id103458">xlink:href</code> value above with a child element that stored the actual content of the image, like this: </p>
<pre id="id103465">
&lt;draw:image draw:style-name="fr1" draw:name="graphics63"
            text:anchor-type="as-char" svg:width="6.8972inch"
            svg:height="2.6264inch" draw:z-index="49"&gt;
  <strong id="id103474">&lt;office:binary-data&gt;iVBORw0KGgoAAAANSUhEUgAAA0MAAAE+CAIAAADAgVy 
   &lt;!-- lots more data here--&gt;&lt;/office:binary-data&gt;
&lt;/draw:image&gt;</strong>
</pre>
<p id="id103484">At least the <code id="id103486">draw:name</code> value of the <code id="id103491">draw:image</code> element's parent <code id="id103495">draw:frame</code> element gets preserved in the DocBook output as the value of the <code id="id103499">fileref</code> attribute, so instead of digging into OpenOffice's architecture to see what was preparing the input for sofftodocbookheadings.xsl and trying to fix that, I wrote a <a id="id103503" href="http://www.snee.com/bobdc.blog/files/getImageNameData.xsl">getImageNameData.xsl</a> stylesheet to pull the {<code id="id103511">draw:name</code>, <code id="id103516">xlink:href</code>} pairings from the original content.xml file. Then, I wrote an <a id="id103520" href="http://www.snee.com/bobdc.blog/files/addImageRefs.xsl">addImageRefs.xsl</a> stylesheet to look up the image filenames in the getImageNameData.xsl output and insert them into a new copy of the DocBook file.</p>
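<p>The two-step lookup can be sketched in Python as well. To be clear, this is not what the two linked stylesheets contain; it is a rough equivalent under some assumptions: the pairing is read from <code>draw:frame</code> elements and their <code>draw:image</code> children in the ODF drawing and XLink namespaces, the fix is simply to overwrite each <code>embedded:</code> value of <code>fileref</code> with the real filename, and the function names are invented for the example. The input trees are built in code to keep the sketch self-contained.</p>
<pre>
```python
import xml.etree.ElementTree as ET

DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK = "http://www.w3.org/1999/xlink"
EMBEDDED = "embedded:"


def image_names(content_root):
    """Map each draw:frame name to the xlink:href of the image inside it."""
    pairs = {}
    for frame in content_root.iter("{%s}frame" % DRAW):
        name = frame.get("{%s}name" % DRAW)
        image = frame.find("{%s}image" % DRAW)
        if name is not None and image is not None:
            href = image.get("{%s}href" % XLINK)
            if href is not None:
                pairs[name] = href
    return pairs


def add_image_refs(docbook_root, pairs):
    """Point embedded: filerefs at real image files; return how many changed."""
    changed = 0
    for graphic in docbook_root.iter("inlinegraphic"):
        fileref = graphic.get("fileref", "")
        if fileref.startswith(EMBEDDED):
            name = fileref[len(EMBEDDED):]
            if name in pairs:
                graphic.set("fileref", pairs[name])
                changed += 1
    return changed


# Stand-in for content.xml: one frame holding one image reference.
content = ET.Element("content")
frame = ET.SubElement(
    content, "{%s}frame" % DRAW, {"{%s}name" % DRAW: "graphics63"}
)
ET.SubElement(
    frame,
    "{%s}image" % DRAW,
    {"{%s}href" % XLINK: "Pictures/10000000000003430000013EC16739CA.png"},
)

# Stand-in for the DocBook file that OpenOffice saved.
docbook = ET.Element("article")
para = ET.SubElement(docbook, "para")
ET.SubElement(para, "inlinegraphic", {"fileref": "embedded:graphics63"})

pairs = image_names(content)
changed = add_image_refs(docbook, pairs)
new_fileref = docbook.find(".//inlinegraphic").get("fileref")
print(changed, new_fileref)
```
</pre>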

</div>

<div id="id103532">
  <h2 id="id103535">DocBook to DITA</h2>

<p id="id103539">Erik Hennum describes a docbook2dita plugin for the DITA Open Toolkit in <a id="id103543" href="http://markmail.org/message/gd4r4elqmmmqcb2w">this posting</a> on a DocBook list. My first attempt to use it from within the DITA Open Toolkit resulted in the errors discussed in a DITA group thread that ends with <a id="id103553" href="http://tech.groups.yahoo.com/group/dita-users/message/14620">this posting</a> from Mark Peters, who came up with a very simple solution: instead of running the conversion as a plugin, just call the XSLT stylesheet included with the plugin directly and tell it where your input is and where the output should go. The basic form of the command line that he shows worked for me.</p>
</div>

<div id="id103568">
  <h2 id="id103570">Testing it</h2>
<p id="id103574">The first test to pass was whether the result was valid to a DITA DTD, and that went fine. The second test was the big one: whether the HTML and PDF created from the document by the DITA Open Toolkit looked right. In general it did, except for the issues described above, which showed that a block-by-block comparison of each PDF with the original Word file is worth the trouble. If I had to do a large number of these conversions I'd dig deeper into the nested bulleted list and bulleted/numbered list issues in the hopes of reducing the need for this final manual step.</p>
<p id="id103588">So far, though, the automation steps that I found or put together are definitely saving me tons of potential manual work. I only had to do this to a few documents, so I didn't mind executing each step one at a time, but if you want to use OpenOffice to convert a large number of documents, I wrote something for XML.com called <a id="id103596" href="http://www.xml.com/pub/a/2006/01/11/from-microsoft-to-openoffice.html">Moving to OpenOffice: Batch Converting Legacy Documents</a> a few years ago that should help.</p>

</div>

    </div>
]]>
    </content>
</entry>

<entry>
    <title>Simple semi-structured data entry</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/11/simple-semi-structure-data-ent.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.553</id>

    <published>2009-11-12T01:48:55Z</published>
    <updated>2009-11-12T01:55:56Z</updated>

    <summary>With RDF.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="RDF" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="rdf" label="rdf" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        With RDF.
        <![CDATA[    <div id="id103299">

<p id="id103302">When most people want to take notes on a collection of things, and they know that the notes will have some structure but they're not sure about the nature of that structure just yet, they use a spreadsheet. For each thing that they take notes on, they add a new row; for each attribute of the things under review, they add a column. From an investment banker comparing potential investments to a scout leader planning a camping trip, the grid makes it easy for you to compare similar attributes of different things without forcing you to specify all of your attributes before starting your data entry like a more serious database application would.</p>
<p id="id103317">In theory, RDF is ideal for this, because you can assign any attribute name/value pair to any resource that you can identify with no requirement to plan it all in advance, but in practice, it's rarely as easy as pouring names and numbers into a spreadsheet. I've often thought that it would be fun to build a freeform database program that lets people do data entry and make up new fields as they go along, all with RDF underneath. I even wrote some Python code for this a few years ago, but never followed through. Since joining TopQuadrant, I've wondered about assembling something like this with the company's application development tools, but then I realized that the <a id="id103331" href="http://www.topquadrant.com/products/TB_Composer.html#free">Free Edition</a> of TopBraid Composer pretty much already does this.</p>
<p id="id103341">Here's a use case that's happened to most people in the modern workforce: you're told that you'll be joining a particular project, and to get you started someone emails a zip file of relevant files for you to review. For my notes on these files, I might create a text file or a spreadsheet, but I'd probably assemble an XML file where I made up element names as I went along. These elements would track the filename, document title, author, age, comments, and probably some project-specific fields. When the big picture started coming into focus, I'd write a little XSLT to convert this XML to presentable HTML to show to others if necessary.</p>

<p id="id103355">A key reason that this would be easy for me is that the Emacs <a id="id103359" href="http://www.thaiopensource.com/nxml-mode/">nxml mode</a> automates much of the work of entering tags and keeping everything well-formed. How would doing it in RDF be better? I could do the same steps as above using <a id="id103369" href="http://www.xml.com/pub/a/2002/10/30/rdf-friendly.html">RDF-Friendly XML</a> and nxml's excellent  handling of RDF/XML, but I'd rather use a form-based interface instead of Emacs. This is where the free edition of TopBraid Composer comes in.</p>
<p id="id103377">The first step is creating an RDF data file with all the easily available file metadata: the name, size, and last modification date for each file. I wrote a simple perl script called <a id="id103383" href="http://www.snee.com/bobdc.blog/files/dir2rdf-pl.txt">dir2rdf.pl</a> to do this; it's simple because it declares a File class and all the properties for that class in the namespace declared for the file. (I also created a slightly more complex perl script called <a id="id103394" href="http://www.snee.com/bobdc.blog/files/dir2nfordf-pl.txt">dir2nfordf.pl</a> which does the same thing but uses existing classes and properties from the <a id="id103402" href="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/">NEPOMUK File Ontology</a>. It's more complex because this ontology has properties based on properties from other vocabularies such as Dublin Core, so editing data with this ontology means pulling in a few layers of other ones.) </p>
<p id="id103411">When you pipe the result of the Windows <code id="id103414">dir</code> command into the simpler perl script, it outputs the property and class definitions for the files and an entry like this for each file: </p>
<pre id="id103419">
  &lt;File rdf:ID='file11' sd:lastModified='2009-10-30T17:05:00'
        sd:fileName='teams.csv' sd:fileSize='164' rdfs:comment=''/&gt;
</pre>
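<p>For readers who would rather not read Perl, here is a rough Python sketch of how entries like the one above could be generated. It is not a port of dir2rdf.pl: it starts from a list of (name, size, modification time) values instead of parsing <code>dir</code> output, it leaves out the class and property definitions, the namespace URI bound to <code>SD</code> is a placeholder, the function name <code>file_entries</code> is made up, and the index.html values are invented sample data.</p>
<pre>
```python
import xml.etree.ElementTree as ET
from datetime import datetime

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SD = "http://example.com/ns/files#"  # placeholder, not the script's namespace


def file_entries(files):
    """Build an rdf:RDF element with one File element per (name, size, mtime)."""
    root = ET.Element("{%s}RDF" % RDF)
    for number, (name, size, modified) in enumerate(files, start=1):
        ET.SubElement(root, "{%s}File" % SD, {
            "{%s}ID" % RDF: "file%d" % number,
            "{%s}fileName" % SD: name,
            "{%s}fileSize" % SD: str(size),
            "{%s}lastModified" % SD: modified.isoformat(),
            "{%s}comment" % RDFS: "",  # left blank for notes added later
        })
    return root


# Sample listing; the first row is invented, the second mirrors teams.csv.
listing = [
    ("index.html", 2048, datetime(2009, 10, 29, 9, 30)),
    ("teams.csv", 164, datetime(2009, 10, 30, 17, 5)),
]
rdf_root = file_entries(listing)
teams = rdf_root[1]
print(ET.tostring(rdf_root, encoding="unicode"))
```
</pre>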
<p id="id103426">Loaded into the free edition of TopBraid Composer, the editing of that "record" looks like this (I've rearranged the combination of screen sections a bit from the default TopBraid "perspective", to use the Eclipse parlance):</p>
<a href="http://www.snee.com/bobdc.blog/img/rdfdataentry1.jpg"><img id="id103434" src="http://www.snee.com/bobdc.blog/img/rdfdataentry1.jpg" alt="TopBraid Composer screen shot" width="500"/></a>
<p id="id103444">I can edit the values on this form, although there's no reason to edit the file name, size, or last modified values. What I'm really going to do is add notes to the rdfs:comment property, as I've already done above, and perhaps add more comment properties for this resource. The really nice part is that I can define new properties in the Properties view on the right&#8212;for example, some project-specific subproperties of rdfs:comment&#8212;drag them onto the form for any of my File resources, and then add values to them, giving me the functional equivalent of adding new columns to a spreadsheet.</p>
<p id="id103460">It's actually better than that, because if I wanted to add three contactWithQuestions names to one of these File resources on a spreadsheet grid, I'd have to either add three columns or string together three values in one spreadsheet cell as if they were one. With RDF, though, I  can define a contactWithQuestions property and then add three separate values for this property to the same resource. Moving beyond the use of simple string data for the values here, I could create object properties (properties where the value is another resource&#8212;in this case, to define relationships between File objects such as mentionedIn or basedOn) by defining them in the Properties view on the right with a range of File. When I want to assign one of these properties to a particular File object, I would drag it from the property list on the right onto the Resource form for that File and then pick out the appropriate file it refers to from a drop-down list. For example, after creating a mentionedIn property, if teams.csv was mentioned in index.html and I wanted to record this in my notes on teams.csv, I'd drag the mentionedIn property onto the Resource Form for teams.csv and select index.html as the value for that property.</p>
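<p>The difference from a spreadsheet grid is easy to see with a toy model of the data. The following Python sketch is only an illustration of the idea, not of how TopBraid Composer stores anything: triples are plain tuples in a set, the helper names <code>add_value</code> and <code>values</code> are made up, and the three contact names are invented.</p>
<pre>
```python
# A toy triple store: each fact is a (subject, property, value) tuple.
triples = set()


def add_value(subject, prop, value):
    """Record one more value for a property of a resource."""
    triples.add((subject, prop, value))


def values(subject, prop):
    """Return every value a resource has for a property, sorted."""
    return sorted(v for s, p, v in triples if s == subject and p == prop)


# Three separate values for the same property on the same resource;
# no extra columns and no values crammed into one cell.
add_value("teams.csv", "contactWithQuestions", "Ann")
add_value("teams.csv", "contactWithQuestions", "Raj")
add_value("teams.csv", "contactWithQuestions", "Mei")

# An object property: the value is another File resource.
add_value("teams.csv", "mentionedIn", "index.html")

contacts = values("teams.csv", "contactWithQuestions")
mentions = values("teams.csv", "mentionedIn")
print(contacts, mentions)
```
</pre>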
<p id="id103464">Because this is a GUI editing interface, I can also add new File resources and delete existing ones (the equivalent of inserting and deleting rows on a spreadsheet) by clicking icons on the Instances view at the bottom. (Another nice bonus with TopBraid Composer is the SPARQL tab next to that, where you can enter and run SPARQL queries about the data.)</p>
<p id="id103473">So, I've got my form-driven interface that I can use with any RDF data. I've kept my address book in RDF for a long time; maybe I should try maintaining it like this instead of with Emacs.</p>

    </div>
]]>
    </content>
</entry>

<entry>
    <title>Up and running with Mercurial</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/10/up-and-running-with-mercurial.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.550</id>

    <published>2009-10-26T13:50:12Z</published>
    <updated>2009-10-27T13:51:23Z</updated>

    <summary>Quick and easy.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="miscellaneous" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="mercurial" label="mercurial" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Quick and easy.
        <![CDATA[    <div id="id103275">

<a id="id103278" href="http://mercurial.selenic.com/wiki/"><img id="id103283" src="http://www.selenic.com/hg-logo/logo-droplets-200.png" border="0" align="right" hspace="30px" vspace="30px" width="100px" alt="mercurial logo"/></a>

<p id="id103303">I've used the cvs and svn version control systems for both work-related and personal projects. For personal work, I used svn in particular more as a backup program, with the added benefit of the version control. Keeping my repository on a thumb drive made it easy to perform the backups when traveling, but perhaps because of sloppiness in removing the thumb drive without clicking the right icons first, my repository got corrupted too often, so I gave up. </p>

<p id="id103315">I decided to try again with <a id="id103318" href="http://mercurial.selenic.com/wiki/">Mercurial</a> and was shocked at how quickly I was able to learn it and get it to do everything I wanted&#8212;about an hour. <a id="id103328" href="http://importantshock.wordpress.com/2008/08/07/git-vs-mercurial/">This blog posting</a> convinced me to try it before <a id="id103335" href="http://git-scm.com/">git</a>, which sounds fascinating but a bit more complicated. By keeping my repository on the local drive and using the clone feature to keep backup copies of the repository elsewhere, I can redo a backup if a thumb drive version gets messed up. </p>

<p id="id103346">The <a id="id103349" href="http://mercurial.selenic.com/wiki/QuickStart">Mercurial Quick Start</a> lives up to its name, and I kept some notes as I went along to provide my own Mercurial quick reference:</p>
<table id="id103361" border="1" style="border: 1px solid; border-collapse: collapse;" cellpadding="6px">
<tr id="id103371"><td id="id103372"><tt id="id103374">hg init</tt></td><td id="id103376">Turn current directory into a project.</td></tr>
<tr id="id103381"><td id="id103382"><tt id="id103383">hg add</tt><tt id="id103386"/></td><td id="id103387">Add files in current directory to repository.</td></tr>
<tr id="id103391"><td id="id103393"><tt id="id103394">hg ci -m "comment about this commit"</tt></td><td id="id103397">Commit recent changes to repository.</td></tr>
<tr id="id103401"><td id="id103403"><tt id="id103404">hg clone . e:\otherCopy</tt></td><td id="id103407">Create a clone of the current directory's repository somewhere else.</td></tr>
<tr id="id103412"><td id="id103413"><tt id="id103414">hg push e:\otherCopy </tt></td><td id="id103417">Send recent changes in this directory's repository to a clone repository (that is, back up the changes here to there).</td></tr>
<tr id="id103423"><td id="id103424"><tt id="id103425">hg update</tt></td><td id="id103428">(entered from within e:\otherCopy directory) Make clone directory's contents reflect recent changes to clone repository.</td></tr>
<tr id="id103434"><td id="id103435"><tt id="id103436">hg log test1.txt</tt></td><td id="id103439">List comments (see -m above) for each of test1.txt file's changes.</td></tr>
<tr id="id103444"><td id="id103445"><tt id="id103446">hg revert -r 1 test1.txt </tt></td><td id="id103449">Revert file test1.txt to revision 1. (You can then "revert" it to later versions.)</td></tr>
<tr id="id103455"><td id="id103456"><tt id="id103457">hg cat -r 2 test1.txt </tt></td><td id="id103460">Look at version 2 of test1.txt.</td></tr>
<tr id="id103464"><td id="id103465"><tt id="id103467">hg locate *foo*</tt></td><td id="id103469">List files in repository with "foo" in their names.</td></tr>
</table>
<p id="id103475"><del>One other note: an .hgignore file tells hg which files to ignore, and putting separate .hgignore files in subdirectories of your main project directory works fine.</del></p>

<p id="id103479">I once had <a id="id103482" href="http://www.snee.com/bobdc.blog/2006/11/dam-subversion-rdf-owl.html">grand ideas</a> about hooking up a version control system that can assign arbitrary metadata with an RDF triplestore to form the basis of some sort of CMS demo. Mercurial <a id="id103492" href="http://markmail.org/thread/h66comox2nf4koay#query:mercurial%20metadata+page:1+mid:h66comox2nf4koay+state:results">isn't much help here</a>, but when I prioritize the tasks "back up my stuff" and "build a demo CMS around a version control system" the former is clearly much more important. Maybe someday...</p>


    </div>
]]>
    </content>
</entry>

<entry>
    <title>Blogging on TopQuadrant&apos;s Blog</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/10/blogging-on-topquadrants-blog.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.546</id>

    <published>2009-10-14T23:59:02Z</published>
    <updated>2009-10-15T00:01:25Z</updated>

    <summary>In addition to here.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="SPARQL" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="blogging about blogging" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="semantic web" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="topquadrantspin" label="topquadrant spin" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        In addition to here.
        <![CDATA[    <div id="id103277">

<p id="id103279">I just added my first entry to TopQuadrant's blog, <a id="id103283" href="http://topquadrantblog.blogspot.com/">Voyages of the Semantic Enterprise</a>. It's called <a id="id103291" href="http://topquadrantblog.blogspot.com/2009/10/spin-tutorial-available.html">SPIN Tutorial Available</a>, and describes the tutorial I recently finished writing on using the SPARQL Inferencing Notation with TopBraid Composer. </p>
<p id="id103302">I'll be adding more to that blog in the future and certainly continuing with this one here, keeping the entries that focus on TopQuadrant technology over there. I'll put the others&#8212;including more general interest entries on the semantic web, SPARQL, and RDF&#8212;right here. </p>
<p id="id103312">Thanks for reading either!</p>


    </div>
]]>
    </content>
</entry>

<entry>
    <title>A rules language for RDF</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/10/a-rules-language-for-rdf.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.544</id>

    <published>2009-10-01T16:45:05Z</published>
    <updated>2009-10-01T16:48:31Z</updated>

    <summary>Right under our noses.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="SPARQL" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="sparqlowlspinrdf" label="sparql owl spin rdf" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Right under our noses.
        <![CDATA[
    <div id="id103276">

<p id="id103279">Last May, in <a id="id103282" href="http://www.snee.com/bobdc.blog/2008/05/adding-semantics-to-make-data.html">Adding semantics to make data more valuable: the secret revealed</a>, I showed how storing a little bit of semantics about the word "spouse"&#8212;the fact that it's a symmetric property (that is, that if A is the spouse of B, then B is the spouse of A)&#8212;let me look up someone's home phone number in my address book even if my entry for him there lacks his home phone number. I like this story because unlike biotech and some of the other popular domains for Semantic Web technology, everyone has an address book and understands the basic properties of an entry: first name, last name, email address, and so forth.  (Because so many people have lived through the annoyances of moving their contact information from one email client or phone to another, address books also provide nice use cases for data integration issues.)</p>

<p id="id103327">Back then, I wrote:</p>
<blockquote id="id103331">
With software that understands an OWL expression stating that <code id="id103335">spouse</code> is a symmetric property and a rule I define to say that spouses have the same home phone number, I can retrieve Leroy's home phone number...
</blockquote>
<p id="id103340">OWL is great for defining the symmetry, but I glossed over the part about defining the fact that spouses have the same phone number. How do you define such a rule? n3 has a <a id="id103345" href="http://www.w3.org/2000/10/swap/doc/Rules">rules language</a>, but I haven't seen it used much as the n3 subset known as <a id="id103354" href="http://www.w3.org/TeamSubmission/turtle/">Turtle</a> (which leaves out such things) becomes more popular. Instead of defining a Semantic Web rules language, the W3C has decided to have the <a id="id103363" href="http://www.w3.org/2005/rules/wiki/RIF_Working_Group">Rule Interchange Format Working Group</a> standardize an interchange format between the <a id="id103372" href="http://www.w3.org/2005/rules/wg/wiki/List_of_Rule_Systems">many rules languages</a> out there. (The <a id="id103380" href="http://ontolog.cim3.net/file/resource/presentation/ChrisWelty_20080612/W3C-Rules-Interchange-Format--ChrisWelty_20080612.ppt">W3C Rules Interchange Format Basic Logic Dialect</a> PowerPoint presentation by WG co-chair Chris Welty provides good historical background.) </p>

<blockquote id="id103392" class="pullquote" style="width: 190px; font: bold 1.333em/1.125em &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 1.5em 0 1.5em 1.5em !important; padding: 0.6em 5px !important; background: none !important; border: 3px double #ddd; border-width: 3px 0; text-align: center; float: right; "><strong id="id103402">I can write a query that generates the triples I want to infer and call this query a "rule", but what do I do with it? </strong></blockquote>

<p id="id103408">I've used a proprietary RDF rules language before, and was wondering if a standard one would come along. Some colleagues at TopQuadrant have shown me that we all have a straightforward, standardized RDF rules language right under our noses: SPARQL. I've been appreciating SPARQL's CONSTRUCT form <a id="id103416" href="http://www.snee.com/bobdc.blog/2009/09/appreciating-sparql-construct.html">more lately</a>, and CONSTRUCT is the key here: like a SELECT statement, a CONSTRUCT statement defines conditions about which pieces of which triples to retrieve, but unlike SELECT, a CONSTRUCT statement assembles these into new triples. If we view a CONSTRUCT statement as the definition of a rule and the resulting new triples as the result of the execution of the rule, then we have a rules language and plenty of implementations of it available. </p>
<p id="id103432">For example, the following SPARQL "rule" says that if <code id="id103436">?person1</code> has the spouse <code id="id103440">?person2</code> and the home telephone number <code id="id103445">?phoneNum</code>, then <code id="id103449">?person2</code> also has the home telephone number <code id="id103454">?phoneNum</code>:</p>

<pre id="id103459">
PREFIX  : &lt;http://www.snee.com/ns/demo#&gt;
PREFIX v: &lt;http://www.w3.org/2006/vcard/ns#&gt;

CONSTRUCT { ?person2 v:homeTel ?phoneNum . }
WHERE {
  ?person1 :spouse   ?person2 ;
           v:homeTel ?phoneNum .
}
</pre>
<p id="id103473">When run with the following data (for the purposes of this demo, assume that the {:leroy :spouse :loretta} triple was generated by an OWL reasoner that saw {:loretta :spouse :leroy} and knew that :spouse was symmetrical),</p>
<pre id="id103480">
@prefix  : &lt;http://www.snee.com/ns/demo#&gt; .
@prefix v: &lt;http://www.w3.org/2006/vcard/ns#&gt; .
:loretta :spouse   :leroy ;
         v:homeTel "434-923-9321" .
:leroy   v:workTel "434-932-5329" ;
         :spouse   :loretta .
</pre>
<p id="id103497">it generates the triple {:leroy v:homeTel "434-923-9321"}. </p>
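<p>To see mechanically what the rule does, here is the same logic as a plain Python sketch with no SPARQL engine involved. It is an illustration only: triples are tuples of prefixed-name strings, and the function name <code>spouse_home_phone_rule</code> is made up. The outer loop plays the part of the <code>:spouse</code> pattern, the inner loop the <code>v:homeTel</code> pattern on the same subject, and each match adds one constructed triple.</p>
<pre>
```python
def spouse_home_phone_rule(triples):
    """Mimic the CONSTRUCT rule: a spouse shares the person's home phone."""
    inferred = set()
    for person1, prop, person2 in triples:
        if prop != ":spouse":
            continue
        # Second pattern: the same ?person1 must have a v:homeTel value.
        for subject, prop2, phone in triples:
            if subject == person1 and prop2 == "v:homeTel":
                inferred.add((person2, "v:homeTel", phone))
    return inferred


# The demo data from above, one tuple per triple.
data = {
    (":loretta", ":spouse", ":leroy"),
    (":loretta", "v:homeTel", "434-923-9321"),
    (":leroy", "v:workTel", "434-932-5329"),
    (":leroy", ":spouse", ":loretta"),
}

new_triples = spouse_home_phone_rule(data)
print(new_triples)
```
</pre>
<p>Leroy's <code>:spouse</code> triple produces nothing, because Leroy has no home number of his own to pass along; only Loretta's does.</p>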

<p id="id103502">OK, so I can write a query that generates the triples I want to infer and call this query a "rule", but what do I do with it? What makes it a rule about a particular set of data?</p>

<p id="id103508">Holger Knublauch, a co-worker of mine who designed and developed the OWL plugin for <a id="id103512" href="http://protege.stanford.edu/">Protégé</a> before coming to TopQuadrant, recently wrote an RDF vocabulary called SPIN ("SPARQL Inferencing Notation"), which&#8212;among other things&#8212;can express associations between these rules and classes. So, for example, if the blank node _:b1 pointed to the query above, the following triple would associate this query rule with the v:Address class: </p>

<pre id="id103535">
  v:Address spin:rule _:b1
</pre>
<p id="id103540">To make the storage of the SPARQL rule in a triplestore even cleaner, Holger has implemented a way to <a id="id103542" href="http://spinrdf.org/sp.html">store SPARQL queries as triples</a>, and he's written the code to roundtrip between this and the standard text version. (See the <a id="id103551" href="http://sparqlpedia.org/spinrdfconverter.html">SPARQL Text to SPIN RDF Syntax Converter</a> for an online converter, and see <a id="id103559" href="http://www.spinrdf.org/">spinrdf.org</a> for more about what else the SPIN vocabulary can do, especially his blog entries as he developed it. I'm now finishing up a tutorial for the use of SPIN features in TopQuadrant products, and except for one optional step of the tutorial, it all works with the free version of TopBraid Composer.)</p>

<p id="id103572">When you take it a little further, symmetrical properties and many other parts of OWL can also be implemented with SPARQL queries, and there's a lot going on among those who are doing this to find a sweet spot between RDFS and OWL Full that meets typical business needs without using a lot of processing power or dollars.</p>
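<p>As a small illustration of that last point, symmetry itself can be written as a rule of the same kind; in SPARQL it would be something along the lines of <code>CONSTRUCT { ?b :spouse ?a } WHERE { ?a :spouse ?b }</code>. The Python sketch below mirrors that idea on tuples. It is a toy, not how an OWL reasoner or SPIN works internally, and the function name <code>symmetric_rule</code> is made up.</p>
<pre>
```python
def symmetric_rule(triples, prop):
    """Return the reversed triples that symmetry of prop adds to the data."""
    reversed_triples = {(o, p, s) for s, p, o in triples if p == prop}
    # Report only what is new; triples already present need no inferring.
    return reversed_triples - set(triples)


# Only one direction of the relationship is stated.
stated = {
    (":loretta", ":spouse", ":leroy"),
    (":loretta", "v:homeTel", "434-923-9321"),
}

inferred_spouses = symmetric_rule(stated, ":spouse")
print(inferred_spouses)
```
</pre>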


    </div>
]]>
    </content>
</entry>

<entry>
    <title>Converting wpl playlists to m3u playlists</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/09/converting-wpl-playlists-to-m3.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.543</id>

    <published>2009-09-19T02:25:15Z</published>
    <updated>2009-09-19T02:26:54Z</updated>

    <summary>Simple XML in, simple text out, but no good search results for wpl2m3u? Write a little XSLT.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="XSLT" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="music" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="playlistswplm3uxslt" label="playlists wpl m3u xslt" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Simple XML in, simple text out, but no good search results for wpl2m3u? Write a little XSLT.
        <![CDATA[    <div id="id103277">

<blockquote id="id103280" class="pullquote" style="width: 190px; font: bold 1.333em/1.125em &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 1.5em 0 1.5em 1.5em !important; padding: 0.6em 5px !important; background: none !important; border: 3px double #ddd; border-width: 3px 0; text-align: center; float: right; "><strong id="id103291">After taking a closer look at the WPL format I realized that an XSLT stylesheet to convert it to M3U would be very simple.</strong></blockquote>

      <p id="id103297">I've switched around between music-playing programs over the last few years. I suppose I should call them "media players", but I only use them to play music, which is part of the reason I ended up using <a id="id103303" href="http://getsongbird.com/">Songbird</a>, an open source Windows/Linux/Mac music front end that doesn't pretend to be anything else. It looks a bit like iTunes, without all the ads in your face; how great is that? </p>
      <p id="id103314">Before that I used <a id="id103316" href="http://www.mediamonkey.com/">MediaMonkey</a>, and before that, the Windows Media Player. Guess which of these uses the most standardized, XML-based format for playlists? Surprise: the Microsoft one. </p>
      <p id="id103327">Windows Media Player can create WPL files, which seem to conform to the W3C <a id="id103331" href="http://www.w3.org/TR/REC-smil/">SMIL</a> standard, and it can export M3U files, which MediaMonkey uses. Converting WPL files to M3U for Songbird by reading them individually into Windows Media Player and exporting them one at a time was annoying. I did some web searches for wpl2m3u and only found one script that I couldn't quite follow, and after taking a closer look at the WPL format I realized that an XSLT stylesheet to convert it to M3U would be very simple. So here it is:</p>
<pre id="id103346">
&lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;xsl:strip-space elements="*"/&gt;
  &lt;xsl:output method="text"/&gt;

  &lt;xsl:template name="textAfterLastSlash"&gt;&lt;!-- but actually backslash --&gt;
    &lt;xsl:param name="string"&gt;dummy string&lt;/xsl:param&gt;
    &lt;xsl:choose&gt;
      &lt;xsl:when test="not(contains($string,'\'))"&gt;
        &lt;xsl:value-of select="$string"/&gt;
      &lt;/xsl:when&gt;
      &lt;xsl:otherwise&gt;
        &lt;xsl:call-template name="textAfterLastSlash"&gt;
          &lt;xsl:with-param name="string" select="substring-after($string,'\')"/&gt;
        &lt;/xsl:call-template&gt;
      &lt;/xsl:otherwise&gt;
    &lt;/xsl:choose&gt;
  &lt;/xsl:template&gt;

  &lt;xsl:template match="smil"&gt;
    &lt;xsl:text&gt;#EXTM3U&amp;#10;&lt;/xsl:text&gt;
    &lt;xsl:apply-templates/&gt;
  &lt;/xsl:template&gt;

  &lt;xsl:template match="media"&gt;
    &lt;xsl:text&gt;#EXTINF:0,&lt;/xsl:text&gt;
    &lt;xsl:call-template name="textAfterLastSlash"&gt;
      &lt;xsl:with-param name="string" select="@src"/&gt;
    &lt;/xsl:call-template&gt;
    &lt;xsl:text&gt;&amp;#10;&lt;/xsl:text&gt;
    &lt;xsl:value-of select="@src"/&gt;
    &lt;xsl:text&gt;&amp;#10;&amp;#10;&lt;/xsl:text&gt;
  &lt;/xsl:template&gt;

  &lt;xsl:template match="title"/&gt;

&lt;/xsl:stylesheet&gt;
</pre>

<p id="id103383">It's not very long, but if you want fancy XSLT, it does have a recursive named template, which I wrote for something else and modified here to return the text after the last backslash. The &amp;#10; is a trick I've been using lately to get XSLT to output a line feed, because if I put an actual line break inside of an xsl:text element like I always did before, telling Emacs to re-indent the whole thing tends to screw that up. </p>
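
<p>The same logic can be sketched in Python for anyone without an XSLT processor handy. Treat this as a rough equivalent under a few assumptions, not a tested replacement: the function names are made up for the example, and the small smil/body/seq/media tree built at the end is a hand-made stand-in for a real WPL file, assuming each track is a media element with a src attribute. Python's rsplit does the job of the recursive template.</p>
<pre>
import xml.etree.ElementTree as ET


def text_after_last_backslash(path):
    """Return whatever follows the final backslash, or the whole string if there is none."""
    return path.rsplit("\\", 1)[-1]


def wpl_to_m3u(root):
    """Build M3U text from a parsed WPL tree, mirroring the stylesheet's output."""
    lines = ["#EXTM3U"]
    for media in root.iter("media"):
        src = media.get("src", "")
        lines.append("#EXTINF:0," + text_after_last_backslash(src))
        lines.append(src)
        lines.append("")  # blank line after each entry, as in the stylesheet
    return "\n".join(lines) + "\n"


# Hypothetical sample playlist, built in code instead of read from a file.
smil = ET.Element("smil")
seq = ET.SubElement(ET.SubElement(smil, "body"), "seq")
ET.SubElement(seq, "media", src="..\\Lata\\song1.mp3")
ET.SubElement(seq, "media", src="song2.mp3")

print(wpl_to_m3u(smil), end="")
</pre>
<p>For a real playlist, the root element would come from ET.parse on the WPL file followed by getroot(), in place of the hand-built tree.</p>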

<p id="id103408">With a long plane ride tomorrow night to go to Oxford for the <a id="id103412" href="http://www.xmlsummerschool.com/">XML Summer School</a>, I want to load up the MP3 player with something conducive to sleeping, so I just converted my playlist of <a id="id103421" href="http://en.wikipedia.org/wiki/Lata_Mangeshkar">Lata Mangeshkar</a> ballads so that I can put that on. (If you like classic Bollywood soundtracks, check out <a id="id103429" href="http://thirdfloormusic.blogspot.com/">Music from the Third Floor</a>; if you're new to it and interested, start with the <a id="id103438" href="http://thirdfloormusic.blogspot.com/search/label/Compilations">compilations</a> there.)</p>

    </div>
]]>
    </content>
</entry>

<entry>
    <title>Appreciating SPARQL CONSTRUCT more</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/09/appreciating-sparql-construct.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.540</id>

    <published>2009-09-09T23:33:39Z</published>
    <updated>2009-09-09T23:36:18Z</updated>

    <summary>Another way to get more out of your data.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="SPARQL" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="sparqldigg" label="sparql digg" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Another way to get more out of your data.
        <![CDATA[    <div id="id103288">

<p id="id103290">As with SQL, SPARQL's most popular verb is SELECT. It lets you request the data you want from a collection, whether you're asking for a single phone number or you want a list of first and last names and phone numbers of all employees hired after January 1st, sorted by last name.</p>

<blockquote id="id103299" class="pullquote" style="width: 190px; font: bold 1.333em/1.125em &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; margin: 1.5em 0 1.5em 1.5em !important; padding: 0.6em 5px !important; background: none !important; border: 3px double #ddd; border-width: 3px 0; text-align: center; float: right; "><strong id="id103310">CONSTRUCT provides a nice example of how SPARQL is more than a query language; along with extracting data using queries, you can create useful new data as well.</strong></blockquote>

<p id="id103314">In SPARQL, SELECT is actually known as a <a id="id103317" href="http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/#QueryForms">query form</a>, and another is CONSTRUCT. According to the <a id="id103324" href="http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/">SPARQL Query Language for RDF</a> W3C Recommendation, CONSTRUCT returns a graph, that is, a set of triples. I had thought of CONSTRUCT as a way of pulling a set of triples out of a triplestore, especially a remote triplestore, but while reviewing some TopQuadrant training material I realized how handy CONSTRUCT can be for creating useful new triples.</p>
<p id="id103345">For example, let's say you have the following triples written in Turtle syntax to identify the gender and parent/child relationships of a few people:</p>

<pre id="id103352">
@prefix : &lt;http://www.snee.com/ns/demo#&gt; .

:jane :hasParent :gene .
:gene :hasParent :pat ;
      :gender    :female .
:joan :hasParent :pat ;
      :gender    :female . 
:pat  :gender    :male .
:mike :hasParent :joan .
</pre>

<p id="id103366">The following CONSTRUCT statement creates new triples based on the ones above to specify who is whose grandfather:</p>
<pre id="id103372">
PREFIX : &lt;http://www.snee.com/ns/demo#&gt; 

CONSTRUCT { ?p :hasGrandfather ?g . }

WHERE {?p      :hasParent ?parent .
       ?parent :hasParent ?g .
       ?g      :gender    :male .
}
</pre>
<p id="id103384">When I ran this query with the data above, <a id="id103388" href="http://jena.sourceforge.net/ARQ/">ARQ</a> returned the newly constructed triples in Turtle format:</p>

<pre id="id103397">
@prefix :        &lt;http://www.snee.com/ns/demo#&gt; .

:jane
      :hasGrandfather  :pat .

:mike
      :hasGrandfather  :pat .
</pre>
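<p>To see what the engine is doing with those three triple patterns, here is a minimal sketch in plain Python, with no RDF library involved. It is an illustration only: the tuples stand in for the Turtle data above with the prefix dropped, construct_grandfathers is a made-up name, and a real SPARQL engine such as ARQ does this matching far more generally.</p>
<pre>
# The sample data as (subject, predicate, object) tuples, prefix dropped.
triples = {
    ("jane", "hasParent", "gene"),
    ("gene", "hasParent", "pat"),
    ("gene", "gender", "female"),
    ("joan", "hasParent", "pat"),
    ("joan", "gender", "female"),
    ("pat", "gender", "male"),
    ("mike", "hasParent", "joan"),
}


def construct_grandfathers(data):
    """Emit a hasGrandfather triple for every match of the three WHERE patterns."""
    new_triples = set()
    for p, pred1, parent in data:
        if pred1 != "hasParent":
            continue  # first pattern: ?p :hasParent ?parent
        for parent2, pred2, g in data:
            # second pattern: ?parent :hasParent ?g
            if pred2 == "hasParent" and parent2 == parent:
                # third pattern: ?g :gender :male
                if (g, "gender", "male") in data:
                    new_triples.add((p, "hasGrandfather", g))
    return new_triples


for triple in sorted(construct_grandfathers(triples)):
    print(triple)
</pre>
<p>Each nested loop corresponds to one triple pattern, and a variable that appears in two patterns (here ?parent and ?g) becomes an equality check between them.</p>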
<p id="id103403">From the same little data file, we can generate triples about who is whose aunt:</p>
<pre id="id103408">
PREFIX : &lt;http://www.snee.com/ns/demo#&gt; 

CONSTRUCT { ?p :hasAunt ?aunt . }

WHERE {?p      :hasParent ?parent .
       ?parent :hasParent ?g .
       ?aunt   :hasParent ?g ;
               :gender    :female .

FILTER (?parent != ?aunt)  
}
</pre>

<p id="id103423">With this query, ARQ constructs these triples:</p>
<pre id="id103427">
@prefix :        &lt;http://www.snee.com/ns/demo#&gt; .

:jane
      :hasAunt      :joan .

:mike
      :hasAunt      :gene .
</pre>
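<p>The FILTER line is what keeps a mother from being listed as her own child's aunt. A small plain-Python sketch of the same matching shows the difference when the filter is switched off. As before, this is only an illustration: the tuples stand in for the Turtle data, and construct_aunts and its apply_filter flag are made-up names.</p>
<pre>
# The sample data as (subject, predicate, object) tuples, prefix dropped.
triples = {
    ("jane", "hasParent", "gene"),
    ("gene", "hasParent", "pat"),
    ("gene", "gender", "female"),
    ("joan", "hasParent", "pat"),
    ("joan", "gender", "female"),
    ("pat", "gender", "male"),
    ("mike", "hasParent", "joan"),
}


def construct_aunts(data, apply_filter=True):
    """Emit a hasAunt triple for every match of the WHERE patterns."""
    new_triples = set()
    for p, pred1, parent in data:
        if pred1 != "hasParent":
            continue  # ?p :hasParent ?parent
        for parent2, pred2, g in data:
            if pred2 != "hasParent" or parent2 != parent:
                continue  # ?parent :hasParent ?g
            for aunt, pred3, g2 in data:
                if pred3 != "hasParent" or g2 != g:
                    continue  # ?aunt :hasParent ?g
                if (aunt, "gender", "female") not in data:
                    continue  # ?aunt :gender :female
                if apply_filter and aunt == parent:
                    continue  # FILTER (?parent != ?aunt)
                new_triples.add((p, "hasAunt", aunt))
    return new_triples


print(sorted(construct_aunts(triples)))
print(sorted(construct_aunts(triples, apply_filter=False)))
</pre>
<p>With the filter on, this yields the same two triples that ARQ constructed. With it off, jane also gets gene as an aunt and mike also gets joan.</p>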
<p id="id103433">This isn't really creating new information, but the ability to make implicit information explicit can certainly add value to a system, especially when the rules necessary to assemble the pieces are more complicated than the ones shown above for identifying grandfathers and aunts. </p>
<p id="id103442">How you use your newly constructed triples depends on how your SPARQL engine gives them to you. As we saw above, ARQ writes them out in Turtle syntax. TopQuadrant's TopBraid Composer displays them in the window used for SPARQL query output, and after you select one or more of them, the "Assert selected constructed triples" menu choice adds them to the graph of triples that you're currently working with. (This works in the <a id="id103459" href="http://www.topquadrant.com/products/TB_Composer.html#free">free edition</a> as well.)</p>

<p id="id103468">CONSTRUCT provides a nice example of how SPARQL is more than a query language; along with extracting data using queries, you can create useful new data as well.</p>



    </div>
]]>
    </content>
</entry>

<entry>
    <title>Growth of the linked data cloud</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/09/growth-of-the-linked-data-clou.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.539</id>

    <published>2009-09-03T13:19:24Z</published>
    <updated>2009-09-03T13:23:32Z</updated>

    <summary>Or at least, the growth of Richard Cyganiak&apos;s famous diagram.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="semantic web" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="linkeddata" label="linkeddata" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        Or at least, the growth of Richard Cyganiak&apos;s famous diagram.
        <![CDATA[    <div id="id103274">

<p id="id103276">While preparing slides for the Semantic Web Overview talk I'll be giving at the beginning of the <a id="id103279" href="http://xmlsummerschool.com/curriculum2009/semantic-technologies/">Semantic Technologies course</a> of the <a id="id103286" href="http://xmlsummerschool.com/">Oxford XML Summer School</a>, I was adding a few slides on Linked Data. (Leigh Dodds is presenting a more detailed class on Linked Data later in the day.) Of course I had to include a slide of Richard Cyganiak's interactive diagram of the Linked Data cloud, and as with many of my slides, I was tempted to re-use a slide from a presentation I'd given before. I found the following image in a talk I gave in February of last year: </p>

<img id="id103301" src="http://www.snee.com/bobdc.blog/img/ldcFeb08.jpg" alt="[linked data cloud, February 2008]" width="440px"/>

<p id="id103312">I decided to be conscientious and update the image, so I went to <a id="id103315" href="http://richard.cyganiak.de/2007/10/lod/">Richard's page</a> to get an updated version, and found this:</p>

<img id="id103325" src="http://www.snee.com/bobdc.blog/img/ldcJul09.jpg" alt="[linked data cloud, July 2009]" width="440px"/>

<p id="id103335">It looks like the world of linked data is growing at quite a rate! And, if you look closely, you'll see that his latest image says "As of July 2009", so I imagine that there are even more nodes to add to this image by now. </p>
</div>
]]>
    </content>
</entry>

<entry>
    <title>Getting started with the TopQuadrant product line</title>
    <link rel="alternate" type="text/html" href="http://www.snee.com/bobdc.blog/2009/08/getting-started-with-the-topqu.html" />
    <id>tag:www.snee.com,2009:/bobdc.blog//2.537</id>

    <published>2009-08-28T00:38:34Z</published>
    <updated>2009-08-28T00:41:20Z</updated>

    <summary>A lot of great technology to learn about.</summary>
    <author>
        <name>Bob DuCharme</name>
        <uri>http://www.snee.com/bobdc.blog</uri>
    </author>
    
        <category term="SPARQL" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="semantic web" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="topquadrantrdf" label="topquadrant rdf" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.snee.com/bobdc.blog/">
        A lot of great technology to learn about.
        <![CDATA[    <div id="id103271">

<p id="id103274">Last week was my first week working at <a id="id103277" href="http://www.topquadrant.com/">TopQuadrant</a>, and I spent three days in a class given by one of my new co-workers, Scott Henninger. I only had a skeletal idea of what the components of <a id="id103284" href="http://www.topquadrant.com/products/TB_Suite.html">TopBraid Suite</a> did before, and now that I have a better idea, I'm very impressed. (I may be wrong on one or two details below, but I'm still the new guy.)</p>

<p id="id103295">I had the impression that the core original product, TopBraid Composer, was mostly for designing and editing RDFS/OWL schemas and ontologies. It's very good at doing those, but it also makes a very good interface for dealing directly with the data described by these models. Being built on Eclipse, the various panes (or, in Eclipse parlance, "views") of the main window let you see an ontology or a file of data from several angles at once and refine the model by pointing, clicking, dragging, and editing dialog boxes.</p>

<p id="id103308">TopBraid Composer also includes a SPARQL engine and uses standard SPARQL as the starting point for several new technologies that let you build applications around triplestores. A great new one is SPIN, for "SPARQL Inferencing Notation". As described on <a id="id103314" href="http://www.spinrdf.org/">spinrdf.org</a>, </p>
<blockquote id="id103322">
SPIN is a collection of RDF vocabularies enabling the use of SPARQL to define constraints and inference rules on Semantic Web models. SPIN also provides meta-modeling capabilities that allow users to define their own SPARQL functions and query templates. Finally, SPIN includes a ready to use library of common functions.
</blockquote>
<p id="id103337">As TopQuadrant VP of Product Development Holger Knublauch wrote in a comment in a recent <a id="id103341" href="http://stage.vambenepe.com/archives/496">William Vambenepe blog entry</a>,</p>
<blockquote id="id103350">
Another aspect of RDF that SPIN rides on is the vision of a distributed self-describing data structure. In the Semantic Web, both classes and instances live in the same space and can be queried using the same mechanisms. SPIN takes this idea to extremes: you can not only define classes and properties, but even define executable semantics of those and use this mechanism to build your own modeling languages.
</blockquote>
<p id="id103367">Holger's own blog entry <a id="id103370" href="http://composing-the-semantic-web.blogspot.com/2009/01/object-oriented-semantic-web-with-spin.html">The Object-Oriented Semantic Web with SPIN</a> is a good introduction to what SPIN (and TopQuadrant's implementation of those executable semantics, TopSPIN) are all about. With support for SPIN built into the <a id="id103377" href="http://www.topquadrant.com/products/TB_Composer.html#free">free edition</a> of TopBraid Composer, a lot of people can now try this out, and I look forward to helping the company beef up the documentation for it.</p>

<a id="id103388" href="http://www.topquadrant.com/products/SPARQLMotion.html"><img id="id103394" src="http://www.topquadrant.com/images/sparql_examples/SPARQLMotion-Example.jpg" border="0" align="right" hspace="10px" vspace="10px" alt="[SPARQLMotion screen shot]" width="320px"/></a>

<p id="id103415"><a id="id103416" href="http://www.topquadrant.com/products/SPARQLMotion.html">SPARQLMotion</a> is another impressive RDF application development productivity tool. SPARQLMotion lets you build applications by dragging icons into a screen where you connect them into pipelines that can branch in different directions depending on various conditions. You configure each icon by filling out a dialog box to point to data sources, data destinations, and processing modules. Input modules represented by different icons can pull data from news feeds, spreadsheets, email, XML, all the obvious RDF sources, and more. Processing can apply rules via Pellet, Jena rules, TopSPIN, Calais, XSLT... that's about a quarter of the list. You can then output the results of your processing to most of the input formats and additional ones such as calendars, maps, and HTTP POST requests. (PDF support is on the way, so that you could have the XSLT processing module convert XML versions of data pulled by some SPARQL queries into XSL-FO of a nicely rendered page and then output a PDF file from there.) Holger has done a nice five-minute video called "Creating a SPARQLMotion Script" on <a id="id103459" href="http://www.topquadrant.com/resources/videos.html">TopQuadrant's video page</a>.</p>

<p id="id103469">After TopBraid Composer, the other two components of TopBraid Suite are Ensemble and Live. TopBraid Ensemble lets you create applications by selecting user interface components and essentially writing event handlers for them. Components for displaying data  include trees, grids, and forms, so a dashboard app would be pretty straightforward to build. Because it's built on Adobe Flex, you can create any Flex component you want, such as a movie player, and then use the Ensemble API to grab triples and use them in processing. (I never realized that a running copy of Eclipse has an <a id="id103481" href="http://www.eclipse.org/jetty/">HTTP server</a> that you can use as the basis for applications.) Because the UI that you design can trigger manipulation of the data in a triplestore using SPIN and SPARQLMotion, you can build complete applications around triplestores for use by people who may not even know what RDF is but who need to work with that data using a form-driven interface.</p>

<p id="id103495">Once you build an application with Ensemble, TopBraid Live lets you deploy it on a server for others to use. I saw Scott help a customer deploy an app, and the process pretty much looked like zipping up some files and then unzipping them to the right place on a server that the app's users would have access to.</p>

<p id="id103504">With SPARQLMotion as a development tool and TopBraid Live as a deployment tool, it's easy to picture an information publisher having staff members who do nothing but full-time SPARQLMotion development, creating apps that mix and match data from all the different data sources available to that publisher in order to build information products and applications around those data sources. (The data might be available as native RDF, but would more likely be in a host of other formats available to the SPARQLMotion scripts using its automatic converters.) Using TopBraid Live, the publisher would use these apps to deliver content in any format necessary to their customers. The publisher would have an agile platform for creating new information products whose components may have started off in separate silos and would have taken a lot more work to integrate without TopBraid Ensemble. Of course, there's more to it than the easier integration provided by the RDF data model; the possibilities that RDFS, OWL, and now SPIN provide for adding metadata to the content should be very attractive to publishers as well.  </p>


    </div>
]]>
    </content>
</entry>

</feed>
