Putting semantics on the web

Painlessly adding RDF-compatible semantics to XHTML.

University of Maryland semantic web researcher Jim Hendler (XML 2005 attendees will remember his keynote speech in Atlanta) closed a recent mindswap weblog entry by writing this:

I worry that most of the Semantic Web community is doing work in Semantics, most of the rest are looking at Web apps, and hardly anyone is actually looking at the "Semantic Web" that I really care about...

I've felt for a while that too much work is about building ontologies and schemas and that not enough is about creating actual instance data to use. Imagine if, in the early days of XML, everyone spent most of their time creating DTDs and assumed that others would create XML documents to conform to their DTDs, while those others were just creating more DTDs themselves.

More semantic web data would give the people doing semantic web research more opportunities to create interesting applications, and the W3C's RDF in XHTML Taskforce, chaired by Ben Adida, is doing some great work here. While their earlier work focused on convincing the RDF crowd of the value of this RDF-XHTML hybrid, recent drafts of the RDF/A Primer take on the bigger task of demonstrating this value to people on the other side of the RDF-XHTML fence. For example, the Primer demonstrates how adding a few new attributes to your existing XHTML lets automated programs pull your contact information from your web page to update the departmental directory, or pull picture metadata from a page about your recent trip and add it to your central database of photo information.
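
To make the contact-information case concrete, here is a rough sketch in Python of what such an automated program might do. It is not the Primer's example and not a conforming RDF/A parser: the sample markup, the person's details, and the extract_triples function are all invented for illustration, and the sketch assumes a simplified model in which an "about" attribute names the subject and a "property" attribute names the predicate for the element's text.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment: an ordinary contact block with a few
# extra attributes. The name and phone number are made up.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <body>
    <div about="#me">
      <p>Name: <span property="foaf:name">Jane Example</span></p>
      <p>Phone: <span property="foaf:phone">555-0100</span></p>
    </div>
  </body>
</html>"""


def extract_triples(xhtml):
    """Return (subject, predicate, object) tuples found in the markup.

    Simplifying assumptions: the nearest enclosing "about" attribute is
    the subject, "property" is the predicate (left as a prefixed name,
    not expanded to a full URI), and the object is the "content"
    attribute if present, otherwise the element's text.
    """
    triples = []

    def walk(elem, subject):
        # An "about" attribute changes the subject for this subtree.
        subject = elem.get("about", subject)
        prop = elem.get("property")
        if prop is not None:
            value = elem.get("content")
            if value is None:
                value = "".join(elem.itertext()).strip()
            triples.append((subject, prop, value))
        for child in elem:
            walk(child, subject)

    # An empty subject stands for the document itself.
    walk(ET.fromstring(xhtml), "")
    return triples


if __name__ == "__main__":
    for triple in extract_triples(XHTML):
        print(triple)
```

The point of the sketch is that the page a person reads and the data a program extracts come from the same file, so there is no parallel RDF/XML document to keep in sync.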

In a posting today on Cologne's 2nd Web Montag, Benjamin Nowack writes that a common question about the semantic web is "'where is the connection to my HTML pages?' (link between the clickable web and the semantic web)." I think we have our answer in RDF-XHTML. Airlines and movie theater chains are already creating web versions of the data we want, and a few new attributes in that HTML will be a lot less trouble for them than creating parallel RDF/XML files or most other options for offering machine-readable semantic data on the web. I look forward to the new possibilities that this opens up for both developers and consumers.


Perhaps an alternative can be found in the pairing of GRDDL and microformats. It is another approach, perhaps less expressive, but a practical one that links directly to the classical web.

There's definitely good work going on with GRDDL as well, but while it will be possible for different websites to use a shared library of stylesheets, the XSLT stylesheet used for extraction is another moving part for each web site to contend with.

I think that RDF/A itself takes a rather microformat-like approach.