When people talk about semantic technology, they're often talking about technology that has nothing to do with semantics. They're talking about the new possibilities that the RDF data model and the SPARQL query language add to distributed database applications, and there's a lot to talk about. As Jim Hendler once wrote,
"My document can point at your document on the Web, but my database can't point at something in your database without writing special purpose code. The Semantic Web aims at fixing that."
Why do we describe technology for easier integration of machine-readable data on the web as "semantic"? I don't mean to pick on Jim (I had the quote handy because it's in my file of favorite quotes, and few understand the semantic add-ons to Linked Data that will make for a proper Semantic Web better than he does), but I don't see semantics necessarily playing much of a role in the technology evolving to let web databases easily point at each other. There are some semantics built into the middle third of every RDF triple, because the requirement that a predicate use a full URL means that I can't just say "title" there, leaving you to wonder whether I'm talking about a job title, the deed to a piece of property, or the title of a work; I have to say something like http://purl.org/dc/elements/1.1/title to make it clear that I mean the title of a work. In other words, I must make the semantics of the triple's predicate clear.
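To make the point about predicates concrete, here is a minimal Python sketch that models triples as plain (subject, predicate, object) tuples rather than using any particular RDF library. The Dublin Core title URL is the one mentioned above; the example.org subjects, the job-title predicate, and the helper function are hypothetical names made up for illustration.

```python
# Triples modeled as plain (subject, predicate, object) tuples.
# This is an illustrative sketch, not the API of any RDF library.

# The predicate named in the text: the title of a work.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
# Hypothetical predicate for a job title (example.org is a placeholder domain).
JOB_TITLE = "http://example.org/hr/jobTitle"

triples = [
    ("http://example.org/book/1", DC_TITLE, "Moby-Dick"),
    ("http://example.org/person/7", JOB_TITLE, "Harpooner"),
]


def objects_for(predicate, data):
    """Return the object of every triple that uses exactly this predicate URL."""
    return [o for (s, p, o) in data if p == predicate]


# A bare "title" would be ambiguous; the full URLs keep the two meanings apart.
work_titles = objects_for(DC_TITLE, triples)
job_titles = objects_for(JOB_TITLE, triples)
```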
Other than that, I don't see what's semantic about exposing data as triples and using SPARQL to get at it as described in Tim Berners-Lee's original essay on Linked Data principles, except that the general ideas are an outgrowth of the older idea of the Semantic Web. We're seeing now that as more data gets exposed and linked this way, more and more possibilities open up. Once enough data is linked using this technology, there will be enough to work with to start making general-purpose semantic applications, but until then, the use of OWL and related technologies that really address semantics will be limited to niches. Companies such as TopQuadrant and Clark & Parsia are doing very interesting work in those niches, and they're blazing the trails for when the broader information technology and publishing worlds are ready to take advantage of the semantics of this linked data. (In a recent Semantic Web Gang podcast, someone said that new technology traditionally moves from NASA to the military to corporations to independent end users, and that we're seeing the reverse with Semantic Web adoption. I guess he didn't know that NASA is a client of both TopQuadrant and Clark & Parsia.)
While Zepheira's web site certainly uses the word "semantic" a lot, they seem more focused on linked data technologies as they help their clients "integrate, navigate and manage information across personal, group and enterprise boundaries." I think that this is a better place for most developers to focus, at least for now, because there's a better chance of a medium-term and even short-term payoff. That's the data infrastructure that actual semantic technologies can build on, so for now let's concentrate on the value of the infrastructure: data exposed in a standard way (either publicly or behind the firewall across internal enterprise boundaries, which I believe is where Zepheira's been helping a lot of clients) so that the growing number of tools built around those standards can take advantage of that data. This is just what the organizations in the Linking Open Data dataset cloud have been doing. There is plenty of payoff when applications can combine data from different sources to do things with no need for a central schema tying them together, and this is possible without any program logic addressing the semantics of that data.
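As a rough illustration of combining sources without a central schema, the following Python sketch merges two independently produced sets of triples that happen to use the same URL for the same book. Everything here is an assumption made for the example: the example.org URLs, the price predicate, the sample values, and the describe helper are hypothetical, and real applications would normally use an RDF library and SPARQL rather than tuples and set operations.

```python
# Two hypothetical data sources, each a set of (subject, predicate, object) tuples.
BOOK = "http://example.org/book/1"  # placeholder URL shared by both sources
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
PRICE = "http://example.org/shop/price"  # hypothetical predicate

library_data = {
    (BOOK, DC_TITLE, "Moby-Dick"),
    (BOOK, DC_CREATOR, "Herman Melville"),
}
shop_data = {
    (BOOK, PRICE, "12.99"),
}

# Merging is a plain set union: no shared schema is consulted, and the data
# lines up only because both sources use the same URL for the same book.
merged = library_data | shop_data


def describe(subject, data):
    """Collect predicate/object pairs for one subject.

    Simplification: assumes at most one value per predicate.
    """
    return {p: o for (s, p, o) in data if s == subject}


record = describe(BOOK, merged)
```

Nothing in this sketch interprets what a title or a price means; the combination works purely because the identifiers match.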
Of course, the real semantic technologies such as OWL and inferencing engines build on that infrastructure, and they will bring even cooler applications. Nevertheless, to evangelize that infrastructure and to allay the fears of enterprise IT people who remember pie-in-the-sky AI promises when they hear the word "semantic", telling them about Semantic Web technology without the semantic parts (a.k.a. Linked Data) looks like an easier sell to me.
Comments? Corrections? Is the full URL in predicates enough to say that any use of RDF triples qualifies as semantic technology? (If anyone tells me that I'm misunderstanding the term "semantics", I'll be tempted to say "well, that's just semantics", so be forewarned.)