Storing information about the meaning of terms (their "semantics") can make data more valuable. Critics of semantic web technology dismiss this as pie-in-the-sky AI talk: how can you encode the real meaning of words? More importantly, how can you do it in a way that programs can read and use to solve real data problems?
The answer is very simple: you don't have to encode all of a term's semantics to get value from the standards and software used to do so. Let's look at an example.
What are the semantics of the word "spouse"? What does it mean to a recently engaged nineteen-year-old girl? What does it mean to a fifty-year-old man who's been divorced three times? What does it mean in a court of law in California, Mississippi, Austria, or Thailand?
That's a lot of meaning to store, but we don't need to store much to make a simple, mundane database such as an address book more valuable. Let's say my address book includes the following facts, and I want Leroy's home phone number:
Leroy has a work phone number of 212-334-4323.
Leroy has an email address of firstname.lastname@example.org.
Loretta has an email address of email@example.com.
Loretta has a home phone number of 718-928-6621.
Loretta's spouse is Leroy.
The only facts stated directly about Leroy are his work number and his email address. Nothing recorded under his name gives a home number or says who his spouse is.
The W3C OWL web ontology language lets us declare that a property is symmetric, or as the OWL overview puts it, "if the pair (x,y) is an instance of the symmetric property P, then the pair (y,x) is also an instance of P." Given software that understands an OWL statement that spouse is a symmetric property, plus a rule I define saying that spouses share a home phone number, I can retrieve Leroy's home phone number from the little "database" above: symmetry yields "Leroy's spouse is Loretta," and the rule then gives Leroy the same home number as Loretta, 718-928-6621. (More likely, I would define a "roommate" property as symmetric and a rule saying that roommates have the same home phone number, and then declare spouse to be a subproperty of roommate, but you get the idea.) By doing this, I'd be using the OWL declaration and my rule to pull more information out of the data collection than I put into it, making the data collection more valuable.
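To make the inference concrete, here is a minimal sketch in plain Python rather than in OWL or with any real reasoner. The property names (workPhone, homePhone, spouse), the SYMMETRIC and SHARE_HOME_PHONE sets, and the infer and home_phones helpers are all names I made up for illustration; SYMMETRIC only stands in for an OWL symmetric-property declaration, and SHARE_HOME_PHONE for the custom rule about home phone numbers.

```python
# Three of the address book facts as (subject, property, value) triples.
facts = {
    ("Leroy", "workPhone", "212-334-4323"),
    ("Loretta", "homePhone", "718-928-6621"),
    ("Loretta", "spouse", "Leroy"),
}

# Assumption: stands in for declaring spouse an OWL symmetric property.
SYMMETRIC = {"spouse"}
# Assumption: hypothetical custom rule, "if x P y and y has home phone n,
# then x has home phone n" for every property P listed here.
SHARE_HOME_PHONE = {"spouse"}


def infer(triples):
    """Apply both rules repeatedly until no new triples appear."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            # Symmetry: (x, P, y) implies (y, P, x).
            if p in SYMMETRIC:
                new.add((o, p, s))
            # Custom rule: x takes on the home phone of y.
            if p in SHARE_HOME_PHONE:
                for s2, p2, o2 in known:
                    if s2 == o and p2 == "homePhone":
                        new.add((s, "homePhone", o2))
        if new <= known:
            return known
        known |= new


def home_phones(person, triples):
    """Return the sorted home phone numbers known for a person."""
    return sorted(o for s, p, o in triples if s == person and p == "homePhone")


print(home_phones("Leroy", facts))         # nothing stored directly
print(home_phones("Leroy", infer(facts)))  # found after inference
```

The custom rule is deliberately written in one direction only, so Leroy's number cannot be found until the symmetry rule has produced the "Leroy's spouse is Loretta" triple. A real OWL reasoner does far more than this loop, but the shape of the inference is the same.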
Plenty of software claims to make this kind of thing possible, but what interests me in OWL and related standards is the fact that they're standards, so that if I use OWL syntax to say "spouse is a symmetric property," a range of commercial and free software can understand and use that little bit of semantics that I've stored to help me get more work done.
It's easiest to demonstrate this with data stored using an RDF syntax, because the RDF data model has the closest fit to the subject/attribute-name/attribute-value statements in my little database above. If you prefer, though, more and more tools can keep the RDF part under the covers; my XML 2006 paper "Relational database integration with RDF/OWL" describes a related demo using address book data stored in MySQL tables. It shows a few more realistic questions that get better answers from the database because of semantics added using OWL.
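As a rough illustration of that fit, the five address book statements can be written as subject/attribute/value triples in plain Python. This is only a sketch of the data model, not RDF syntax and not the demo from the paper; the Triple type, the attribute names, and the match helper are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    attribute: str
    value: str


# The five statements from the address book, one triple each.
ADDRESS_BOOK = [
    Triple("Leroy", "workPhone", "212-334-4323"),
    Triple("Leroy", "email", "firstname.lastname@example.org"),
    Triple("Loretta", "email", "email@example.com"),
    Triple("Loretta", "homePhone", "718-928-6621"),
    Triple("Loretta", "spouse", "Leroy"),
]


def match(triples, subject=None, attribute=None, value=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (attribute is None or t.attribute == attribute)
        and (value is None or t.value == value)
    ]


# Everything stated directly about Leroy: a work phone and an email address.
for t in match(ADDRESS_BOOK, subject="Leroy"):
    print(t.attribute, t.value)

# Without any added semantics, the home phone lookup comes back empty.
print(match(ADDRESS_BOOK, subject="Leroy", attribute="homePhone"))
```

Each sentence in the address book maps onto exactly one triple, which is why an RDF-style representation needs so little reshaping of the original data.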
There's a lot we can do with this technology...