Earley & Associates is one of the biggest names in taxonomy development, and founder Seth Earley will be giving a talk on "Building a Practical Semantic Framework: The role of taxonomies and controlled vocabularies in data integration" at the Linked Data Planet conference next week. My recent reading makes the world of taxonomy development look a lot more mature than the ontology development that plays such a significant role in the semantic web, especially in terms of identifying concepts and relationships in a way that helps businesses achieve specific goals. I interviewed Seth via email to learn more about his company and their relationship to the burgeoning world of Linked Data techniques and practices. (As a side note about taxonomies and Linked Data, I recently learned from Kingsley Idehen's blog about a very interesting Linked Data application of one of the most important taxonomies in the US: the Library of Congress Subject Headings. If you follow the links in his bulleted list, remember to do a View Source on them.)
1. Tell me a little about your company.
Earley & Associates delivers consulting and application development services that help companies leverage internal expertise and knowledge-creating capabilities. We specialize in:
- Enterprise taxonomy development
- Content management & Knowledge management
- Technology advisory
- Search strategy & integration
- Change management & governance
- Training & workshops
We are a small company of around 15 full-time consultants, but we work with all sizes and types of organizations. Some of our recent clients include:
- The Hartford
- The Ford Foundation
- Hasbro Inc.
- The Coca-Cola Company
We are recognized within the industry as thought leaders, and many of our consultants speak regularly at conferences and workshops, including:
- Enterprise Search Summit
- Enterprise3 Portals, Collaboration & Content
- Taxonomy Bootcamp
- KM World & Intranets
We also maintain a regular community of practice (CoP) call series covering a diverse range of topics, from search, taxonomy & metadata to usability testing and web analytics.
2. What does the idea of Linked Data mean to you?
I think Linked Data is really an extension of concepts and questions that we have been dealing with in the information management field for years. Which is to say, how can we make meaningful connections between the information that we use to do our work? How can we understand it within a context?
In the case of Linked Data, we are attempting to expand this notion of connections or linking from strictly web pages and documents to structured data and other types of resources that can be represented through RDF, and making those connections explicit.
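(An illustrative aside, not part of Seth's answer: the idea of making connections explicit can be sketched as subject-predicate-object statements, which is the basic shape of RDF data. The example.org URIs, the `Triple` type, and the helper functions below are all made up for illustration, and a real project would more likely use an RDF library than plain tuples.)

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One explicit link: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace; none of these resources actually exist.
EX = "http://example.org/"

triples = [
    Triple(EX + "report-42", EX + "vocab/author", EX + "people/jane"),
    Triple(EX + "report-42", EX + "vocab/about", EX + "topics/taxonomy"),
    Triple(EX + "people/jane", EX + "vocab/memberOf", EX + "teams/search"),
]


def links_from(resource: str, data: list) -> list:
    """Return (predicate, object) pairs for every link leaving a resource."""
    return [(t.predicate, t.obj) for t in data if t.subject == resource]


def to_ntriples(data: list) -> str:
    """Serialize URI-only triples in N-Triples style, one statement per line."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ." for t in data)


if __name__ == "__main__":
    for predicate, obj in links_from(EX + "report-42", triples):
        print(predicate, "->", obj)
    print(to_ntriples(triples))
```

The point of the sketch is that the link itself carries a name (the predicate), so a document, a person, and a team can all be connected in the same way that web pages are connected by hyperlinks.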
3. What can Linked Data practices and technologies bring to the challenges that Earley & Associates clients are facing?
For the most part our clients have come to recognize the incredible challenge of creating a shared semantic framework within their organization. In this case, we understand the term semantic, not in reference to the semantic web, but in relation to a controlled vocabulary that has a particular meaning to an organization and the content it manages.
In our experience, most organizations are not yet at a level of information management (IM) maturity where linked data practices are really relevant to their current needs.
That being said, there is incredible potential for linked data technology to create a richer information environment both on the semantic web and in the organization. The explicit nature of the links made using RDF certainly presents a new level of granularity in defining the relationships of one item of content to another.
4. How would you distinguish "Linked Data" projects from "Semantic Web" projects? Or would you?
I suppose it’s possible to invest in linked data projects that are enterprise focused, in that the information lives outside the semantic web behind a firewall. However, the main driver around the creation of linked data is to build the semantic web and create links between disparate data sources. I think the business case is really still in its early stages.
5. Semantic Web discussions often bring up the role of ontologies. Is it possible to differentiate between the potential roles of taxonomies and ontologies in Linked Data and/or Semantic Web efforts?
The line between what is possible to represent with a taxonomy and what is possible to represent in an ontology is fuzzy. Taxonomies, in a traditional sense, are solely hierarchical in nature, representing a general-to-specific relationship, whereas an ontology is capable of representing a much larger range of relationships.
However, in our work with clients developing taxonomies, the inclusion of polyhierarchical relationships, as well as reciprocal "see also" relationships, has become commonplace. Now these types of relationships certainly fall outside of the most traditional taxonomy definitions but also fall short of the complexity that can be modelled with RDF and OWL.
I think the most important aspect of the question is deciding what the real application of either taxonomy or ontology will be, and making sure you have the metrics in place to be able to justify the effort it takes to develop either one.
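(An illustrative aside, not part of Seth's answer: the polyhierarchical and reciprocal "see also" relationships he describes can be sketched with a small data structure in which a term may have more than one broader term. The `broader` and `related` field names loosely echo SKOS terminology, and the product terms are invented for the example; this is one possible representation, not how any particular taxonomy tool works.)

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    broader: set = field(default_factory=set)   # parents; more than one means polyhierarchy
    related: set = field(default_factory=set)   # reciprocal "see also" links


terms = {}


def add_term(label, broader=()):
    """Register a term with zero or more broader (parent) terms."""
    terms[label] = Term(label, set(broader))


def relate(a, b):
    """Record a "see also" link in both directions."""
    terms[a].related.add(b)
    terms[b].related.add(a)


def ancestors(label):
    """Collect every broader term reachable from a term, across all parents."""
    seen = set()
    stack = list(terms[label].broader)
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(terms[parent].broader)
    return seen


# Hypothetical terms, invented for this sketch.
add_term("Products")
add_term("Education")
add_term("Toys", broader=["Products"])
add_term("Board games", broader=["Toys"])
add_term("Educational games", broader=["Toys", "Education"])  # two parents
relate("Educational games", "Board games")

if __name__ == "__main__":
    print(sorted(ancestors("Educational games")))
    print(sorted(terms["Board games"].related))
```

A strictly traditional taxonomy would allow only one entry in `broader` and no `related` links at all, while an ontology language such as OWL would let you define arbitrary named relationships and constraints beyond these two.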
6. Some enterprises have already invested in taxonomies. How can they leverage this in Linked Data projects?
This really comes down to the nature of the taxonomy itself. Proponents of the semantic web recommend the use of standard vocabularies (e.g. FOAF, SIOC, DOAP, etc.) for representing content.
If the taxonomy that an organization has already invested in represents a very specific and organization-centric domain of information, there may be a lot of work required to align it with the standardized vocabularies recommended for the semantic web.
Again, I think it comes down to planning and alignment of effort with an overall information strategy. Anytime you decide to describe a piece of information so that it can be shared, you enter a highly charged and political world. Building a taxonomy is as much about understanding people as it is about content. If that understanding can be shared through a linked data project, then great. However, I would suggest that a key priority of most organizations is still understanding what the value and meaning of their own content is to them.
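(An illustrative aside, not part of Seth's answer: one rough way to gauge the alignment work he mentions for an organization-centric taxonomy is to list which in-house terms already have a counterpart in a standard vocabulary and which do not. The in-house terms and the mapping below are hypothetical; the two FOAF classes used, `Person` and `Organization`, are real FOAF terms.)

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical in-house taxonomy terms. None means no standard counterpart
# has been identified yet, so the term would need modelling work.
alignment = {
    "Employee": FOAF + "Person",
    "Business unit": FOAF + "Organization",
    "Engagement": None,
    "Deliverable type": None,
}


def alignment_report(mapping):
    """Split terms into those with a standard-vocabulary match and those without."""
    aligned = sorted(term for term, uri in mapping.items() if uri)
    unaligned = sorted(term for term, uri in mapping.items() if not uri)
    return aligned, unaligned


if __name__ == "__main__":
    aligned, unaligned = alignment_report(alignment)
    print("Already aligned:", aligned)
    print("Needs work:", unaligned)
```

The longer the second list, the more of the taxonomy is specific to the organization, and the more planning the alignment effort is likely to need.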