
In single-source publishing, what do you call the source?

The editorial XML?

[Figure: single-source publishing diagram]

The idea of single-source publishing is at least as old as SGML. You store one version of your content with all the information necessary to create the other versions (typically, a print version plus the electronic formats du jour), and then you develop automated routines to create those other versions from the central, "single" source. The central content gets updated as necessary, and you create new publications by running the appropriate routines to generate the other formats. By making changes in one place and generating the other versions with automated routines, you avoid the mistakes that result from trying to make the same change in multiple places. It's a lot like creating different versions of a software product from a base set of source code, and many of the same tools are often used.
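As a rough sketch of that workflow, here is a small Python example. The source format (the `article`, `title`, and `para` elements) and the function names are made up for illustration, not taken from any real DTD or production system. It shows one source document and two automated routines that each generate a different output from it.

```python
import xml.etree.ElementTree as ET

# Hypothetical "single source": one small XML document that editors maintain.
# The element names are assumptions for this sketch, not a real schema.
SOURCE = """<article>
  <title>Single-source publishing</title>
  <para>Change the content once.</para>
  <para>Regenerate every output.</para>
</article>"""


def to_xhtml(root):
    """Generate a minimal XHTML-style rendition of the source."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = root.findtext("title")
    for para in root.findall("para"):
        ET.SubElement(body, "p").text = para.text
    return ET.tostring(html, encoding="unicode")


def to_plain_text(root):
    """Generate a plain-text rendition with an underlined title."""
    title = root.findtext("title")
    lines = [title, "=" * len(title), ""]
    lines.extend(para.text for para in root.findall("para"))
    return "\n".join(lines)


def publish(source_xml):
    """Run every output routine against the one central source."""
    root = ET.fromstring(source_xml)
    return {"xhtml": to_xhtml(root), "text": to_plain_text(root)}


outputs = publish(SOURCE)
```

If an editor changes a paragraph in `SOURCE`, rerunning `publish` carries that change into both outputs, so nobody has to make the same edit in two places. A real system would more likely use XSLT stylesheets than hand-written Python, but the shape is the same.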

Output formats can include various kinds of XML, such as XHTML, XSL-FO, or homegrown XML formats. To describe the central format I've heard the term "editorial XML" used, because it's XML, but it plays a different role from the output XML formats: it's the version that the editors maintain. Its structure is governed by the editorial DTDs or schemas.

A homegrown XML output format is sometimes called the "delivery" XML. In his XML 2006 presentation "Case Study: Managing XML for a Global Content Delivery Platform," my former LexisNexis colleague Marc Basch described how different business units within the company had their own editorial XML formats and converted these to a particular delivery format that they developed for the central platform that would aggregate them.

I've heard this editorial/delivery distinction between sets of XML content elsewhere, but a Google search on the terms doesn't find much besides Marc's paper. Has anyone else heard of these terms being used in a production system?

Comments


No Bob, the editorial version is the M$ Word version!

I use master source for the XML master, however generated.

I've noticed the master version is traditionally the original FrameMaker/Word version as well. It is not ideal, but with tools like WebWorks (which uses an XML intermediary format for processing), creating an XML-based workflow while keeping the author tools seems possible.

I think the immediate future will be focused on an author-focused master with an XML-based master that acts almost as an index. Again, it would be nice to just have one resource as a master, but I think the current technical documentation landscape doesn't quite support it.

I really do think that using tools like WebWorks can make single sourcing practical from a technical standpoint. Essentially it allows a gateway for tools to add to a structured data store. I know it has been considered primarily a help tool in the past, but for those folks who deal with XML, it really is much closer to a document processing XProc-ish tool built around XSLT.

I should also mention that I used to work for WebWorks, so I'm a hair biased. From a technical standpoint, it is a very powerful tool that I believe many XML developers should consider taking a look at for document processing projects.

Bob,

The editorial XML version is the color-coded ODT XML version :-)

I use the term "normative copy" rather than master/editorial myself. That is, it is the stream of bytes that must be obeyed and, in the event of a dispute about correctness of content or presentation, acceded to.

Sean

In the publishing industry, I have definitely heard the term “delivery XML” used commonly to refer to various XML outputs (mostly for web or third-party aggregators). However, I have not really heard a common term for the source XML, and I'm not sure I've ever heard people in that industry refer to it as "editorial XML" (that may be because many in that industry are still getting XML after or as a part of print production processes).