Introducing the SPFE Architecture

By | 2012/02/05

Today, I am announcing the launch of a new website, [EDIT: Information on SPFE has moved to GitHub. See] The SPFE architecture is a design for building structured authoring systems. Why would the world need such a thing when it already has DITA? Why have I spent the last 15 years or so working on what I now call SPFE?

I’ve been involved with structured writing and database publishing for pretty much my whole tech writing career. Actually, I can trace my interest back to university days. My parents were in England for my father’s sabbatical, and I was visiting for the summer. My brother was a baseball fan and was frustrated because, while the Times did publish the baseball scores, it never published a league table. Very much a 20th-century problem, of course, but this was before the web, when paper was still the principal medium of information. My brother had a Commodore VIC-20, so we spent the summer teaching ourselves VIC-20 BASIC and writing a program that would let us enter the scores from the paper each day and construct a league table. It was the first real program I ever helped to write, and it was, essentially, a database publishing program. It took data, in the form of game scores, and calculated and presented a league table based on that data.
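For readers who would like to see that principle in miniature, here is a rough sketch in modern Python. It is not the original VIC-20 BASIC program. The record layout, the made-up team names, the function names, and the choice to rank by winning percentage are all assumptions made for illustration.

```python
from collections import defaultdict


def league_table(scores):
    """Derive standings from (home, home_runs, away, away_runs) records."""
    records = defaultdict(lambda: {"won": 0, "lost": 0})
    for home, home_runs, away, away_runs in scores:
        if home_runs == away_runs:
            continue  # assumption: tied or suspended games are ignored
        winner, loser = (home, away) if home_runs > away_runs else (away, home)
        records[winner]["won"] += 1
        records[loser]["lost"] += 1

    rows = []
    for team, rec in records.items():
        played = rec["won"] + rec["lost"]  # always >= 1 for a recorded team
        rows.append((team, rec["won"], rec["lost"], rec["won"] / played))

    # Best winning percentage first; equal percentages are ordered by name.
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows


def render(rows):
    """Present the derived standings as a fixed-width text table."""
    lines = [f"{'Team':<10}{'W':>3}{'L':>3}{'Pct':>7}"]
    for team, won, lost, pct in rows:
        lines.append(f"{team:<10}{won:>3}{lost:>3}{pct:>7.3f}")
    return "\n".join(lines)


# Hypothetical sample data standing in for a few days of newspaper scores.
SCORES = [
    ("Lions", 5, "Tigers", 3),
    ("Bears", 2, "Lions", 4),
    ("Tigers", 6, "Bears", 1),
]

if __name__ == "__main__":
    print(render(league_table(SCORES)))
```

The point of the sketch is the separation: the scores are the only thing anyone types in, and the table is always computed and formatted from them, never edited by hand.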

My first attempt to apply the database publishing principle to technical communication came when I was at Nortel documenting the NorStar phone system. Nortel was bringing out a new system, a small-scale precursor to today’s cell phones, with Star-Trek-style handsets in place of desk phones. At a pinch, the coverage would reach as far as the parking lot, but it was very cool at the time, and naturally it would be premium-priced. All the switching components of the system were actually NorStar components with a different logo stamped on the plastic. The docs had to look and feel different, but 80% of the content was actually the same. An easy enough single-sourcing problem today, but there were no single-sourcing tools then, and the company hired a brand new documentation team, of the same size as the existing NorStar team, to prepare new docs for the new product.

The waste was obvious, but the tools to address it were non-existent, so we set out to invent single-sourcing for ourselves. I ended up with the grand title of Manager of Information Engineering Methods. The consulting team that I hired to help us included some folk from a pioneering SGML company named OmniMark Technologies, where I ended up working for six years. There I created and managed the Tech Pubs department, and we spent the whole of that time inventing and building SGML-based structured authoring, database publishing, and content management systems, adapting the system design as we evolved our information architecture. This work drew deeply on the professional services work that the company was doing, in which they were coming up with original solutions to what were then very new problems.

While at OmniMark I got to have an early look at the development of DITA while it was still an internal project at IBM. (IBM used OmniMark in their SGML publishing chain.) There was no indication then of the importance DITA was going to assume in the technical communication world. There were efforts underway at that time to define common industry-specific DTDs and similar standards, but all working structured writing systems in those days were essentially designed from scratch. These came in two varieties: homegrown implementations of generic publishing capability, and highly specific systems that performed customer-specific content manipulations which, in some cases, accounted for millions of dollars in cost savings.

The homegrown implementations of generic publishing capability were created before there were mature public alternatives. Most have probably been replaced by things like DocBook or DITA, and rightly so. There is no value in maintaining a homegrown implementation of publicly available capability. If you are going to implement a local system, it has to be able to provide value that you can’t get from publicly available systems. And, of course, every company develops and maintains a number of local systems, for various business functions, precisely because they can’t get the same value from generic public applications. Recognizing this, the software industry produces high-level application development platforms and toolkits: a generic base on which many local applications can be built more swiftly and cheaply. These platforms and toolkits provide a necessary middle ground between using totally generic applications that don’t fit local needs, and the time-consuming and expensive process of developing an application from scratch. DITA provides one such toolkit; SPFE will provide another.

When we were first introduced to the DITA project, there was already a debate within OmniMark Technologies about whether it would be possible to create an architectural paradigm or toolkit that would allow for the more rapid and inexpensive construction of customized structured writing systems. We recognized that this would be necessary if structured writing was going to spread to small and medium-sized tech docs organizations. The military, aviation, marine, and massive content aggregation projects where SGML was first used could afford to spend thousands to save millions, but smaller organizations could not. But we did not see in DITA the database publishing approach that was at the heart of the really successful and productive projects of that time. In particular, DITA lacked (as it still lacks) the soft linking capability that I described in More Links, Less Time, which is something we considered an essential capability of any system.

As often happens to pioneering companies, OmniMark Technologies missed a technology window. It bet against XML, believing that SGML was superior technology (which it is, but XML is cheaper and easier to understand). The company ended up being sold to Stilo International, which still produces the OmniMark language today. I moved on to create my own company, Analecta Communications Inc., but I continued to work on figuring out an architecture that would make it easy and inexpensive to implement a customized structured writing system using database publishing principles. Over the years I have continued to develop it, and have proved it out in working installations with multiple writers.

Meanwhile, of course, DITA grew larger and larger, and captured the imagination of some of the leading lights of technical communication. It has saved some organizations a lot of money, mostly through reuse and the attendant savings on translation costs. But it also displays the high degree of content management overhead that you get when you try to manage a large number of objects without imposing sound database principles in the design. There are enough issues with DITA, and enough important problems to which it offers no compelling solution, that there is still room for an alternative structured writing architecture. And it is not healthy for the industry, or for DITA, that there should be no alternate platform available. Choosing between different models, optimized for different things, leads to a deeper understanding of whichever system is chosen.

What DITA’s success makes clear, however, is that to be successful, a structured writing architecture needs to have a name, to be well defined, and to have a toolkit that people can use to get started with. That is what the new website is for. [NOTE: The website is now closed. The SPFE project is dormant. You can find it here:] It gives the architecture a name, SPFE, and it gives it a high-level definition and description. A toolkit is in the works. The website also has a blog, which is where some of the more technical topics I have been discussing on Every Page is Page One will migrate. Every Page is Page One is a blog about information design. The new blog will be about authoring system design and implementation.

Why launch before there is a toolkit ready for people to experiment with? Partly because I am involved in a number of discussions in different forums and groups in which it will be much easier to talk about certain ideas if I have the information on SPFE to point to. Partly to light a fire under myself to get the first version of the toolkit out. Once the first version is out, I will be looking for anyone interested in contributing to it, as well as anyone interested in using it.

I will also be giving a talk on the SPFE architecture at the CMS/DITA North America Conference in April.

One thought on “Introducing the SPFE Architecture”

  1. VS

    “And it is not healthy for the industry, or for DITA, that there should be no alternate platform available.” +5

