When I started my career in tech writing, it was the age of the writer. Tech writers tended to work independently on a single book for months at a time. Better, for many, they not only got to write the book, they got to design it and shepherd it through the publication process. At the end of the process a book arrived from the printer and you got to keep a copy — I still have several. It was, from beginning to end, your work, your product, your book.
Fewer of us get to work that way today. We now live in the age of the content manager. Writers contribute chunks of text to content management systems that spit them out in various combinations. There is no end-to-end ownership. Not everyone works like this, but it is an increasingly dominant model, and the model which just about every pundit in the industry is urging companies to adopt.
It is not just in the workplace that we see content management predominate. Back in the day, tech comm conferences were dominated by discussions of writerly subjects. They were about writing. They were about rhetoric. They were about learning. Today, they are dominated by content management. They are about findability. They are about taxonomy. They are about reuse. They are about databases and CMSs.
Writerly virtues, alas, often take a back seat to content management virtues. A higher percentage of reuse becomes the goal to strive for. Metrics like this are what we are judged on in the age of the content manager.
I bear a portion of the credit/blame for this situation. My 1995 paper on Component Based Information Development (Proceedings of SGML 95) called for a move away from managing documents to managing components of documents. Maybe I just like to blow against the wind, but as hard as I campaigned for component content management in the age of the writer, I now find myself increasingly campaigning for writerly virtues in the age of the content manager.
Make no mistake. We need to manage content. We don’t necessarily need to make percentage of reuse our primary metric, but we do need to collaborate on the production and management of content. I would never advocate for a return to writing books in cubicles. But I do feel that writers have lost out in the way content management is practiced today, and that content has suffered in the process.
What I had always hoped for is that structured writing and content management would become part of the writer’s toolkit and the writer’s mindset: that we would achieve a content management driven by writers’ values and writers’ appreciation of the subtlety and even the beauty of communication. Instead what we have is a content management driven by a database mindset, a mindset of rows and columns, a mindset of taxonomies where words are lined up like soldiers on parade. It is an orderly world, and internally it is a very efficient world. Sometimes it improves the actual efficiency of the content process, by whatever measurement of efficiency matters to the business.
Sometimes it does not. Content management projects often fail to deliver the hoped-for productivity. Sometimes the messiness of the content refuses to submit to the orderliness of the content management system. The CMS mindset blames the writer for failing to fit their work to the prescribed structures. The writerly mindset blames the CMS structures for failing to represent the real messiness and complexity of the world they are trying to write about. Both may have a point. The larger point is that, 20 years after my Component Content Management paper, writers and content managers are still strange bedfellows.
Part of the reason that content management did not become part of the typical writer’s own toolkit was that the technology was simply too hard. SGML and RDBMS (the tools of choice in 1995) have given way to XML and CCMS, but they are still highly complex. There are nicer interfaces now, but they don’t hide the need to understand the underlying structures and the algorithms that process them in order to build a structured writing environment. Writers were able to learn to set up Frame’s complex stylesheets and to use its cross-reference mechanisms and TOC and index generation features because these were conceptually simple enough and fit within a domain of knowledge that writers understood. Structured writing and content management are far more abstract and require algorithmic thinking.
Thus the tools that writers use are designed and built largely by people with database and programming backgrounds. And while there are a few of us with a foot in both worlds, it does not seem like writers and content managers are much closer to understanding each other’s mindset than they were in 1995.
A big part of the problem lies in the diversity of structure that we find across the content spectrum. Nowhere is this more evident than in technical communication. I think we can usefully divide technical communication content into four categories (other forms of content may fit them as well):
- Labeled data: This is a collection of data points with labels attached to them. This category includes tables with labeled rows and columns, lists (such as parts lists or spec sheets), and hierarchical lists. Labeled data needs stories to explain what the data means. Every label in labeled data is a reference to a story that explains the data.
- Narrated data: The same information — data points — as with labeled data, except that rather than being presented as labeled fields, it is written as sentences. Sports stories and annual reports are full of narrated data. Many Wikipedia articles contain a data box beside the main text which contains the same data points in labeled data format as the text contains in narrated data format.
- Structured stories: Stories are different from data points. Stories build worlds by appealing to stories that the reader already knows. They are always subject to a degree of misunderstanding because different readers understand their allusions to other stories in different ways. Structured stories are stories that follow a fixed or shared pattern for a particular type of subject. Recipes are structured stories.
- Unstructured stories: Unstructured stories are stories that do not follow a fixed or shared pattern. While many unstructured stories could be structured, the structure of structured stories, like the labels of labeled data, is a reference back to a story, and if you follow that chain of references back, you will necessarily reach unstructured stories. Structure, in other words, is itself a story: an unstructured story that establishes a structure for telling other — structured — stories.
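The difference between the first three categories can be sketched in code. This is only a rough illustration, not a proposed schema: the `Recipe` class, its field names, the `narrate` and `missing_ingredients` helpers, and the sample values are all hypothetical, invented for the example.

```python
from dataclasses import dataclass

# Labeled data: data points with labels attached (hypothetical values).
labeled = {"name": "Pancakes", "serves": 4, "prep_minutes": 10}


def narrate(data: dict) -> str:
    """Narrated data: the same data points, written as a sentence."""
    return (
        f"{data['name']} serves {data['serves']} and takes "
        f"{data['prep_minutes']} minutes to prepare."
    )


@dataclass
class Recipe:
    """Structured story: a shared pattern for one type of subject.

    The fields look like labeled data, but there is a story relationship
    between them: the steps only make sense in terms of the ingredients.
    """

    title: str
    ingredients: list[str]
    steps: list[str]

    def missing_ingredients(self) -> list[str]:
        """Return ingredients that no step ever mentions."""
        text = " ".join(self.steps).lower()
        return [i for i in self.ingredients if i.lower() not in text]


recipe = Recipe(
    title="Pancakes",
    ingredients=["flour", "milk", "egg"],
    steps=["Whisk the flour and milk.", "Fry the batter."],
)

print(narrate(labeled))
print(recipe.missing_ingredients())
```

A check like `missing_ingredients` is only possible because the structure is specific to recipes. A generic "task" structure would have no notion of ingredients to check the steps against. An unstructured story, by contrast, would just be a string, with nothing for a tool to hold on to.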
I have noted before that the DIKW pyramid, which makes data the base, is an inversion of the truth. Stories are the base, and data rests on a pyramid of stories, without which it would simply be noise. Yet the DIKW pyramid is an idea that often comes up in content management circles. In many ways, it expresses the content manager’s view of the world: capture the fundamental data in an orderly system in which every word means one thing and one thing only, and build everything else from that.
Systems built on that premise can solve real world business problems and save companies lots of money. But these successes are not consistently reproducible. The whole content universe cannot be reproduced from data, because this system is an inversion of the reality that stories, not data, are the base on which everything else is built.
The logic of the content management world view is that the model should be extensible across the enterprise. If stories can be created from data, then stories of every kind, to meet every need, can be created from data across the whole spectrum of content.
(I’m not suggesting that every content manager holds this view. I’m painting the contrast between the content management and the writerly view in deliberately stark terms to demonstrate the tension between them. There is a tug-of-war between these views and many people in the industry, like myself, have pulled on both sides of the rope at different times. This sort of tension is inevitable when you try to reconcile two different approaches because you want the benefits of both.)
The logic of the writerly world view suggests something very different. If stories are the foundation, and if data is precariously balanced on a pyramid of stories, doomed to become noise, or at least to be misinterpreted, by anyone who does not know the exact set of stories that explains it, then the idea of data as the solid foundation on which all content rests seems much less supportable.
Does this mean that we should go back to writing books in cubicles? Not at all. The economy of language relies on our presenting information in a way that can be used efficiently by people who know the stories on which it depends. A content world consisting entirely of unstructured stories would communicate with dreadful inefficiency. We need to produce structured stories, narrated data, and labeled data. We probably need far more of these than we currently produce. Structured writing and content management techniques are often useful tools for creating and managing them.
However, the part of this spectrum that content management has the hardest time with is actually structured stories. Content management can deal with unstructured stories by storing them as a blob and attaching a metadata record to them. This is inadequate in many ways, from the writerly point of view, and from the point of view of the foraging reader, but it meets the needs of content management to create manageable objects. The unstructured story sits inside a container with a labeled-data metadata record attached to it, and the content management system interacts with the metadata record. It has no need to involve itself at all in the authoring of the unstructured story.
With structured stories, however, things are different. The structure of a structured story is actually labeled data, but there is a story relationship between the data fields. The structure of structured stories is not generic. It is highly particular to the intersection of the subject matter and the task. Thus a recipe is highly specific to the intersection of food and the task of cooking. Current content management practice tries to reduce the specificity of structured stories down to a few specific types. Task, concept, and reference are the most common types. But structured stories come in hundreds of different types.
The content manager can choose to ignore the structure of structured stories and treat them like unstructured stories. But that leaves the author without the tools to create and manage structured stories efficiently and reliably. It also makes content management more complex and less reliable by reproducing the metadata in an external label that already exists in the structure of a structured story. (It is an iron law of the universe that if you have metadata in two different places, they will be inconsistent with each other.)
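The "iron law" about duplicated metadata can be illustrated with a small sketch. Everything here is hypothetical (the field names, the `inconsistent_fields` and `derive_record` helpers); it stands in for the general situation of an external metadata record that copies what the story's own structure already says.

```python
# A structured story whose structure already carries its own metadata.
story = {"type": "recipe", "title": "Pancakes", "serves": 4}

# An external metadata record that copies those fields by hand.
external_record = {"type": "recipe", "title": "Pancakes", "serves": 4}

# The author revises the story; nothing forces the copy to follow.
story["serves"] = 6


def inconsistent_fields(story: dict, record: dict) -> list[str]:
    """Return the record fields that no longer agree with the story."""
    return sorted(k for k in record if story.get(k) != record[k])


def derive_record(story: dict, fields=("type", "title", "serves")) -> dict:
    """Build the metadata record from the story instead of copying it."""
    return {k: story[k] for k in fields}


print(inconsistent_fields(story, external_record))
print(inconsistent_fields(story, derive_record(story)))
```

The hand-copied record drifts as soon as the story changes, while a record derived from the story's structure cannot disagree with it. Deriving is only possible, of course, if the system recognizes the structure of the story in the first place.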
But there is another aspect of stories that needs to be managed: the relationship between stories. As we noted, stories tell stories by referring to other stories — stories that the reader is presumed to know, but often does not, or does not know well enough, or does not know by the same name. Placing stories into categories and locating them in hierarchies does nothing to express this vital story-to-story relationship.
But in producing and managing these more structured forms, we need to remember that they are dependent on a complex base of stories and of the connections and relationships between stories. The full complexity and subtlety of these relationships is beyond what any formal data structure can express (the base is always unstructured stories). But there is one form of organization that can come closer than any other to modeling those relationships and making them navigable, and that is hypertext.
Hypertext is definitely part of the world of stories. But it is not the old world of books. It is not even the old world of storytelling. It is a new world of story sharing. Because it is story-based, hypertext is less precise than a database table or a metadata record, but its mechanisms are also capable of handling labeled data. More importantly, hypertext is capable of seamlessly expressing the relationships between labeled data, narrated data, structured stories, and unstructured stories, which in turn makes the more structured elements at the top of their precarious pyramid more accessible and more understandable to the reader.
I have argued before that hypertext and content management do not currently see eye to eye. Hypertext models the ad hoc and imprecise relationships between stories that content management does not know how to deal with. But I also believe that we can learn to see hypertext as a different approach to content management — a form much more in tune with a writerly world view, a view in which communication is fundamentally about stories and the relationship between stories.