Structured Writing FOR the Web

By Mark Baker | 2013/05/17

Tom Johnson started the discussion with Structured authoring versus the web. Sarah O’Keefe and Alan Pringle took it up in Structured authoring AND the Web. My turn: Structured authoring FOR the Web.

One of my long-term grievances is that structured authoring has been adopted piecemeal. Rather than approaching it holistically as a method that can provide a wide range of quality and efficiency benefits to the authoring process, people have tended to adopt it for a single purpose, and to use it only to the extent that it achieved that single purpose.

The first such single-purpose structured authoring movement was focused on single sourcing — the ability to write content once and to deliver it to multiple formats, such as PDF and CHM help.

The second one was reuse — the ability to use a piece of text in more than one publication without having to write it and maintain it in more than one place. Reuse is now the primary aim of, and the primary reason for adopting, DITA.

Most of the structured authoring systems in use today are designed to do just these two things: single sourcing and content reuse. It is little wonder, then, that when Tom Johnson looked at structured authoring, he did not see it as a good fit for the Web. Neither single sourcing nor reuse is nearly as compelling on the Web as it is in the world of paper/PDF manuals and local help systems. And structured help systems that focus on these two functions don’t tend to do anything to address the problems that are more pressing on the Web.

Single sourcing is less of an issue on the Web, because most content written for the Web is never presented in any other format. The need for single sourcing comes mostly from content that is written first and foremost to be a book and a local help system. It may then also be translated into HTML and posted on the Web, but it is not Web-like content in its form or its assumptions. It is not Every Page is Page One content, and if you land on one page from a Google search, there is a good chance you won’t find it helpful, or even comprehensible as a stand-alone page.

Sarah O’Keefe and Alan Pringle point out that many people in technical communication still need to present their content in PDF and help formats as well as on the Web. From what I can see, though, almost nobody is writing primarily for the Web and then producing PDF and help as secondary outputs. People who have moved their documentation to the Web as their primary medium tend not to produce any other format. So again, if your focus is the Web, you have little interest in single sourcing.

(By the way, I think it is a mistake for companies to hold off making the Web their primary medium until they are willing to completely abandon PDF and help. The Web is too important a medium to be treated as secondary today. So I think people should be interested in systems that are Web oriented and also do single sourcing to PDF and help. I just don’t see much evidence that they are. If you are doing it, I’d love to hear about it in the comments.)

A big part of the demand for reuse arises when you create a set of discrete publications and you want the same content to appear in all or several of them. When your focus is the Web, however, you are not creating a set of discrete publications, but one website. As Tom Johnson argues, you don’t want to duplicate pages on your website.

When you want the reader to have access to a common piece of information, you put it on one page and link to it from other pages. In many ways, linking is to the Web what reuse is to paper/help.

This does not exhaust all the applications of reuse by any means, but it means that reuse is a less compelling feature for the production of Web-oriented content.

The Web has other challenges, some of which include:

  • Continuous evolution: while a book or help system is published infrequently and all at once, new topics are being added to a website all the time, old ones are being removed, and others are being edited. A website is never in a fixed state. It is always in flux.
  • Linking: the Web is a hypertext medium. Pages are related to each other through links. Rich links make for both a better reader experience and better SEO. But finding things to link to is time-consuming, and maintaining links in a site that is constantly in flux can be exhausting. The result is that many sites do not have as many links, or as effective links, as they should have.
  • Organization: a website needs organization, but because it is in constant flux, that organization has to be dynamic. The methods of content organization from the static book/help world just don’t work effectively on the Web. What we really want is a plug-and-play Web in which the site organizes itself so that simply adding a page to the Web results in it being fully integrated into the site’s organization and linking system.
  • Collaboration: while collaboration certainly exists in the book/help world, it is almost universal in the Web world. The number of contributors tends to be larger and more diverse, and the need for immediacy means that it is difficult to pass every contribution through a complex editorial process. What we need is a way to provide as much automatic guidance and feedback as possible to the distributed writing community, and to remove any need for them to understand how the CMS works or how the site is supposed to be organized (which is where plug and play is so desirable).

Structured authoring can provide elegant and powerful solutions to all of these problems. Unfortunately, existing structured authoring systems that were designed only for single sourcing or reuse often do not contain the kinds of structure necessary to address this class of problems.

In particular, the structures that these systems support are often simply publication semantics. They separate content from formatting by replacing formatting markup with abstract publication semantics like <title> and <emphasis>, and they separate content into chunks using a few generic topic types that have more to do with the shape of the content than its subject matter. (See The Tyranny of the Terrible Troika: Rethinking Concept, Task, and Reference.)

To address the Web issues described above, you need to have a structure that focuses more on capturing and expressing the subject matter of the content. Some existing structured writing solutions do address subject matter to some degree, but usually with an external classification system. Such systems are useful, but they don’t enable you to do automated linking using the technique described in More Links, Less Time.
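As a rough illustration of why subject matter captured in the content itself makes automated linking possible, here is a minimal Python sketch. It is not necessarily the technique described in More Links, Less Time; the Topic class, the (type, name) subject pairs, and the URLs are all hypothetical names invented for this example. The idea is that authors mark up what they mention, not where to link, and a build step resolves the links against whatever topics currently exist.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Subject = Tuple[str, str]  # (type, name), e.g. ("ingredient", "egg"); hypothetical scheme


@dataclass
class Topic:
    """A page that declares its own subject and marks up the subjects it mentions."""
    url: str
    subject: Subject
    mentions: List[Subject] = field(default_factory=list)


def build_subject_index(topics: List[Topic]) -> Dict[Subject, str]:
    """Map each subject to the URL of the topic that covers it."""
    return {topic.subject: topic.url for topic in topics}


def resolve_links(topic: Topic, index: Dict[Subject, str]) -> Dict[Subject, str]:
    """Return {mention: url} for every mention that some other topic covers."""
    links = {}
    for mention in topic.mentions:
        target = index.get(mention)
        if target is not None and target != topic.url:
            links[mention] = target
    return links


topics = [
    Topic("/recipes/omelette", ("recipe", "omelette"),
          mentions=[("ingredient", "egg"), ("technique", "whisking")]),
    Topic("/ingredients/egg", ("ingredient", "egg")),
]

index = build_subject_index(topics)
omelette_links = resolve_links(topics[0], index)
# The egg mention resolves to the egg topic. The whisking mention stays
# unlinked until someone adds a topic on that subject; no author has to
# go back and edit the omelette page when that happens.
```

Because links are resolved at build time from the subject index, adding or removing a page changes the links on every page that mentions its subject, which is the plug-and-play behavior a site in constant flux needs.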

When it comes to enabling plug and play content management, external classification schemes can help, but they don’t always ensure that the content does the job it is supposed to do in the way it is supposed to do it. They can identify the subject that the topic discusses, but they can’t validate that the topic is true to its type and that it covers its topic completely and consistently and according to the agreed form. For that you need a structure that is specific to the particular subject matter, whether it be a recipe or a car review or a tourist guide to a small town. As R. Stephen Gracey says (as quoted by Sara Wachter-Boettcher in Content Everywhere: Strategy and Structure for Future-Ready Content):

Your goal cannot be to have as few content types as possible, with as few fields as possible. A system with only one content type is like a site with only one navigation link, labeled “Everything.” Once you’ve accepted that a good system has “just the right number” of content types, the way is clear to be expansive in discovering how many types there might be.
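To make the idea of a subject-specific content type concrete, here is a minimal Python sketch of a recipe type that can check whether a topic is complete and true to its type. The Recipe fields and the validation rules are assumptions chosen for illustration; a real content type would be defined by whatever the organization agrees a recipe must contain.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Recipe:
    """A hypothetical subject-specific content type; the fields are illustrative."""
    title: str
    servings: int
    ingredients: List[str] = field(default_factory=list)
    steps: List[str] = field(default_factory=list)


def validate_recipe(recipe: Recipe) -> List[str]:
    """Return human-readable problems; an empty list means the topic is complete."""
    problems = []
    if not recipe.title.strip():
        problems.append("A recipe needs a title.")
    if recipe.servings < 1:
        problems.append("Servings must be at least 1.")
    if not recipe.ingredients:
        problems.append("List at least one ingredient.")
    if not recipe.steps:
        problems.append("Add at least one preparation step.")
    return problems


draft = Recipe(title="Omelette", servings=2, ingredients=["eggs", "butter"])
feedback = validate_recipe(draft)
# feedback reports the missing preparation steps, which an authoring
# environment could show to the contributor before the page is published.
```

A generic topic type with only a title and body could not give this kind of feedback, because it has no way of knowing that a recipe without steps is incomplete.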

For collaboration, we need to create authoring environments that are both inviting and easy to use. That does not mean a blank screen, which requires the author to do all the work of defining the structure and requirements of the topic before they write it. Professional writers may feel they need to work that way, but everybody else will welcome as much guidance as we can give them. Thanks to the forms-based authoring mode in oXygen XML, we can now create elegant and useful forms that almost anybody in the company can use to create reliable content that is true to its type and complete. Creating this helpful and easy environment is key to successful collaboration, and, far from being an impediment, structured writing, paired with the right interface, is the key to getting it. As Sara Wachter-Boettcher comments,

The primary users of content management systems are their authors, yet this fact is routinely forgotten— as you can see in any of the many dreadful CMS user experiences out there. We are used to working with beautiful interfaces like those provided by Google, Mailchimp, and Wufoo. They’re what we all know and love. They’re entry level. So why should our authors be expected to work with substandard tools to create content? Author experience focuses upon these challenges. How can we enable authors? What can we do to make them more productive? How can we tailor the environment to make everything low effort? How can we delight authors? Because if a bad author experience leads to an even worse customer experience, we should do something to improve that. …we need smarter ways for content entry— ways that are low effort and elegant, not clumsy like so many WYSIWYG text editors. We need role-based authoring, where the author’s environment adapts to his specific content needs and rewards him for investing his time and effort in creating great content. Only then will an author embrace his CMS and become a loyal advocate— thus making him more productive and his content more valuable.

Structured authoring can provide all that.

Structured authoring techniques developed to solve specific individual problems in the book/help world do not address the major pain points of Web content development. That does not mean that structured authoring is not right for the Web; it only means that we need to apply different structured authoring techniques to support the development of Web content. And, in a perfect world, we would not make the mistake of only addressing certain immediate problems, but would at last take a holistic view of the benefits that structured writing brings to technical communications, regardless of what our primary delivery medium may be.

10 thoughts on “Structured Writing FOR the Web”

  1. Ellis Pratt

    There’s a lot of content in wikis that you might call body of knowledge content, and there’s still demand to offer that in pdf format. As evidence we can look at the number of downloads for the Scroll Pdf plug in – this enables you to create multi page pdfs from Confluence. It currently stands at 11,100 downloads. It’s a chargeable plugin btw.

    1. Mark Baker Post author

      Good point Ellis. Wikis do seem to be the one Web platform where there is also a demand for PDF output.

      What I always wonder about Wikis is, did people buy them because they wanted to produce Web content as their primary output or because they wanted the collaboration features.

      For whatever reason, Wikis do seem to have more traction in Tech Comm than standard WCMS.

  2. Pingback: Structured Authoring Versus the Web? | I'd Rather Be Writing

  3. Scot Marvin

    Boy, Tom set off a firestorm with his recent blog. I’m not sure there’s a definitive right answer. I only know what my readers need–or, at least, that’s my delusion.

    I write for sys admins, cloud users, and cloud admins. We’ve found that the sys admins and cloud admins like the PDFs. I’m sure it’s because of the installation and configuration tasks that require off-time when the machine isn’t available to refer to digital doc on the web.

    It seems to me that end users of an app or product like web docs better. They’re usually not reading a lot or sequentially.

    We generate webhelp and PDFs from XML. I’m not sure what the deal is with structured vs. web. I don’t see these deliverables as being incongruous at this point. Maybe because we don’t use a CMS. Or maybe because I’m just not smart enough to see the nuances in the debate here.

    1. Debbie M.

      My experiences are similar to yours, Scot. I think when you have hardware, an installation, or both involved, folks prefer paper (or electronic implementations of paper like PDF or EPub).

      The Web does not seem to me to be the best delivery mechanism for a project like building a data center from scratch. I guess it goes without saying that this is not “every page is page one” scenario.

  4. Neal Kaplan

    Similar to Scot’s experience, when I was managing a doc wiki most of the requests for PDF came from internal users who wanted an offline version (while traveling), or from the sales team who wanted to present some documentation to prospects. The latter request disappeared when we were able to open the doc site so it was no longer locked behind the product’s login requirements.

    This was also documentation for a SaaS product, so users wouldn’t be doing any work with it while offline.

  5. Alex Knappe

    Essentially, Mark picks up a topic here, that I’m going to use for an essay for my certification. Single sourcing – especially in context of web vs. paper vs. mobile output simply doesn’t work in most cases. The different media is always asking for different source content. Linking doesn’t work in paper output. Large topics won’t work very well on mobile devices.
    But essentially structured authoring is key to a workable solution.
    Function design is one of the attempts to generalize information in a way, that separates content from design completely, but in most cases the reality of function design is a nearly incomprehensible conglomerate of myriads of XML elements.
    Actually I don’t think you have a truly workable solution to deliver your information in the same quality for each output, yet. And every attempt to get everything covered may ultimately fail.
    At the time being, all we can do is to decide for one or two main output media and write for them specifically.

  6. Reuben Barnett

    You don’t have to implement structured authoring all at once. Author-it has a business logic layer that can be applied as appropriate to legacy or new content.

  7. Pingback: Poll: Can Structured Authoring and Web Content Delivery Co-exist?

  8. Pingback: Structured Authoring By For And Or Nor With In the Web | I'd Rather Be Writing
