XML is Not the Answer

XML is not the answer. Structured writing may be the answer. XML is one way to implement structured writing.

More and more these days, I am hearing technical publication managers (and not a few consultants) talk about the need to “move to XML”. This may be a shorthand way for them to describe a very sensible plan to implement structured writing, but if so, I wish they would say “move to structured writing”. Structured writing is a potential answer to many content challenges. XML, by itself, is not.

There is a very real danger in all this XML talk that people can get the impression that if they move their content to XML (any XML), then they will instantly have the ability to reuse content across the enterprise and to seamlessly and effortlessly deliver content to any and every channel in a form that perfectly matches the optimal information design for that channel. It just isn’t so.

Changing the “what” in WYSIWYG

The “what” in “what you see is what you get” is not always “formatting”.

Authors need WYSIWYG (What You See is What You Get) displays to work effectively. This is basic ergonomics. If you can’t see the thing you are supposed to be creating, you will have a hard time creating it well.

But while authors always need WYSIWYG displays, the “what” that they need to see is not the same thing in every case. There can be several different values for the “what” in WYSIWYG.

The term WYSIWYG was coined to describe the displays of word processing and desktop publishing programs. In this case, the “what” was the formatting of the printed document. (At the time we were talking strictly about printing on paper. The “what” you saw on screen was the “what” that was going to appear on paper — more or less.)

This was a big step forward, but the step forward was not enabling people to do typesetting with computers. Doing typesetting with computers had been possible for a long time using a language called TeX, which is still used in some places today. But TeX has never been widely used because, while you can specify formatting with great precision in TeX, you can’t see the formatting that you are specifying as you work. You can only see it afterward, when you run the file through the TeX program to create formatted output.

Make a mistake in your TeX markup, and your output may be garbled in ways that make it hard to find your error, and finding and fixing errors can be fiddly and tedious. Desktop publishing was a big step forward because it allowed people to see the formatting they were creating as they created it.

But formatting is not always the “what” in WYSIWYG. In Learning Python, Mark Lutz describes the syntax of the Python programming language as WYSIWYG.

Python is a WYSIWYG language — what you see is what you get — because the way code looks is the way it runs, regardless of who coded it.

Lutz, Mark (2013-06-12). Learning Python (p. 326). O’Reilly Media. Kindle Edition.

What he means in this case is that with Python, unlike most other programming languages, the indentation of the source code always shows the logical structure of the program. If formatted poorly, the indentation of code in a C or Java program can be completely misleading as to the actual structure of the code. Lutz provides this example:

if (x)
    if (y) 
        statement1; 
else 
    statement2;

Here the formatting of the code makes it appear as if the else statement belongs to the first if statement, but it doesn’t. It actually belongs to the second if statement, because in C and Java an else binds to the nearest preceding if that does not already have one. The structure you see is not the structure you get. In Python, such confusion cannot occur. Lutz’s Python example looks like this:

if x: 
    if y: 
        statement1 
else: 
    statement2

In this case, the else does belong to the first if, as it appears. If it were indented under the second if, like this:

if x: 
    if y: 
        statement1 
    else: 
        statement2

then the else would belong to the second if statement.

In Python, the indentation of the source code determines the structure of the program, and thus the logical structure you see in the source is the actual structure you get in the program. Thus Lutz chooses the term WYSIWYG to describe the language. But in this usage of WYSIWYG, the “what” is not formatting, but structure.
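The claim that indentation alone decides which if an else belongs to can be checked by asking Python itself. What follows is a minimal sketch, not something from Lutz's book: it feeds the two snippets above to the standard library's ast module and reports where the parser attached the else branch. The names OUTER_ELSE, INNER_ELSE, and else_owner are hypothetical and exist only for this illustration.

```python
import ast

# Two snippets that differ only in the indentation of the else clause.
# (Hypothetical names; the code mirrors the two examples in the text.)
OUTER_ELSE = """
if x:
    if y:
        statement1
else:
    statement2
"""

INNER_ELSE = """
if x:
    if y:
        statement1
    else:
        statement2
"""


def else_owner(source: str) -> str:
    """Report which if statement the else branch is attached to."""
    outer_if = ast.parse(source).body[0]  # the `if x:` node
    inner_if = outer_if.body[0]           # the `if y:` node
    if outer_if.orelse:                   # non-empty list means an else here
        return "outer"
    if inner_if.orelse:
        return "inner"
    return "none"


print(else_owner(OUTER_ELSE))  # outer
print(else_owner(INNER_ELSE))  # inner
```

The snippets are only parsed, never executed, so the undefined names x, y, statement1, and statement2 cause no error. The structure the parser reports is the structure the indentation shows.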

In content engineering there are multiple possible “whats” you might want from a WYSIWYG display. Desktop publishing is based on the premise that the author is also the designer, the typesetter, and the prepress person, so the “what” is formatting. We can be more explicit about this by substituting the word “formatting” for the word “what” in the acronym: FYSIFYG (the Formatting You See Is the Formatting You Get). The desktop publishing approach requires a FYSIFYG display.

Before the days of desktop publishing, when authors were only required to worry about the text, and did not have to wear a dozen other hats, they got by with pens or with typewriters. Both these tools had a TYSITYG display: the Text You See Is the Text You Get.

Today, however, the increasing demand for malleable and repurposable content means neither TYSITYG nor FYSIFYG will suffice. We need more than just the text, and something other than text plus formatting. We need structure. What we need to create structured content, therefore, is a SYSISYG display: the Structure You See Is the Structure You Get. Creating such a display, alas, has not been easy.

It might seem like a raw XML file is a SYSISYG display, because all the structure is there in the XML tags. Many XML editors that let you switch to a view of the raw XML will call this “the structure view”. The problem is, while the structure is there, it is not in the least easy to view. There is way too much syntactic noise in an XML file for the reader to easily discern the structure it expresses, as this chunk of the DocBook source for my book shows:

<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0" xml:id="chapter.linkrichly">
<title>EPPO Topics Link Richly</title>
<indexterm><primary>Weinberger, David</primary><secondary>on linking</secondary>
</indexterm>
<blockquote>
<attribution>David Weinberger, <citetitle pubwork="book">Too big to know</citetitle>
<biblioref linkend="TooBigToKnow"/></attribution>
<para><indexterm class="startofrange" xml:id="idx.linking-1"><primary>linking</primary></indexterm>Links are the visible manifestation of the author giving up any claim to completeness or even sufficiency; links invite the reader to browse the network in which the work is enmeshed, an acknowledgement that thinking is something that we do together.</para>
</blockquote>
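To see how little of that markup is actually structure, it can help to strip away everything except the element nesting. The following is a rough sketch under stated assumptions, not a description of any editor's structure view: it uses Python's standard xml.etree.ElementTree on a shortened, made-up fragment (SAMPLE) loosely modeled on the DocBook source above, and the helper names local_name and outline are hypothetical.

```python
import xml.etree.ElementTree as ET

# A shortened, invented fragment loosely modeled on the DocBook chunk above.
SAMPLE = """<chapter xmlns="http://docbook.org/ns/docbook" version="5.0">
<title>Example Title</title>
<blockquote>
<attribution>Someone, <citetitle>Some Book</citetitle></attribution>
<para>Some quoted text.</para>
</blockquote>
</chapter>"""


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def outline(element, depth=0):
    """Return the element nesting as a list of indented element names."""
    lines = ["  " * depth + local_name(element.tag)]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(SAMPLE))))
```

The printed outline contains only element names and their nesting. Namespaces, attributes, and text are all gone, which gives a sense of how much of the raw file is syntax rather than structure. An outline like this is still far from a working authoring display; it only illustrates the gap.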

Why is writing the only profession untouched by its tools?

Why is writing the only profession untouched by its tools? Larry Kunz strikes a familiar note in his recent blog post, Tools come and go. I’m still a writer.

I’m a writer. Once I used a typewriter. Now I use XML editors. If I stay at this long enough, other tools will come and I’ll learn to embrace them.

My old typewriter is gone. But I’m the same writer I’ve always been.

The same refrain is sounded over and over again wherever writers gather. It seems almost a badge of honor among writers to proclaim that your work and the essence of what you do is unaffected by the tools you use.

Structured Writing FOR the Web

Tom Johnson started the discussion with Structured authoring versus the web. Sarah O’Keefe and Alan Pringle took it up in Structured authoring AND the Web. My turn: Structured authoring FOR the Web.

One of my long-term grievances is that structured authoring has been adopted piecemeal. Rather than approaching it holistically as a method that can provide a wide range of quality and efficiency benefits to the authoring process, people have tended to adopt it for a single purpose, and to use it only to the extent that it achieved that singular purpose.

We Must Remove Publishing and Content Management Concerns from Authoring Systems

In a comment on my Content Wrangler article, It’s Time to Start Separating Content from Behavior, Laura Creekmore said (emphasis mine):

[T]his conversation has brought to mind some thoughts I’ve had recently, and I think this is an even more difficult issue. Because eventually, we’re going to come up with all the technological fixes we need to resolve the issues above. However, right now, content management systems have already outstripped the technical interests and abilities of the majority of content creators and subject matter experts with whom I work. [And no, I’m not slamming my clients here. :) They are really smart people.] When we require advanced technical knowledge in addition to advanced subject knowledge in order to fully take advantage of the capabilities of our content systems, we’re not going to get the results we want. We have to NOT ONLY figure out how to do this, but we ALSO have to figure out how to make it easy and intuitive. I will say, I also don’t mean to slam these efforts — they are critical steps, and this is essential thinking. I’m just saying, please, let’s not stop this effort once we’ve made something POSSIBLE [as we have done with so many current CMS]. Let’s not stop until we’ve made it a reality for all content creators.

We Need a New Economic Model for Tech Writing Tools

Tom Johnson’s correspondent, Sam from Canada, asks if tool vendors are not more to blame for the slow pace of change in tech comm than tech writers themselves:

Hi Tom,

I’ve been enjoying your posts along with Mark Baker’s. You both have good points about technical writing trends. I could be totally wrong, but maybe it’s not the tech writers that are resisting change. Maybe it’s the companies making the tools/money that are resisting change.

I don’t think the problem is so much that the tool vendors are resisting change. Tool vendors need a certain amount of change in order to create a reason for people to buy upgrades. But vendors also need, and therefore support, changes that provide a viable economic model for creating and selling software. They won’t support a change if there is not a viable way for them to make money by supporting it.

The Design Implications of Tool Choices

Every documentation tool has a built-in information design bias. When you choose a tool, be it FrameMaker, DITA, AuthorIt, a wiki, or SPFE, you are implicitly choosing an approach to information design. If you don’t understand and accept the design implications of your tool choice, as many people do not, you are setting yourself up for expense, frustration, and disappointment.

Introducing the SPFE Architecture

Today, I am announcing the launch of a new website, SPFE.info. [EDIT: Information on SPFE has moved to GitHub. See http://mbakeranalecta.github.io/spfe-open-toolkit/] The SPFE architecture is a design for building structured authoring systems. Why would the world need such a thing when it already has DITA? Why have I spent the last 15 years or so working on what I now call SPFE?

Structured Writing is not Desktop Publishing plus Angle Brackets

What constitutes a “real” XML editor? The question is perennial, but is made topical by Tom Aldous’ surprisingly shrill defense of FrameMaker as an XML editor. It is unusual for a market-leading company to indulge in myth-busting aimed at tiny competitors. It is an approach more common to the small and desperate. But if we look past the oddness of Adobe employing this tactic, we see that the question of whether FrameMaker is a real XML editor, as with almost all debates about what makes a “real” anything, is not a debate about the product’s features, but a debate about what “real” means in the context.