In a comment on my Content Wrangler article, It’s Time to Start Separating Content from Behavior, Laura Creekmore said (emphasis mine):
[T]his conversation has brought to mind some thoughts I’ve had recently, and I think this is an even more difficult issue. Because eventually, we’re going to come up with all the technological fixes we need to resolve the issues above. However, right now, content management systems have already outstripped the technical interests and abilities of the majority of content creators and subject matter experts with whom I work. [And no, I’m not slamming my clients here. They are really smart people.] When we require advanced technical knowledge in addition to advanced subject knowledge in order to fully take advantage of the capabilities of our content systems, we’re not going to get the results we want. We have to NOT ONLY figure out how to do this, but we ALSO have to figure out how to make it easy and intuitive. I will say, I also don’t mean to slam these efforts — they are critical steps, and this is essential thinking. I’m just saying, please, let’s not stop this effort once we’ve made something POSSIBLE [as we have done with so many current CMS]. Let’s not stop until we’ve made it a reality for all content creators.
That is indeed the besetting sin of most structured authoring systems. They “require advanced technical knowledge in addition to advanced subject knowledge in order to fully take advantage of the capabilities” and as a result we are indeed not getting the results we want.
The cause of this is a very basic mistake that the designers of every structured writing system, including DocBook and DITA (especially DITA), seem to make. They fail to treat the tagging language as an authoring user interface. Instead of designing the tagging language from the authoring function forward, they design it from the publishing and content management function backward.
They create a language to express publishing, content management, or reuse concerns, and then expect writers to write directly into what is really an internal content management format. Putting a graphical face over the markup does nothing to change this. The graphical interface only hides the syntax of the XML. It does nothing to change the fact that authors are being asked to create what should be the internal semantics of the publishing system — semantics they generally neither care about nor understand.
Why should the author have to deal with things like conrefs and maprefs, which have nothing to do with the subject they are writing about? Why should the writer have to worry about the details of key binding precedence? (Yes, that’s a real thing: http://dita.xml.org/resource/dita-tc-faq-about-keys#Q1)
It is as if a database program were written in a way that required users to write directly into the tables, even to the point of creating and managing surrogate keys by hand (this is exactly what you are doing when you type a topic ID into a DITA map). No database system works that way.
Database designers optimize the internal structure of the database for processing efficiency, but they don’t expect people doing data entry to know anything about those internal structures. For data entry they create interface forms that ask people for data in a way that makes sense to them, and which does not expect the user to understand anything about the underlying data structures.
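To make the database analogy concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the function names (`open_db`, `open_account`, `accounts_for`) are hypothetical, invented only for illustration. The point is that the person doing data entry supplies a customer name and an account type, and the surrogate keys are created and joined entirely behind the form.

```python
import sqlite3


def open_db():
    """Create the internal tables. The data-entry user never sees these."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        " id INTEGER PRIMARY KEY,"      # surrogate key, assigned by the database
        " name TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"      # surrogate key, assigned by the database
        " customer_id INTEGER NOT NULL REFERENCES customer(id),"
        " kind TEXT NOT NULL)"
    )
    return conn


def open_account(conn, customer_name, kind):
    """The 'form': it asks only for things the user already knows."""
    conn.execute(
        "INSERT OR IGNORE INTO customer (name) VALUES (?)", (customer_name,)
    )
    # Look up the surrogate key on the user's behalf.
    (customer_id,) = conn.execute(
        "SELECT id FROM customer WHERE name = ?", (customer_name,)
    ).fetchone()
    conn.execute(
        "INSERT INTO account (customer_id, kind) VALUES (?, ?)",
        (customer_id, kind),
    )
    conn.commit()


def accounts_for(conn, customer_name):
    """Answer a question in the user's terms, joining on keys internally."""
    rows = conn.execute(
        "SELECT a.kind FROM account a"
        " JOIN customer c ON c.id = a.customer_id"
        " WHERE c.name = ? ORDER BY a.id",
        (customer_name,),
    ).fetchall()
    return [kind for (kind,) in rows]
```

Calling `open_account(conn, "Ada", "savings")` never requires the caller to know that a `customer_id` exists, which is the separation the paragraph above describes.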
Imagine if you had to understand the data dictionary of your bank’s accounts database before you could use the ATM. If you did, the ATM system simply would not exist. The ATM system is all about building an interface for the user that addresses their task in terms that are familiar to them and that hides all the underlying structures and relationships, and protects those structures and relationships from tampering or incompetence on the part of the ATM user.
But most structured authoring systems, and most content management systems, are not designed like the ATM system. As Laura says, “content management systems have already outstripped the technical interests and abilities of the majority of content creators and subject matter experts”. They have done so not because the task is inherently complex, but because they are all designed to require the users to write directly into the underlying data structures and relationships that govern content management and publishing.
It does not have to be this way. It is bad design, pure and simple. Yet it is routinely accepted by people who would vociferously and unequivocally condemn that kind of bad design if they saw it in the products they were documenting or in other products they use day to day.
The cure is not complex. Stop designing content management and structured writing systems from the publishing engine backwards. Start designing authoring tagging languages that elicit content from the author in terms the author understands, and that require the author to know absolutely nothing about how the content is processed or managed.
A fundamental design goal of any structured authoring system should be that the author should not need any knowledge of the content management and publishing system other than a basic understanding of the very simple subject-oriented markup they are being asked to apply, and that that markup should not ask them for any information that they do not already have as part of their knowledge of the subject matter they are writing about. (This is one of the key design goals of the SPFE architecture.)
One of the original benefits promised for structured writing was that it would relieve the writer of concerns about, and responsibility for, the formatting and production of content. That has not been fully achieved in every structured writing implementation today, but what is worse is that a whole new layer of content management responsibilities has been dumped on the writer’s shoulders, one that is frequently just as complex and time consuming as the old publishing responsibilities.
This, of course, limits the number of people who can author effectively, just as lack of knowledge of FrameMaker limited authorship in the past. For groups who need to extend the authoring franchise, the response has usually been to simplify the content structure so much that anyone can author in Word, or in something that looks as much as possible like Word. But this means, of course, that you are not getting structured content. You are not getting addressability and mutability and all the advantages those things bring.
A much better response would be to go the opposite route. Rather than taking away all structure, add more structure, but remove all traces of the publishing engine and its operations from the structure that authors create. Instead of giving authors a blank sheet and a palette of elements that function essentially as formatting commands, give them a structured form that spells out exactly what data to enter in each field.
That kind of highly precise, highly semantic structure will give you all the context you need for addressability and mutability, and to drive any publishing or content management process you need on the back end, without any of your authors having to know anything about how it works.
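As a rough illustration of the idea, the sketch below uses a hypothetical recipe domain; the `Recipe` fields and the `slug` and `publish` functions are invented for this example and do not come from any particular system. The author fills in only subject-matter fields. The back end derives the topic IDs and the cross-references that the author would otherwise have had to create and maintain by hand.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Recipe:
    """What the author fills in: only things they know about the subject."""
    name: str
    ingredients: List[str]
    steps: List[str]


def slug(text: str) -> str:
    """Derive a stable ID from a name, so authors never type IDs."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def publish(recipes: List[Recipe]) -> Dict[str, dict]:
    """Back end: build pages with IDs and links inferred from the content."""
    ids = {r.name.lower(): slug(r.name) for r in recipes}
    pages = {}
    for r in recipes:
        # An ingredient that is itself a known recipe becomes a link.
        # The author just named the ingredient; no link markup was needed.
        see_also = [
            ids[i.lower()]
            for i in r.ingredients
            if i.lower() in ids and i.lower() != r.name.lower()
        ]
        pages[ids[r.name.lower()]] = {
            "title": r.name,
            "ingredients": r.ingredients,
            "steps": r.steps,
            "see_also": see_also,
        }
    return pages
```

With one record named "Pie Crust" and another that lists "Pie Crust" among its ingredients, `publish` links the second page to the first without either author knowing that IDs or links exist.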
This is the only way forward. We are never going to get the whole world to correctly and consistently create the data structures of complex content management and publishing systems. The only way to have the benefits of structure and make the authoring of structured content something that is easy enough for most people to understand and perform accurately (something we never achieved with word processing and desktop publishing, we should note) is to remove all content management and publishing concerns from the content.
It can be done. The database world has been doing it successfully for decades. The ATM system that allows me to withdraw US funds from my Canadian bank account while standing in an airport concourse in San Francisco is a perfect testament to the fact that it can be done. We just have to start doing it.