12 Responses to What kind of “easy” authoring are you looking for?

  1. Pawel Kowaluk 2014/11/12 at 08:23 #

    Very interesting, Mark. I have always been asked the question “Can we make this easier for [engineers, reviewers, marketing]?” but I never thought to ask “Easier how?” Define your criteria for “easier” before you decide to buy a solution that claims to make DITA easy. 🙂

    However, I still have trouble imagining something. How would we establish semantics for highly versatile content? Should we specialize dl into all kinds of , , etc.? And if we do, how do we make sure users take the time to review our markup dictionary and select the right tag?

  2. Pawel Kowaluk 2014/11/12 at 08:25 #

    I wrote a bunch of tags in my comment, but they all got lost. 🙂 I meant to say: specialize definition list into all kinds of featureList, partList, portfolioItems, etc.
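    Pawel’s idea can be made concrete with a small sketch. The following Python snippet is illustrative only (the element names come from Pawel’s comment, and the fallback table is a toy stand-in for DITA’s real class-attribute mechanism): each concrete element name generalizes back to a dl-family base name, which is the kind of fallback a specialization relies on.

```python
import xml.etree.ElementTree as ET

# Hypothetical concrete markup: a featureList specialized from a definition list.
concrete = """
<featureList>
  <feature>
    <featureName>Hot swap</featureName>
    <featureDescription>Drives can be replaced while running.</featureDescription>
  </feature>
</featureList>
"""

# Toy generalization map: each concrete element falls back to its dl-family base.
FALLBACK = {
    "featureList": "dl",
    "feature": "dlentry",
    "featureName": "dt",
    "featureDescription": "dd",
}

def generalize(elem):
    """Rewrite concrete tag names to their generic base names, in place."""
    elem.tag = FALLBACK.get(elem.tag, elem.tag)
    for child in elem:
        generalize(child)
    return elem

root = generalize(ET.fromstring(concrete))
print(ET.tostring(root, encoding="unicode"))
```

    A processor that knows nothing about featureList can still render the result, because it sees only the generic definition-list vocabulary.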

    • Mark Baker 2014/11/12 at 09:31 #

      Thanks for the comment, Pawel.

      That is certainly part of it, but it needs to be more holistic than that if you want to create a concrete semantic authoring experience for the author.

      I don’t particularly recommend DITA for this kind of thing, because it is far more complex than it needs to be in almost every dimension. However, since so many people are committed to DITA systems and need to find a way to make them work, specialization might be one route to concrete semantic markup.

      An alternate route is to create the concrete semantic markup independently and translate it to DITA for reuse or publishing. However you approach it, though, the point is to maximize the amount of correct semantics you capture, not to use DITA for as many things as possible.

      What makes content versatile is not the current format it is in, but how much you know about what it says. Syntactic interoperability is worth very little compared to semantic interoperability. To achieve syntactic interoperability, you need to get the whole world to agree on syntax — which is not going to happen. To achieve semantic interoperability, you only need to ensure that you know what the semantics of your content are. If you know that, you can translate to any syntax, and to any taxonomy, that anyone needs at a later date.

  3. Don Day 2014/11/12 at 10:07 #

    I suggest that you ease up on emotional superlatives (“[DITA is] far more complex than it needs to be in almost every dimension”). DITA requires only a title at a minimum, and from there on the author need only see what they need to see. Authoring schemas are supposed to do just that–provide only a subset of any larger architecture behind any authoring system.

    Pawel has provided a set of names for semantic data structures. They are effectively concrete data semantics–if they happen to map to an underlying data architecture, that should not matter. Ignore specialization; his data model snippet could drop into your one-off schemas or into DITA without a care for the context. Our problem is to represent the semantic data model to the user in a way that makes his or her interaction with it as natural as possible. In the Web world, it’s all about forms. In the world of assisted structured authoring tools, it’s all about how well the tool provisions the hoped-for interface for easing the author’s experience.

    So in your careful separation of abstract from concrete, I suggest that you keep your well-known disdain for DITA appropriately abstract as well. If we can define a common semantic data model for a part list, then each of us as implementers can apply our domain expertise to mapping that requirement to an “evaluated as appropriate” editing solution, and then compare how even the authoring experience is between both–that’s all that really matters. Let’s get to that level and try to stay classy.

    And by the way, at this moment we should all be watching the news and glorying in the European Space Agency’s attempt to land a probe on a comet. I’m off watching that for the rest of the morning.

    • Mark Baker 2014/11/12 at 10:42 #

      Thanks for the comment, Don.

      You are quite correct that you can use DITA to create concrete semantic markup, and that from an author’s point of view that is all that matters.

      However, you and I both know that that is not how DITA is commonly used or implemented. To most people, DITA means the topic types that are blessed by the technical committee, which are complex and abstract — of necessity, given the breadth of uses they are designed to support.

      From an information architect’s point of view, of course, it matters a great deal how easy it is to define concrete semantic markup, and I submit that there are easier ways to do it than DITA. In the larger scheme of things, however, that is only one factor in an overall technology adoption decision.

      I wish I could find a more convincing way to express the idea that a semantic data model does not need to be common. The fact that it is semantic means that it is comprehensible, and therefore translatable, to any domain with compatible semantics. The reason to store data semantically is precisely to make it independent of any particular format. Because it is semantic, you don’t need a common model. To make it semantic, you need a correct model.

      The notion that you need a common model to exchange semantics is one of the most damaging ideas affecting content technologies today, because it forces us into abstractions and generalizations that add complexity and make it difficult for people to learn and use. The result of that difficulty is that the quality of the data is often poor. We end up with a common model, but not common semantics, due to errors and variations in how the model was used.

      As for my “disdain” for DITA, I actually find a great deal to admire in DITA, and I have said so many times. DITA showed that if we could package a set of structured authoring ideas that have been around individually for years, we could make it much easier for people to grasp, accept, and implement those ideas. That was a great service to structured authoring, and thus to the content community generally. It is a technology that many of us are going to be using, myself included, for many years. But I will not apologize for calling DITA complex, a sentiment I am not alone in expressing. In fact, it is a mark of DITA’s success that it is now being used by a lot of people who don’t particularly like it.

      I would be delighted to keep the discussion classy, which, I think, has to begin with not accusing one another of a lack of class because of a disagreement about the merits of a particular technology.

  4. Don Day 2014/11/12 at 12:36 #

    Point accepted on disagreement not being a matter of class. With great class, I suspect you will eventually find that your statement about the commonality of data models may be hard to defend in environments where interchange without Nsquared-1 protocol converters is desired.

    An ideal data model for a particular domain may differ depending on its use. A model that maps directly to the way an SME thinks about knowledge may be different from the model an engineer wants to use to analyze performance, or the one a programmer wants to use to automate a business process, or even the one a publisher needs in order to flatten or reorganize all of the previous models for presentation.

    Should these all be driven by the same common, one-off data model? I can certainly see that as necessary for some types of data. But data can be so darn messy! Designers often deal with a spectrum of specificity that creates tension between markup designs that are more specific (description of human biometrics, for example) or more general (HTML5 just for contrast).

    Your post is about how to make structured writing easier for authors, but there are other stakeholders in that design. Once we interview all the potential actors upon that data, each may require its own protocol converter for the use of that single source, as you suggest. On the other hand, the economics of the overall situation may make some generalization necessary. We can’t draw too hard a line here… the successful integrator will need to tread between principle and politics in proposing a design that pleases both authors and users of that content.

    And so I am very anxious for your webinar and subsequent posts where I understand we’ll get to explore some principles of modeling concrete semantics and integrating them with end-to-end tools.

    (And speaking of concrete semantics, I was amused to discover that there IS an XML schema that includes the semantics of concrete: “agcXML is a set of eXtensible Markup Language (XML) schemas designed to automate and streamline the exchange of information during the building design and construction process.” It may not be the Rosetta Stone of semantic interchange, but I suppose we could cast one using it.)

  5. Mark Baker 2014/11/12 at 13:39 #

    Hi Don,

    Yes, these are all significant difficulties. The real base problem with semantic interchange is that it is inherently very hard to do. As you note, the SME, the engineer, the programmer, and the publisher all have different concerns, even when dealing with the same subject matter, and therefore care about different semantics.

    Creating domain-specific semantics does not have to mean Nsquared-1 protocol converters, however. You can still have a common semantic data model that is used for many semantic exchange purposes.

    One of the difficulties with such models is that you have to choose between the very general and simple, which is easier to use, but less semantically specific, and the complex and specific, which is very hard to learn, and which tends to be used inconsistently, compromising the value of the semantics it models.

    If you allow each domain to build its own model, it can be both simple and semantically specific, which will make it more precise and easier to use, resulting in better data collection. We can then map this high quality data from the local domain into the global domain in a highly consistent way. That requires only n protocol converters. The quality of the data in the global domain will be much higher — making reuse and repurposing much easier and more valuable. And the authors in the individual domains will be happier because they don’t have to learn a semantics or a vocabulary from outside their domain.
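    The converter arithmetic behind this is easy to check. A throwaway sketch (assuming one directional converter per format pair, versus one converter per format into a shared hub model; the exact counts depend on whether converters are directional):

```python
# Number of converters needed to exchange content among n domain formats.

def point_to_point(n):
    """Every ordered pair of distinct formats gets its own converter."""
    return n * (n - 1)

def hub_and_spoke(n):
    """Each format gets one converter into a shared global model."""
    return n

for n in (3, 5, 10):
    print(f"{n} formats: {point_to_point(n)} pairwise vs {hub_and_spoke(n)} via hub")
```

    Pairwise conversion grows quadratically while the hub approach grows linearly, which is the whole economic argument for a shared global model that no individual author has to write in.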

    And this approach makes the decision about the global model much easier. If you are not asking any individual person to write in this model, you can make it much more expressive, precise, and complex.

    And when you need a level of semantic exchange between domains that the central model does not accommodate, you still have the option to build a more precise protocol converter between the domains that need to share at that level.

    There is nothing exotic about this model. You use it every time you go to the ATM or order a book from Amazon. It is how most things work in the technology world. But the content world has long been out of step — expecting authors to work directly in the global model.

    And, of course, DITA’s specialization mechanism was designed for this kind of model too. The problem, in my eyes, was that it was also designed to be usable without specialization, and that involved the creation of general mechanisms for things like linking and reuse that involve both abstraction and imperative markup that is not obviously easy to specialize out.

    In other words, you can specialize a declarative element to a more precise declarative element, and presumably specialize an imperative element to a more precise imperative element, but how do you specialize from an imperative element to a declarative element? (Especially if you want to retain meaningful fallback processing?)

    Publishing is essentially a process of converting the declarative to the imperative. Most processes involve multiple steps in which more of the declarative is turned into the imperative. Standard tools, from DITA to FrameMaker to DocBook, ask the author to do some of that translation from declarative to imperative as they write, which makes the writing/publishing interface more flexible, but at the cost of asking authors to grapple with a set of abstractions that make the task conceptually difficult.

    What I am proposing is a more declarative markup, with little or no imperatives embedded in it. This necessarily means you will need an additional layer of software that converts that declarative markup into more imperative markup on its way downstream. (Which is something that the SPFE OT now does, by the way, because it now includes a DITA presentation layer — something I expect will be a part of a common use case.)
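    As a toy illustration of that downstream layer (every element name here is hypothetical, not taken from any real schema), a minimal declarative-to-imperative rendering pass can be as small as:

```python
import xml.etree.ElementTree as ET

# Hypothetical declarative markup: records what the content is, not how it looks.
declarative = (
    "<part>"
    "<partName>impeller</partName>"
    "<partNumber>IMP-204</partNumber>"
    "</part>"
)

# The rendering layer, not the author, owns every presentational decision.
HTML_FOR = {"part": "p", "partName": "b", "partNumber": "code"}

def render(elem):
    """Convert a declarative element tree into imperative (presentational) HTML."""
    inner = (elem.text or "") + "".join(render(child) for child in elem)
    return f"<{HTML_FOR[elem.tag]}>{inner}</{HTML_FOR[elem.tag]}>"

print(render(ET.fromstring(declarative)))
# → <p><b>impeller</b><code>IMP-204</code></p>
```

    Because the source records only what things are, swapping the lookup table swaps the entire presentation without touching the authored content.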

    • Alex Knappe 2014/11/13 at 09:22 #

      I’m completely with you on this one, Mark.
      In the past years I’ve seen many structured implementations, most of them falling apart in terms of data quality.
      My personal top 10 reasons why things were done wrong:
      1. It was possible to insert that element there.
      2. My layout needed this!
      3. There’s a special element for this?
      4. We always do it this way.
      5. This just did not make sense to me.
      6. Why would I insert the metadata here?
      7. I didn’t know what that element meant to be.
      8. That means what?
      9. I thought it was supposed to be there.
      10. I didn’t care.
      This pretty much sums up what is wrong with most structured systems.
      They’re inflexible (in the sense of DTP), semantically diffuse, incomprehensible, and lax.
      Originally I was one of the guys who loved the idea of removing all layout from content, but I had to figure out the hard way that it doesn’t work that simply.
      Layout is, to quite some extent, an important carrier of context and information. Therefore it needs to be respected upon creation of structure.
      The same goes for strong semantics. They’re able to provide tons of metadata just by being created intelligently.
      Being inflexible creates creativity on the side of the authors. They will find a way to do what they want (not what was intended).
      Being lax only welcomes creativity (in terms of working unintended), laziness, and not being interested.
      To avoid all this you need acceptance on the side of the authors. This means a structure that is flexible (in terms of layout), comprehensible (in terms of making sense), semantically strong, and restrictive.
      I personally dislike DITA a lot – but even in DITA this could be achieved by specializing the hell out of it.
      Making things easy means making things understandable and obvious. But isn’t that exactly what our job is?
      Shouldn’t it then be our responsibility to set up structures for our colleagues/authors that are understandable and obvious?
      I think so. And I also think this can only be achieved if we work with domain-specific structures, as our target audience changes from domain to domain.
      Generalistic structures simply won’t work for this. Engineers and artists don’t speak the same language when they talk about their own domain.
      So why should structures try to do this?

      • Mark Baker 2014/11/13 at 10:11 #

        Hi Alex,

        Yes, that is pretty much what goes wrong, and why it goes wrong. And also pretty much what I think is the only hope of getting it right.

  6. Nick Wright 2014/11/14 at 06:14 #

    In any documentation, I always look for ‘easy’ reading. Plain English, analogies to the real world, concrete rather than abstract language, one method rather than many to solve a specific user need. Instructions written with command verbs and good layout.

    Only then do I look at the way you put the information together and design it for the ease of the user.

    Nick Wright – Designer of StyleWriter – the plain English editor
    Editor Software
    Trial and demos at http://www.editorsoftware.com

  7. Mark Baker 2014/11/19 at 07:52 #

    Due to technical issues, we had to reschedule the webinar. It is now scheduled for December 12, 2014: https://www.brighttalk.com/webcast/9273/135129.


  1. 5 reasons why content development vendors have it wrong - 2014/11/14

    […] Mark Baker […]