Topics should merit their metadata

By Mark Baker | 2011/04/17

Tom Johnson (I’d Rather be Writing) provides an excellent summary of Weinberger’s Everything Is Miscellaneous. He says, “I have never read a more relevant book for technical communicators than Weinberger’s Everything Is Miscellaneous.” I wholly agree. Weinberger’s book is the seed of my own ideas on Every Page is Page One.

Johnson identifies Weinberger’s central thesis: that the secret to making information findable in a world in which everything is miscellaneous is not organization, but metadata. And he correctly identifies the central problem that this poses: “Exactly how do you add metadata to your help topics? What kind of metadata do you include? And how do [you] allow the user to arrange or call the topics based on the metadata they want to sort by?”

Metadata is a problem. It is not a problem to add metadata, per se. But it is a problem to add good metadata. Good metadata must be true. And adding true metadata to content is hard because not all content can wear true metadata well.

What do I mean by true metadata? Let’s begin with an important distinction. Metadata can be intrinsic or extrinsic. Consider the metadata on a carton of maple walnut ice cream. “Contains nuts” is intrinsic metadata. “Mark’s favorite flavor” is extrinsic metadata: it is actually metadata about me, not about the ice cream.

“Contains nuts” is a piece of metadata that is potentially useful to everyone. “Mark’s favorite flavor” is potentially of use only to people who want to buy me ice cream. For this purpose, it is clearly far more useful if the metadata field Favorite ice cream=“maple walnut” is attached to me than if the metadata field Persons whose favorite flavor this is=“Mark Baker” is attached to the ice cream. From a content management point of view, too, it should be clear that attaching the metadata to me is much easier to manage than attaching the name of every person who favors maple walnut to the ice cream.
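To make the difference concrete, here is a minimal Python sketch of the two choices. The class names, field names, and the second flavor are my own illustrative assumptions, not part of any real system. “Contains nuts” lives on the flavor, and the favorite lives on the person.

```python
from dataclasses import dataclass

@dataclass
class Flavor:
    name: str
    contains_nuts: bool  # intrinsic: true of the ice cream itself

@dataclass
class Person:
    name: str
    favorite_flavor: str  # a fact about the person, so it is stored on the person

# Hypothetical data for illustration only.
flavors = {
    "maple walnut": Flavor("maple walnut", contains_nuts=True),
    "vanilla": Flavor("vanilla", contains_nuts=False),
}
people = [Person("Mark Baker", favorite_flavor="maple walnut")]

def flavor_to_buy(person_name):
    """Look up the person first, then follow their favorite to the flavor."""
    for person in people:
        if person.name == person_name:
            return flavors[person.favorite_flavor]
    return None  # nobody by that name is recorded

gift = flavor_to_buy("Mark Baker")
print(gift.name, gift.contains_nuts)
```

Under these assumptions, a change in my tastes means editing one Person record. If the favorites were stored on the flavors instead, every flavor would need a list of names that grows and changes with the customer base.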

The problem is that you can’t arbitrarily attach intrinsic metadata to an object. You can only attach metadata that is both true of the object and intrinsic to it. You cannot attach more intrinsic metadata to an object than that object’s intrinsic properties will allow. In other words, the object has to merit its metadata.

A large part of meriting metadata has to do with being the right size. Attaching the metadata “contains nuts” to the entire ice cream shop is technically correct, but you might have to eat a lot of ice cream before you get any nuts. At the other end of the scale, you can attach the metadata “contains nuts” to the walnuts individually, but you might have to eat a lot of nuts before you get any ice cream. The most useful scale for metadata is not the store, nor the individual ingredient, but the serving size.

Not all the objects in a typical CMS merit much intrinsic metadata. If content has been aggressively decomposed into small fragments in order to maximize reuse or optimize translation memory, those fragments will generally not merit much in the way of intrinsic metadata. In many cases, there may be nothing useful to be said about such a fragment other than the list of places where it is used, which is, of course, extrinsic metadata to the fragment. It is really a statement about the book that included the fragment.

So, if you want to attach metadata to your topics that will help your readers find and organize your content for themselves, you need to create topics that actually merit a range of useful metadata. This is what Every Page is Page One is all about: creating single-serving helpings of information that merit rich metadata, which makes them easy to find and satisfying when found.

Extrinsic metadata is essentially a map. If you have a system that runs on maps, it is running on extrinsic metadata. Intrinsic metadata that reveals the real properties of an object allows you to run effective queries against the content collection. If you have a system that runs on queries, it is running on intrinsic metadata.
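The contrast between running on maps and running on queries can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the collection, the field names, and the hand-built list are all hypothetical.

```python
# Hypothetical collection: each item carries its own intrinsic metadata.
collection = [
    {"name": "maple walnut", "contains_nuts": True},
    {"name": "vanilla", "contains_nuts": False},
    {"name": "pistachio", "contains_nuts": True},
]

# A map: a hand-built list kept outside the items themselves.
nut_free_map = ["vanilla"]

def nut_free_query(items):
    """A query: ask each item about its own properties."""
    return [item["name"] for item in items if not item["contains_nuts"]]

# A new item arrives with its intrinsic metadata already attached.
collection.append({"name": "strawberry", "contains_nuts": False})

print(nut_free_query(collection))  # ['vanilla', 'strawberry']
print(nut_free_map)                # ['vanilla'] until someone redraws the map
```

The query picks up the new item without anyone touching it, because the answer comes from the item itself. The map stays out of date until a person notices the new item and adds it by hand.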

The fact that maple walnut is Mark’s favorite flavor is of no use to Tom. Tom has to go and try the ice cream for himself. Systems that work on extrinsic metadata require users to constantly explore the repository, laying down new maps in the form of new extrinsic metadata properties. This is why systems like DITA are so dependent on big-iron content management systems.

As to how to enable readers to organize information for themselves — supporting user tagging of content is an interesting problem, but the most important issue is really this: making topics that deserve to be found. Because the iron law of Google, despite countless efforts to defeat it, is that content that most deserves to be found gets found first. The first and most important step to creating useful metadata is to create content that merits its metadata.

Category: Content Strategy, Structured writing

About Mark Baker

I am an aspiring novelist and former technical writer and content strategist. On the technical side, I am the author of Every Page is Page One: Topic-based Writing for Technical Communication and the Web and Structured Writing: Rhetoric and Process. I blog at and tweet as @mbakeranalecta.

2 thoughts on “Topics should merit their metadata”

  1. Marcia

    Excellent topic, Mark, excellently written.
