The rant which follows was sparked by the following tweet from Gilbane Boston:
#Content quality comes first, then worry about metadata and other #SEO shananigans – @Robert_Rose #gilbane
Twitter is a context-free medium, and a tweet from a conference presentation may well misrepresent the speaker’s intent, so I am very likely going to do Robert Rose an injustice here, but anyway: He is dead wrong. Metadata comes first, content afterwards.
If you needed to describe the difference between structured and unstructured writing in twenty words or less, you could hardly do better than this: Structured writing means metadata first, content afterwards; unstructured writing means content first, metadata afterwards.
Of course, Robert Rose probably wasn’t talking about structured writing specifically. That’s pretty evident from the fact that he dismisses metadata as part of “SEO shenanigans”. But he was talking about the relationship between content quality and metadata, and that is precisely what structured writing is about, and precisely why structured writing deals with metadata first and content afterwards.
There are a couple of ways to get quality content:
- hire geniuses
- implement structured content
Actually, the hiring geniuses approach is a bit iffy. Geniuses can be hit and miss, producing works of genius one day and complete drivel the next. Also, it can be very difficult to get geniuses to stick to the topic at hand. Genius tends to scorch a blazing trail into undiscovered countries, which is a wonderful thing in itself, but not necessarily what you need to get today’s work done on deadline.
If you want to produce quality content, therefore, and to produce it consistently and reliably, your best approach is to go with structured content. Structured writing does not necessarily mean working in XML (which is content as structured data). It can also mean writing to a well thought-out content template (which is content as structured exposition). But if you are working to a well thought-out content template, that template is metadata, and defining that metadata happens before the content that conforms to it is written. Metadata first; content second.
When you do move to XML, you add another dimension of structure to your content: you create the content as structured data that can be processed by a machine. This adds a new layer of metadata: the XML schema for each content type you produce. The schema is metadata, and the schema is created before the instances that conform to it. Once again, the metadata comes before the content.
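As a minimal sketch of the schema-before-instance idea, the snippet below writes down a tiny "schema" first and only then checks XML instances against it. The element names (`recipe`, `title`, `prep-time`, `ingredients`), the dictionary format, and the `validate` helper are all assumptions made for illustration; a real project would more likely use an XML Schema or RELAX NG validator than a hand-rolled check.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema, written before any content exists:
# the root element name and the child elements it must contain, in order.
RECIPE_SCHEMA = {
    "root": "recipe",
    "children": ["title", "prep-time", "ingredients"],
}

def validate(xml_text, schema):
    """Return a list of problems; an empty list means the instance conforms."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != schema["root"]:
        problems.append(f"expected root <{schema['root']}>, found <{root.tag}>")
    found = [child.tag for child in root]
    if found != schema["children"]:
        problems.append(f"expected children {schema['children']}, found {found}")
    return problems

conforming = (
    "<recipe><title>Pancakes</title><prep-time>PT10M</prep-time>"
    "<ingredients>flour, eggs, milk</ingredients></recipe>"
)
nonconforming = "<recipe><title>Pancakes</title></recipe>"

print(validate(conforming, RECIPE_SCHEMA))
print(validate(nonconforming, RECIPE_SCHEMA))
```

The point of the sketch is the ordering: `RECIPE_SCHEMA` exists before either instance does, and each instance is judged against it.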
The same thing applies to other kinds of structured data. One of the most common means for gathering structured data is a form (something I touched on when I discussed how you can get structured data from the crowd). Forms make it possible to gather structured data from ordinary people with a high degree of reliability. Every field label on a form is a piece of metadata that tells the user what information to put in each field. The metadata on the form exists before the content is added to the form.
As I discussed a while ago, metadata exists on many levels. When you fill out a form, not only are you guided by the metadata labels on the form, you are also creating a metadata record. Virtually every form begins by asking you to identify yourself by providing your name, address, email, age, sex, language, etc. The exact set of identifying metadata varies with the application (and with the privacy legislation that governs it) but this is where every form begins: with identity, which is to say, with metadata.
The next step in most forms is to gather information on the product or service that the particular customer interaction applies to. This too is metadata. It is about identity, and metadata is basically about establishing identity. Only when all the identity information has been gathered — the identity of all the people, goods, and services involved — is the subject of the particular transaction addressed: the customer complaint, the tech support question, etc. Metadata first; then content.
There is a very good reason why this is standard practice for form design. Supplying all the metadata first focuses the writer/customer’s attention on the precise matter at hand, which improves how they then state their question, complaint, or request. It also avoids the need for them to restate the metadata in the unstructured text of their message, which they would invariably do inconsistently and incompletely if they did it without structure. Finally, it makes sure that the metadata the company needs to process the form successfully is gathered before the writer/customer states their actual message. If they stated their message up front, they would very likely feel that they had said everything that needed to be said, and would not bother to restate in the metadata fields what they had already entered, unstructured, in the message text.
Forms are a highly effective way for an organization to gather structured data from almost anyone. They work because they lead with the metadata, both in the sense that the form fields are labeled with metadata, and in that they call out every piece of identity information they need into separate fields, and gather that identity metadata before accepting the entry of the customer’s actual message. By putting the metadata first and the content second, they are able to gather high quality data with a high degree of consistency.
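To make the "identity first, message second" pattern concrete, here is a small sketch of a form object that refuses to accept the message until the identity metadata has been supplied. The class names, the three identity fields, and the error types are hypothetical choices for illustration, not a description of any particular form library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Identity:
    """Identity metadata gathered before any free-text content."""
    name: str
    email: str
    product: str

class SupportForm:
    """Hypothetical form that accepts a message only after identity metadata."""
    REQUIRED = ("name", "email", "product")

    def __init__(self):
        self.identity = None
        self.message = None

    def set_identity(self, **fields):
        # Every required field must be present and non-blank.
        missing = [f for f in self.REQUIRED if not fields.get(f, "").strip()]
        if missing:
            raise ValueError(f"missing identity fields: {missing}")
        self.identity = Identity(**{f: fields[f].strip() for f in self.REQUIRED})

    def set_message(self, text):
        # Metadata first: the message is rejected until identity is known.
        if self.identity is None:
            raise RuntimeError("identity metadata must be supplied before the message")
        self.message = text.strip()

form = SupportForm()
form.set_identity(name="Ann", email="ann@example.com", product="Widget 2000")
form.set_message("The device will not power on.")
```

Calling `set_message` on a fresh `SupportForm` raises an error, which is the code equivalent of a form that keeps the message box at the bottom of the page.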
A well-designed structured writing system does the same thing for content. It begins by creating a schema that is highly specific to the kind of content that is being created, so that the XML elements defined by the schema act as robust and clear metadata labels for the content to be gathered. This means, of course, that the metadata is part of the content schema itself, not a separate label stuck on afterwards. It also means that just as a good form begins by gathering all the identity metadata first, a good schema should do the same. Collect all the metadata before you create the content. Not only does this focus the writer on the content to be created, it also ensures that the full metadata record is created completely and accurately as the first part of the task, not dashed off afterwards as an afterthought by a writer whose mind is already on their next task.
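One way to picture metadata that is part of the content rather than a label stuck on afterwards: in the sketch below, the metadata record is read straight out of the topic itself, and the topic is rejected unless the metadata elements come before the body. The element names (`product`, `audience`, `task`, `body`) and the `metadata_record` helper are assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names: identity metadata must precede the body.
METADATA_ELEMENTS = ("product", "audience", "task")
BODY_ELEMENT = "body"

def metadata_record(xml_text):
    """Collect the leading metadata elements into a dict.

    Raises ValueError unless the topic contains exactly the metadata
    elements, in order, followed by the body element.
    """
    root = ET.fromstring(xml_text)
    tags = [child.tag for child in root]
    expected = list(METADATA_ELEMENTS) + [BODY_ELEMENT]
    if tags != expected:
        raise ValueError(f"expected element order {expected}, found {tags}")
    return {tag: (root.findtext(tag) or "").strip() for tag in METADATA_ELEMENTS}

topic = (
    "<topic>"
    "<product>Widget 2000</product>"
    "<audience>administrator</audience>"
    "<task>reset the password</task>"
    "<body>Open the admin console and choose Reset.</body>"
    "</topic>"
)
record = metadata_record(topic)
print(record)
```

Because the record is derived from the same elements the writer filled in, there is no separate tagging step to forget or to dash off at the end.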
The key to creating quality content consistently, therefore, is to put metadata first and the content second.
The whole purpose of having a content strategy is to have a structure or outline of what content you plan to present in what context, whether it is for a blog, a website, or even a single article. I’ve never been able to complete a writing project until I, at least mentally, if not in a formal outline, developed a structure that enabled me to introduce a topic, cover it in a structured order, and conclude it appropriately. As a student, and later as a technical writer, once I could “see” the structure, the writing would start to flow, and seem to finish itself. I thoroughly agree that the structure has to come first. You may still want to index it or append additional metadata tags that you did not think of initially, but without the basic structure, how do you know where to start?
Thanks for the comment. You are absolutely right: plans and outlines are metadata. They are created before the content, and they shape how it is written. The difference between structured and unstructured writing is not so much that structured writing has metadata and unstructured does not, but that structured writing unites the metadata and the content, whereas unstructured writing separates them, and often disposes of the metadata after the writing is done. But people trying to figure out where to start in the structured use of metadata should indeed look to all the planning and outlining tools they have used in the past as a starting point for defining the kinds of metadata they need to collect.
This is a really nice way of putting into perspective the real difference between structured and unstructured authoring, especially for those who believe that structured authoring is only for big organizations where content reuse and information mapping are a prerequisite.
I just want to add here that considering that this is a one-time activity, the approach may change for authors after the schema is created and things have been streamlined in terms of metadata tagging and structured validations.
Thanks for the comment, Preeti,
Personally, I’ve always felt that structured writing is harder for large organizations than for small ones. Large organizations often believe that the only way to make structured writing affordable is to develop a single structure for use across the entire organization. But it is a gargantuan activity to develop a single structure that will satisfy everyone in a large organization, so these projects turn out to be expensive and time-consuming, which makes small organizations think that they can’t possibly afford them. In fact, smaller organizations with less diverse needs can have a much easier time developing a working structured writing system. Larger organizations could too, if they stopped trying to please everyone with a single monolithic system.