Links are expensive. That’s a problem, because the web is, and always has been, a hypertext medium, a medium of links. Links, as I have argued previously, are the last mile of findability. Links are how readers move around in your content. More importantly, links are what keep readers in your content, rather than Googling off to who knows where. SEO is how you get eyes on your content; links are how you keep them there.
What if you could create more links in less time? You can. I call it soft linking.
Creating links by hand takes time. First there is the discovery time, the time it takes to find a topic to link to. The larger the set of topics you are dealing with, the longer discovery takes for each link. Then there is maintenance time, the time it takes to fix all the links that are affected when a topic they link to changes, splits, or moves. Links are created and maintained by inspection, and the more topics you have, the more topics you have to inspect.
Content management systems can help with both discovery and maintenance, but only to a degree. Through search, they can help you narrow down the topics you have to inspect. But you still have to inspect the topics that the CMS helps you find to ensure they are what you want to link to. And even with a CMS, the cost of links still increases as the number of topics increases.
Inspection is not a good way to handle large data sets. In particular, inspection is not a good way to handle frequent operations on large data sets. If you want rich linking in your topics, that is, a link to every subject, concept, or object that you mention that the reader might want to know about, that could easily mean 10, 20, or 30 links in a single topic. Creating that many links in all of your topics by inspection, even for a small data set, is virtually impossible.
There are a couple of ways to get around this. One is to crowd-source your linking. Wikipedia articles tend to be richly linked, because Wikipedia has thousands of editors constantly adding to and maintaining those topics. The other is to form your links by queries. Amazon pages are full of links, all of which are formed by querying their underlying databases.
You probably are not in a position to crowd-source linking for your documentation set. But you could use queries. The basic technique is very simple, and its use goes back at least to the era of the encyclopedic CD-ROM, where it was used to assemble and link content from many different sources. Here’s an example from that era, a passage from the review of Rio Lobo in Leonard Maltin’s Movie and Video Guide (Signet, 1994) that became part of a famous movie CD-ROM:
<p>Hawks' final film is a lighthearted Western in the Rio Bravo mold,
with the Duke as an ex-Union colonel out to settle some old scores.</p>
Here’s the same passage with the significant subjects that are mentioned marked up for soft linking:
<p><director name="Howard Hawks">
Hawks'</director> final film is a
lighthearted Western in the <movie>Rio Bravo</movie> mold, with
<actor name="John Wayne">the Duke</actor> as an ex-Union colonel
out to settle some old scores.</p>
There is no explicit link markup here. Instead the passage uses what I call “mention markup” to make explicit what subjects particular words are referring to. This mention markup captures the information we need to form a query. Here’s what such a query might look like in SQL:
SELECT topic.name
FROM topic
JOIN "index" ON "index".topic_id = topic.id  -- join columns are assumed
WHERE "index".type = 'actor' AND "index".key = 'John Wayne'
Here’s the same query expressed in XPath:
This query would return the name of a topic whose index contained an entry of the type “actor” with a key value of “John Wayne”.
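For anyone who wants to try the XPath form of this lookup, here is a minimal sketch in Python. It assumes a repository in which each topic element carries a name and an index of entry elements with type and key attributes. Those element and attribute names, the sample topics, and the function find_topic_names are all hypothetical, chosen only to illustrate the idea. The standard library's ElementTree supports only a subset of XPath, so the comment shows how the query might read in a full XPath 1.0 engine, and the code splits the work between a path expression and a Python filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic repository. The element and attribute names
# (topic, name, index, entry, type, key) are assumptions for illustration.
REPOSITORY = """\
<topics>
  <topic>
    <name>John Wayne biography</name>
    <index>
      <entry type="actor" key="John Wayne"/>
    </index>
  </topic>
  <topic>
    <name>Rio Bravo review</name>
    <index>
      <entry type="movie" key="Rio Bravo"/>
    </index>
  </topic>
</topics>
"""

# In a full XPath 1.0 engine the query might be written as:
#   //topic[index/entry[@type='actor' and @key='John Wayne']]/name
# ElementTree implements only part of XPath, so the outer predicate is
# applied in Python and the inner one in the path expression.


def find_topic_names(root, subject_type, key):
    """Return the names of topics whose index has a matching entry."""
    # Limitation of this sketch: a key containing a single quote
    # would need escaping before being placed in the path.
    entry_path = f"index/entry[@type='{subject_type}'][@key='{key}']"
    return [
        topic.findtext("name")
        for topic in root.iter("topic")
        if topic.find(entry_path) is not None
    ]


root = ET.fromstring(REPOSITORY)
print(find_topic_names(root, "actor", "John Wayne"))
```

The author of a passage never sees any of this. They supply only the type and the key, and the lookup decides which topic, if any, matches.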
Now, when your build scripts encounter the markup <actor>John Wayne</actor> or <actor name="John Wayne">the Duke</actor>, they run a query against the available topics to find a topic to link to. The link is formed automatically, and without the author of the original piece having to do any inspection of the repository to find the topic on John Wayne.
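The build step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular toolchain: the TOPIC_INDEX dictionary stands in for the query against your topics, the URLs are invented, and the choice to turn an unmatched mention into a plain span (rather than fail the build) is mine. The function name resolve_mentions and the MENTION_TYPES set are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: (mention type, key) -> topic URL.
# In a real build this would be the result of querying the topic set.
TOPIC_INDEX = {
    ("actor", "John Wayne"): "/people/john-wayne.html",
    ("movie", "Rio Bravo"): "/movies/rio-bravo.html",
}

# Element names treated as mention markup (an assumption of this sketch).
MENTION_TYPES = {"actor", "director", "movie"}

PASSAGE = (
    '<p><director name="Howard Hawks">Hawks\'</director> final film is a '
    "lighthearted Western in the <movie>Rio Bravo</movie> mold, with "
    '<actor name="John Wayne">the Duke</actor> as an ex-Union colonel '
    "out to settle some old scores.</p>"
)


def resolve_mentions(xml_text, topic_index):
    """Replace mention markup with links where a topic can be found."""
    root = ET.fromstring(xml_text)
    for elem in list(root.iter()):
        if elem.tag not in MENTION_TYPES:
            continue
        # The key is the name attribute if present, else the marked-up text.
        key = elem.get("name") or " ".join((elem.text or "").split())
        url = topic_index.get((elem.tag, key))
        elem.attrib.clear()
        if url is None:
            # No matching topic: keep the text, but form no link.
            elem.tag = "span"
        else:
            elem.tag = "a"
            elem.set("href", url)
    return ET.tostring(root, encoding="unicode")


html = resolve_mentions(PASSAGE, TOPIC_INDEX)
print(html)
```

In this sketch the mention of the director produces no link because no topic on Howard Hawks exists yet. If such a topic is added later, the next build picks it up without anyone touching the passage.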
That’s it. Link formed by query instead of by inspection. More links in less time. Rich reliable linking for all of your topics — links that help to keep the reader’s eyes in your content.
There are obviously a couple of other details you have to deal with to make this robust. There are also a number of additional advantages to talk about. But these are matters for another day.
Thoughts? Comments? Questions?
Edit: March 6, 2012 — I have changed the term “reference markup,” originally used in this post, to “mention markup” as I have found that “reference markup” tends to get confused with the markup of references.