Structured vs Unstructured Hypertexts

One of the questions I am often asked about Every Page is Page One is whether it simply means write articles instead of books. But while articles are certainly much closer to the EPPO model than books, there is something more to EPPO than simply writing articles. EPPO is also about the relationships between articles/topics/pages (or whatever else we decide to call them).

A single article is seldom sufficient to cover a large subject. You often need to create a much larger content set, consisting of many articles/topics/pages, to cover a subject adequately and to meet your audience’s varied needs. In the age of the Web, however, in the age of Google and information foraging, a linear or hierarchical organization of content is no longer adequate, and does not match how modern readers approach content. We need another way to approach the organization of content — one I have termed bottom-up information architecture.

But a bottom-up information architecture is more than a bunch of articles with maybe some links between them. To make this clear, I want to draw a distinction between two kinds of hypertext, which I will call structured and unstructured hypertext.

Unstructured hypertext

A hypertext, generally, is a set of pages that are related to each other in non-linear ways and that you can read in any order you choose. Initially, to create a hypertext, the pages themselves had to contain links to each other, since that was the only way either to express or navigate the relationships between them. Hypertext networks grew up as writers started to link to each other’s pages. Thus the Web took shape in an informal and unstructured manner as writers put up web pages and linked to other people’s pages.

With the advent of the search engine, particularly of search engines capable of sophisticated detection of the subject matter of a page and sophisticated interpretation of a query string, we ceased to need explicit links to connect pages. A reader could summon a set of related pages from a search engine, and use search to navigate from a subject mentioned on one page to an article on that subject, which might come from halfway across the Web. Various forms of curation, social sharing, and social proof added to the variety of ways in which relationships between pages could be expressed.

With the aid of search engines and these other mechanisms, the Web became a vast unstructured hypertext, which has changed the world. The Web’s lack of structure has scandalized many, and continues to do so, but its overwhelming world-changing success has demonstrated the extraordinary power of unstructured hypertext. David Weinberger’s book, Everything is Miscellaneous, provides the best explanation I know of for this seemingly paradoxical success. In a world far too miscellaneous to be reduced to a single system (Too Big To Know, in another Weinberger phrase and title), the ad hoc organizing activity of millions of readers, not to mention a great many algorithms, achieves a level of findability and relationship discovery that ordinary forms of organization can’t match, and one that is genuinely navigable by almost everyone.

And an inestimable bonus of this collaborative bottom-up organization of the unstructured Web is that to create an unstructured hypertext (or, more realistically, to contribute to the vast unstructured hypertext that is the Web), all you have to do is write a page and post it. The Web takes care of the rest.

It helps, of course, if you act as a good Web citizen and throw in some links, both internal to your content, and to other people’s content. But even if you don’t, if your content is any good, people will find it and link to it and Google will find it and index it, and it will be part of the world’s great unstructured hypertext.

To succeed in this great unstructured hypertext, it is important to create pages that actually work when people arrive at them in hypertext fashion. This is why every page you write needs to work as page one. You can put up a book, if you want to, and it will get indexed and show up in searches, if someone tells Google where to find it. But it is not going to work well for readers and then, because the Web is a vast filter as well as a vast hypertext, it will sink down into the vast black goo at the bottom of the Web, seldom to find eyeballs ever again.

In fact, that is one of the key defining characteristics of unstructured hypertexts: they work, and they only work at all, because they function as filters. A system that opens itself up to all contributions from anyone who wishes to contribute must operate first and foremost as a filter, or nothing of note or interest would ever be found.

That is the quality process of an unstructured hypertext: bottom-up collaborative and/or algorithmic filtering.

Today, if you want to be read, your content has to work well in an unstructured hypertext environment. You have to get filtered in by the Web. Ignore this arbiter of taste and quality and your audience is going to be small. (Occasionally, of course, you may be fine with a small audience, if it is the right audience. But let’s talk about the rest of the time.) Every Page is Page One is very much about creating pages that work in an unstructured hypertext environment, and if that was as far as you took it, you would do much better on the Web.

Structured Hypertexts

Nonetheless, Every Page is Page One is about something more than creating good pages to add to an unstructured hypertext. But if unstructured hypertexts are so amazing, why would you bother to take the extra steps to create a structured hypertext?

To be clear, I am not one of those people who believe that the Web is fundamentally broken and that the only way to save it is to apply structure across the whole thing. Such arguments simply fail to notice the enormous power that lies in bottom-up collaborative filtering and has built the modern Web. The Web, despite its many anomalies and occasional dead ends, works brilliantly. Its unstructured hypertext system may well be strengthened by new technologies, but it will not be done away with. Unstructured hypertext is here to stay.

But while the Web as a whole is an unqualified and unparalleled success as an unstructured hypertext, some of the most valuable and popular properties on the Web are very much structured hypertexts.

By structured hypertext, I mean a body of content that is designed to work together as a hypertext, and that has systematic policies in place to ensure broad coverage goals are met, that topics are chosen and structured to work with and support each other, and that they are connected in a consistent and deliberate fashion.
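To make that definition a little more concrete, here is a minimal sketch in Python of what checking such policies mechanically might look like. Everything in it is an assumption made for illustration: the `Topic` class, the `REQUIRED_SECTIONS` set, and the `check_hypertext` function are hypothetical names, not part of any EPPO tool, and the rules it checks (every topic follows the same structure, every link resolves to a topic in the set, every topic links out and is linked to) are only one possible reading of "consistent and deliberate."

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical policy: every topic in this content set has these sections.
REQUIRED_SECTIONS = {"summary", "body", "related"}


@dataclass
class Topic:
    """One page in the content set (an illustrative model, not a real API)."""
    name: str
    sections: set[str]
    links: set[str] = field(default_factory=set)  # names of topics linked to


def check_hypertext(topics: list[Topic]) -> list[str]:
    """Return a list of policy violations for the content set."""
    known = {t.name for t in topics}
    # Topics that receive at least one link from a different topic.
    linked_to = {
        target for t in topics for target in t.links if target != t.name
    }
    problems = []
    for t in topics:
        missing = REQUIRED_SECTIONS - t.sections
        if missing:
            problems.append(f"{t.name}: missing sections {sorted(missing)}")
        for target in sorted(t.links - known):
            problems.append(f"{t.name}: link to unknown topic {target!r}")
        if not t.links:
            problems.append(f"{t.name}: links to no other topic")
        if t.name not in linked_to:
            problems.append(f"{t.name}: no other topic links here")
    return problems
```

Calling `check_hypertext` on a list of `Topic` objects returns an empty list when the set meets these assumed policies. A real content set would have its own rules; the point is only that in a structured hypertext such rules exist and can be applied to every topic.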

While you can create a structured hypertext that is entirely self-contained and independent of the Web (such as the encyclopedic CD-ROMs of the 90s), if you place a structured hypertext on the Web, each page is also part of the broader unstructured hypertext of the Web, and can benefit from all that being part of the broader hypertext can bring. But this also means that any such page is also an entry point (a landing page) into the structured hypertext. As such, it is potentially capable of capturing and holding the reader’s attention in that structured hypertext, which, if well designed and on topic for the reader, should be significantly easier and more rewarding for the reader to use.

The important thing here is that the reader is not switching navigation modes. They are not going from hypertext navigation to book navigation or help system navigation or hierarchical site navigation. For them, the hypertext navigation is seamless. They have simply stumbled into an area of superior hypertext (both in terms of navigation and content) where their needs are more likely to be met with less effort.

This is one of the most compelling arguments for taking a structured hypertext approach to your content. Whatever the virtues of other forms of organization you could adopt, if most of your readers arrive from an unstructured hypertext environment, moving onward in your content will require them to switch navigational paradigms. This involves cognitive overhead for the reader, which makes their task harder.

You can argue, if you like, that if they actually used your navigational scheme it would make finding the content easier, but that ignores both the cognitive load of switching modes and the fact that the rest of the Web beckons as an easier hypertext alternative. Climbing one big step may be the shortest route to a goal, but most people will take the long way round with the easier steps. We are far more likely to choose the lowest effort next step than attempt to work out the most efficient overall route. That is just how we are wired. After all, making that calculation is itself part of the cognitive load of way-finding. We are wired to conserve our energy within the bounds of our current horizon. By presenting your content as a structured hypertext, you make the next step for the reader who has arrived using hypertext techniques as easy as possible.

This is exactly what happens when a reader arrives at a structured hypertext like Wikipedia, Amazon, YouTube, or Stack Overflow. Consistently structured, good-quality content with a consistent navigational scheme means that the reader has little reason to wander out of the place they have arrived in.

And this is not about the reader consciously saying to themselves, “Oh, it’s nice here, I think I will stay for a while.” The reader may or may not consciously note that they have arrived in a particular place, or that they are navigating around a local hypertext rather than the wider Web. They probably have not made much of a conscious decision to stay in this local content set rather than return to the Web. After all, this content set is also part of the Web. It is just a part of the Web that works a little better for what they are trying to do. They stay because, and for as long as, it continues to work for them.

If filters are the engine of quality for unstructured hypertexts, structure is the engine of quality for structured hypertexts. This is why following a consistent structure, staying on one level, assuming the reader is qualified, and linking richly are all important principles of EPPO topic design. They all deal with creating topics that work as nodes of a structured hypertext.

And this is why there is more to EPPO than writing articles instead of books. Writing articles instead of books will help your individual pages perform much better in an unstructured hypertext environment, and that can be a substantial win over putting books online. But that is not enough to capture the reader’s attention in any sustained way. For that you need to create a hypertext experience that is superior, in the local domain, to the experience of the broader Web. Thus a bottom-up information architecture is about creating a structured hypertext.

We can’t structure the whole Web. The diversity of subjects and their relationships, and the diversity of people seeking information, guarantee that at the large scale, everything is miscellaneous, and unstructured hypertext is the best way to access it all. But within a local information domain, a structured hypertext can provide a compelling experience for the reader and help the writer and their organization advance their goals by attracting and holding the attention of readers.


2 Responses to Structured vs Unstructured Hypertexts

  1. Kel Mohror 2015/06/11 at 20:48

    In the paragraph describing “principles” you mention “assuming the reader is qualified.”

    How do you define “qualified”?

    • Mark Baker 2015/06/11 at 23:48

      Qualified means the person who normally does the task that the topic supports. Thus a recipe is written for someone who knows how to cook, a knitting pattern for someone who knows how to knit, a JavaScript API reference for someone who knows how to program in JavaScript.

      Inevitably, some portion of your readers will not be qualified, and the temptation is then to add material to support them — tell people how to crack an egg or call a function — and then topics start to lose cohesion, and don’t work as well for the readers who are qualified. Information for readers who are not qualified should be provided by links to other topics, not by including extra information inline.