Web Organization is not Like Book Organization

By | 2013/03/19

One of the most difficult aspects of moving content to the Web is that webs are not organized like other things — books in particular. And the difference is not small. It is not that web organization is somewhat different from book organization. It is so different that you can’t even look at web organization the way you look at book organization.

And that may be the biggest problem in moving content to the Web. We are used to being able to look at the organization of our content in a particular way, from the top down, and that does not work on the Web. That makes the difference very difficult to get used to.

Traditional book organization is either linear, or hierarchical. The organization is expressed through a table of contents, which is either a simple ordered list or a nested/hierarchical list like this:

Content arranged hierarchically, as in a TOC.


In this arrangement, you can see the organization as a whole, from the outside, as it were, looking down on the work.


A TOC provides a top-down view of the organization of the content.

In a book, the organization is not apparent from an individual page. A page might, at most, contain headings that locate the page in a particular chapter or section, but it does not show you how that chapter or section is related to the rest of the book. For that you have to turn to the table of contents — a separate page or set of pages that describes the organization. The organization is clear and visible from the outside, but invisible from the inside.

Page of a book.

The organization of a book is not apparent from the individual page.

Tri-pane help systems change this somewhat. They place the TOC in a pane alongside the page. You can now see where the page fits in the organization of the book because the content page and the TOC are side by side. (Some Web-based help systems don’t keep the TOC in sync with the current page if the user navigates by non-TOC links, so then you are back to a separate page and TOC.)

Tri-pane help system

Tri-pane help puts the page and the TOC side by side so you can see the organization of the whole help system.

But, as Tom Johnson has noted, when you start to pour more and more content into the help system, the TOC becomes so large that it becomes unwieldy:

Browsing also becomes problematic when you have 4,000 topics in the table of contents (the one-stop-shopping model). Browsing through books and sub-books and sub-sub books and sub-sub-sub books to find the right topic is tedious.

If you expand out all levels of the table of contents, the information starts to look really complex. Users may feel intimidated and overwhelmed about where to even begin.

It becomes more and more difficult to get a sense of the organization of the whole in one glance, from the top down. There is simply too much stuff. And if you deal with the volume by introducing more layers of hierarchy, finding the position of any particular piece of content becomes less and less intuitive as the organization of the hierarchy becomes more and more arbitrary.

Webs don’t work this way at all. Pages in a web do not hang off a table of contents like ornaments on a Christmas tree. Web pages link to each other along multiple lines of subject affinity. (I introduced the concept of subject affinity in A New Approach to Organizing Help.) There is no page one in a web, and no set order of pages. If there were, it would not be a web. It is the multiplicity of links and the lack of a fixed starting point that define any system as a web rather than a sequence or hierarchy.

This is what happens if you try to draw a map of a web:

Map of a simple web

With only a few nodes, the map of a web remains comprehensible.

With only a few nodes in the web, you can make some sense of its overall organization, but when you add more nodes and more links, all semblance of order, and all hope of comprehension, disappears from the top-down view:

A complex web.

When a web grows large and complex, its organization becomes incomprehensible when viewed from the top down.

We saw with books that you could not see the organization from the individual page; you needed to look from the top down, at the map of the book created by a table of contents. With a web, the exact opposite is true. You cannot see the organization from the top down. But, in a properly constructed web page, the organization is contained in the page itself.


A Wikipedia article.

A properly constructed web topic shows how the web is organized around it, and allows navigation along multiple subject affinities.

This Web page, the Wikipedia article on the Manicouagan Crater, shows how it is related to the web to which it belongs in multiple ways. On the left, it shows its relationship to the same subject in different languages. On the right, it places its subject in space and provides the standard metadata of its subject area, such as age, diameter, and whether the crater is exposed. In the body, links connect it to all the significant related subjects that are mentioned in the text. At the bottom, the subject is located in the structure for the two academic disciplines which study it: astronomy and geology. Thus it is mapped into the web in which it resides along multiple lines of subject affinity.

Look at those relationships, and all the other crisscrossing relationships of all the other topics in the web, from the top down, and all you will see is chaos. Look at the individual topic, however, and its position in the Web, its organizational relationship to the Web, is crystal clear.

Of course, the page does not contain the organization of the entire Web. In the nature of a web, that total view is impossible. What it contains is the organization of its vicinity within the Web. When you navigate a web, you do not go from the top down. You go from one node to a related node. You navigate through the web from one page to another along the lines of subject affinity expressed by the links in the content.
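The idea that a page carries only the organization of its own vicinity can be sketched as a small link graph. This is an illustration only, not anything from a real site: the page names, the `WEB` dictionary, and the functions `vicinity` and `within_hops` are all hypothetical. The point is that nothing in the structure is a root; every question about organization is asked from a particular page outward.

```python
from collections import deque

# Hypothetical mini-web: each page maps to the pages it links to.
# There is no root entry and no fixed order of pages.
WEB = {
    "crater": ["impact-event", "reservoir", "geology"],
    "impact-event": ["crater", "asteroid"],
    "reservoir": ["hydroelectricity", "crater"],
    "geology": ["impact-event"],
    "asteroid": ["impact-event"],
    "hydroelectricity": ["reservoir"],
}


def vicinity(web, page):
    """Pages linked directly from `page`: the organization visible on the page itself."""
    return list(web.get(page, []))


def within_hops(web, start, max_hops):
    """Breadth-first walk from `start`.

    Returns a dict mapping each page reachable in at most `max_hops`
    clicks to the number of clicks needed to reach it.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if seen[page] == max_hops:
            continue  # do not follow links beyond the hop limit
        for target in web.get(page, []):
            if target not in seen:
                seen[target] = seen[page] + 1
                queue.append(target)
    return seen
```

Calling `vicinity(WEB, "crater")` gives the three pages the reader can reach in one click, and `within_hops(WEB, "crater", 2)` shows how quickly a handful of local links covers the rest of this small web, without any page ever needing a global map.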

Letting go of the top-down view

For the author who has made their career writing books, this lack of a top-down view can be the most difficult thing to adapt to when they start to write for the Web.

Authors rightly feel that it is their responsibility to organize content for readers. This does not necessarily mean that the author expects readers to read sequentially, but at least they feel that they should provide clear navigation that allows the reader to easily select and navigate content for themselves. But how can you tell if you have done a good job of that — and how can you demonstrate to your boss that you have done a good job of that — if you cannot see the map of your content organization as you work, and they cannot see it when you are finished?

The writer’s desire to produce a visibly organized set of content is therefore very understandable. And yet, completing that task in a Web environment has proved an insurmountable challenge. There is too much content and the significant relationships between pieces of content are too numerous to be flattened into a comprehensible or navigable hierarchy. The bottom line is, it does not serve readers well.

Any top-down navigation scheme for content becomes untenable as the size of the content set grows. (This is why encyclopedias have been in alphabetical order for the last several centuries. None of the many schemes proposed for a universal tree of knowledge have proved remotely workable.) Dividing content into several books allows the author to create a top-down organization for each book, but as soon as the reader has a problem or an inquiry that requires them to go out of the current book, they are dropped into the arduous task of finding the right book.

Fobbing off the reader onto the librarian

Book finding is outside the author’s responsibility. It is the librarian’s field of expertise, not the writer’s, and so producing a book with a good top-down organization can give the writer a feeling of satisfaction that they have done their job well. But for the reader whose needs and interests span more than one book, navigating between books is a far more onerous task than searching within a book. If we take the whole of the reader’s need into account, we see that merely providing isolated top-down-structured books is not meeting their whole need at all.

A fully traversable web does far more to meet the user’s total information needs. It does so by search and by linking. Search is essentially a jump out of the web and a parachute back into it at a different point. Links are the pathways of the Web itself. They are the strands that define a web as a web. Links can take the reader both far and near, along all the various lines of subject affinity that converge in a topic. The parachute jump that is search often does not land you in the precise place you want to be, but links can then take you the last mile of your journey.

If the links are missing, people can construct them for themselves using the search function, though this is necessarily less exact, and more time-consuming for the reader. Rich and accurate linking provides a more navigable route through the immediate vicinity and distant affinities of the content. (Jared Spool has recently published research results that indicate that links provide better navigation success than site searches.)

Adopting the local, bottom-up view of content organization

Web organization is always local, therefore. Not local in the sense that one page only links to pages close to it (if the concept of closeness has any meaning in cyberspace), but in the sense that the navigational and organizational clues and tools that it provides are particular to the individual page. They are local to its subject matter. Each page has its own set of associations and affinities. Some of those affinities are local, and some are distant, but they are all related to the current place.

This contrasts markedly with books, in which local navigation, though possible through footnotes and cross-references, is rare and often entirely absent. The only organization and navigation provided by most books is global. The reader who wants to trace a subject affinity, whether inside or outside the book, is generally given no local navigational options; they must go to the global TOC or index.

To describe the book’s organizing principles as “global”, though, means only global to the book itself. They are not global in the sense of being based on a global information resource or a global view of the subject matter. Content on that scale defies global organization, meaning that only a web can organize content on that scale, and that it does so not with a global view, but with a local view based on the subject affinities of whatever content object the reader is currently reading.

And, of course, it is likely that wherever the reader is now, unless they are hopelessly lost and in entirely the wrong place, the place they will want to go next is a place that has one or more subject affinities with the place they are now. (If they are lost, then search is their friend.) By focusing on the subject affinities of the reader’s current location, a web provides a far more usable structure and navigation of a vast content set than the book’s top-down approach could ever manage.

Most emphatically, the switch to web-style bottom-up organization does not mean an abandonment of organizing content as part of the writer’s task. But the writer can fairly ask how they are supposed to organize when they cannot see the organization they are creating. The answer is that an organized web requires a set of organizing principles that are based on cataloging the types of subject affinities that exist in the content set and setting and enforcing policies on how those affinities are expressed and related in each topic in the web. The use of appropriate structured writing tools can help enormously in enabling, enforcing, and auditing content based on these principles.
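As a rough sketch of what auditing against such principles could look like, suppose each topic records its type and its links grouped by the kind of subject affinity they express. Everything here is an assumption made for illustration: the topic types, the affinity kinds, the `POLICY` and `TOPICS` structures, and the `audit` function are hypothetical and do not describe any particular structured writing tool.

```python
# Hypothetical policy: for each topic type, the kinds of subject affinity
# that every topic of that type is expected to express as links.
POLICY = {
    "task": {"concept", "reference"},
    "reference": {"task"},
}

# Hypothetical content set: each topic records its type and the links it
# carries, grouped by the kind of affinity each link expresses.
TOPICS = {
    "install-widget": {
        "type": "task",
        "links": {"concept": ["what-is-a-widget"], "reference": ["widget-options"]},
    },
    "configure-widget": {
        "type": "task",
        "links": {"concept": ["what-is-a-widget"]},
    },
    "widget-options": {
        "type": "reference",
        "links": {},
    },
}


def audit(topics, policy):
    """Return {topic_id: sorted affinity kinds the policy requires but the topic lacks}."""
    problems = {}
    for topic_id, topic in topics.items():
        required = policy.get(topic["type"], set())
        # An affinity kind counts as present only if it has at least one link.
        present = {kind for kind, targets in topic["links"].items() if targets}
        missing = sorted(required - present)
        if missing:
            problems[topic_id] = missing
    return problems
```

Here `audit(TOPICS, POLICY)` reports that `configure-widget` lacks a reference link and `widget-options` lacks a task link. The writer never looks at a map of the whole; the check is made topic by topic, which is the bottom-up view applied to quality control.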

This, then, is the great hurdle that tech comm needs to overcome to really start delivering content to the Web. It has to learn how to break itself of the top-down organizational schema of the book world, and learn to adopt, to create, and to manage, the bottom-up organizational schema that supports an information collection as vast, as vibrant, and as fluid as the Web.


15 thoughts on “Web Organization is not Like Book Organization”

  1. Alex Knappe

    Nice post, Mark. You’re speaking of rich linking and local navigation, which is very helpful if done right.
    But this is a problem. Doing it “right” is a single point of view, even a matter of taste. It is a legacy issue, we carried over from the book world to the web world.
    Even plain searches encounter this problem. While some people do have a strong “Google-Fu”, others don’t.
    I’m encountering this issue every day on forums, where people try to use the search functionality, but seem not to find what they were seeking – even if the same question has been answered dozens of times already and the information they seek is definitely out there.
    The same issue arises with links. People put expectations into links, they follow them, stray away from their original path if their expectations aren’t met and get lost.
    What could we do to prevent this? Is there a way we could avoid frustration on the reader’s side?
    I think there is a way. What we need is a hybrid of search and link functionality – something I would call dynamic linking. Essentially, dynamic linking would create links within a topic, based on the search pattern and previous page calls on the individual reader.
    While you have to use the right terms for a successful search or have a vague idea where a link leads you to, dynamical linking would combine your search pattern to lead you to potentially relevant other topics.
    This asks for distinct algorithms that extrapolate what you are really searching for, but the technique already exists. Google places ads based on your individual search patterns. Online shops propose products based on your earlier page visits.
    Dynamical linking is all about pattern recognition and combining it with the writer’s own mindset. Most like a translation writer/reader – reader/writer.

    Tagged books, content, Every Page is Page One, hierarchy, navigation, organization, readers, subject affinity, the Web, writers.

    This is your mindset. Mine might have been something else. Dynamical linking would bring those two together.

    1. Mark Baker Post author

      Thanks for the comment, Alex.

      I’m not convinced by the case you make for dynamic generation of links, for several reasons:

      1. However much information we may be able to gather about the user (and, of course, users are trying harder and harder all the time to restrict the amount of information we are allowed to collect about them), I’m not convinced that we will ever know enough to know what their particular current task is.

      Sites like Amazon do a fair job of creating a general profile that is useful for general recommendations, but these algorithms are easily thrown off by things related to specific tasks. For instance, my wife loves travel books, so I occasionally buy one from Amazon as a present for her. This throws a monkey wrench into the recommendation system because it tries to aggregate those purchases into my profile and starts to suggest books that combine travel with my other interests.

      To make it all work, Amazon would need to ask me when I was buying a book for my wife, and ideally combine what I buy for her with her profile and use her profile to suggest books I should buy for her. Except that my wife never buys books for herself from Amazon, so all her purchases are for family members, and her profile has no relationship to her tastes at all. So then she would have to specify who she was buying for, etc. etc. Privacy concerns aside, people are not going to go to that much trouble.

      2. Most linking is based on subject affinities, not user tastes. Which links users choose to follow is based on their tastes, but that does not create a need for user-specific link destinations. In fact, it generally avoids it.

      3. We have not remotely developed a practice of adequately and consistently linking the content we put on the web along lines of subject affinity, so it is a bit soon to conclude that we need to do something more complex. Let’s get really good at consistent static linking before we decide we need to do more.

      4. I don’t think it is our job to keep the user from getting distracted. They are grownups. If we were building shopping sites, of course, we would have a commercial interest in trying to herd the user to the checkout, but in the tech comm space, where we cannot even be sure what the user’s current goal is, we should focus on giving them the options they need to direct themselves.

      5. There is nothing that says every link must have one and only one destination. If there is more than one resource a user might want when they click on a link, we can present them a popup with multiple choices. (When you implement a system to generate links automatically you need a system like this anyway).

      6. Systems that change every time you use them are frustrating for users because they rob them of the ability to navigate by memory. If such a system worked perfectly every time, the user might not mind, but since that is not remotely possible, having the system’s behavior vary as it tries to intuit the reader’s intent will only make the user’s experience more frustrating.

      Making a static content set truly navigable according to the individual user’s individual need is something we have yet to do consistently, but from a UX point of view, it offers a combination of simplicity and flexibility that should provide for a great UX. We just have to learn to build it.

  2. Maish Nichani

    Nice stuff Mark! As we deal with increasing volumes of content on websites and intranets this approach becomes a viable way out. We explored a similar concept sometime back and used a framework for gathering content for target pages, what Wikipedia calls ‘topic’ pages. See, http://pebbleroad.com/perspectives/taming-your-target-content

    The framework revolved around three types of content: Pertinent, Relevant and Action. Pertinent=core content, the real body. Relevant=Neighbourhoods. Action=stuff you want to do with the content, e.g. share, compare, calculate, etc.

    When it came to writing such pages, we again used the framework to teach intranet authors on crafting such pages.

    Would love to know how you manage the writing of such pages.

    1. Mark Baker Post author

      Thanks for the comment Maish.

      I like your concept of target, relevant, and action content. I wonder if the difference between target and relevant content is absolute or dependent on the reader’s interest. For instance, for a reader with a different interest, the listing of all the scores in the tournament might be target content and the individual match reports the relevant content.

      I think the notion of closure for target content is very important, and I think the way you achieve it is through structured writing, both in the writing-conforms-to-template sense, and in the data-conforms-to-schema sense. I think that approaches that use generic schemas miss a huge opportunity to define and enforce a structure that is known to achieve closure.

  3. Pingback: Weekly List Bookmarks (weekly) | Eccentric Eclectica @ ToddSuomela.com

  4. Pingback: The Nonlinear Organization of Web Content | Beyond Help

  5. Pingback: The Nonlinear Organization of Web Content » eHow TO...

  6. Roy MacLean

    Mark – totally agree with this post.

    If the ‘subject affinities’ of a page – cross references, keywords, formal names – are represented explicitly in the page mark-up (say XHTML), then it’s possible to generate sets of links via query (I’m experimenting with XQuery). These link sets can be generated as static pages – a kind of indexing of a set of content pages – or perhaps dynamically. A link set can be generalized as a function from a Source page and a (typically large) Context set of pages to a (typically small) subset of the Context.

    This means that I can have links not only to pages that a Source page references, but also to pages that reference the Source (inversion), or have an indirect relationship with it (e.g. pages that reference the same pages that the Source page references – pages that are ‘like me’ in some way). This is fairly obviously applicable to ‘model-based’ documentation. I’m also looking at how it could be applied to ‘narrative navigation’: how to have ‘next page’ links that reflect different reading strategies – such as level of detail or page type.
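The "link set as a function" idea in the comment above can be sketched in a few lines. This is not the commenter's XQuery; it swaps in plain Python and a made-up in-memory context purely for illustration. The page ids, the `CONTEXT` dictionary, and the function names `references`, `referenced_by`, and `like_me` are all hypothetical.

```python
# Hypothetical Context: page id -> set of page ids that the page references.
# In practice this would be extracted from the page markup.
CONTEXT = {
    "pump": {"valve", "motor"},
    "valve": {"seal"},
    "motor": set(),
    "filter": {"valve", "motor"},
    "seal": {"pump"},
}


def references(source, context):
    """Direct link set: pages that the Source page references."""
    return set(context.get(source, set()))


def referenced_by(source, context):
    """Inverted link set: pages in the Context that reference the Source."""
    return {page for page, refs in context.items() if source in refs and page != source}


def like_me(source, context):
    """Indirect link set: other pages that reference at least one page the Source also references."""
    mine = context.get(source, set())
    return {page for page, refs in context.items() if page != source and refs & mine}
```

Each function takes a Source and a Context and returns a small subset of the Context, which is the shape described in the comment. For the `pump` page, the direct set is `valve` and `motor`, the inverted set is `seal`, and the "like me" set is `filter`, because `filter` references the same pages that `pump` does.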

    1. Mark Baker Post author

      Thanks for the comment, Roy.

      Personally, I’m not particularly interested in tracing back links. I’m interested in removing linking as a concern for the author and having links generated based on the indexes of topics and the semantics of the topic markup. In other words, links express relationships between subjects, not between topics. In that same vein, I don’t think it is our business to be offering a “next page” link at all, whether static or dynamic. It is up to the reader to determine what they want to read next. Every Page is Page One.
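One way to picture links that express relationships between subjects rather than between topics is a lookup from subject to the topics indexed on it. The sketch below is an assumption-laden illustration, not a description of any actual tool chain: the subject types, the `INDEX` and `MENTIONS` data, and the `resolve_links` function are hypothetical.

```python
# Hypothetical index: (subject type, subject name) -> topics that declare
# themselves to be about that subject.
INDEX = {
    ("command", "widget-init"): ["ref-widget-init"],
    ("concept", "widget"): ["what-is-a-widget", "widget-overview"],
}

# Hypothetical output of semantic markup: the subjects a topic's text
# mentions. The author names subjects, never link targets.
MENTIONS = [
    ("command", "widget-init"),
    ("concept", "widget"),
    ("command", "widget-purge"),  # no topic is indexed on this subject yet
]


def resolve_links(mentions, index, current_topic=None):
    """Map each mentioned subject to the topics indexed on it.

    Subjects with no indexed topic are left unlinked, and a topic is
    never linked to itself. A subject may resolve to several topics.
    """
    links = {}
    for subject in mentions:
        targets = [t for t in index.get(subject, []) if t != current_topic]
        if targets:
            links[subject] = targets
    return links
```

With this data, the mention of the `widget` concept resolves to two topics, so a presentation layer could offer the reader a choice, while the `widget-purge` mention simply stays unlinked until a topic on that subject is added to the index.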

      1. Roy MacLean

        I think we’re in agreement with your first point: “links generated based on the … semantics of the topic markup”. As regards the latter notion, I’m wondering how this applies when there is some ‘narrative’ or ‘logical’ ordering within the content. For example, if I have topics describing a problem (in a maintenance task, say), and solutions, then I probably want to read the former before the latter. It might be useful to offer (no more than that), links to ‘more detail on the problem’, and ‘move on to the solution, at this level of detail’ (generated as before from the content relationships).

        1. Mark Baker Post author

          Roy, yes, if the content has a prescribed order, and if you have chosen to present it as separate topics with a prescribed order between them (rather than keeping everything with a prescribed order in one topic), then you will clearly need links that implement the prescribed order.

          On the other hand, if you adopt an Every Page is Page One information design, then the first principle is that there is no prescribed order of topics, and that a topic (as presented to the user) is never less than the narrative minim.

          In this kind of design, all the links should be based on subject affinity and it is the reader, not the writer, who decides in what order the reader will consume the content.

          Adopting an EPPO design approach is based on the recognition that, do what you may, you cannot actually enforce a prescribed order on readers, and that they will jump around pursuing their own agenda in even the most strictly linear material (as John Carroll observed over 20 years ago).

          So, EPPO essentially says, if you can’t control them, facilitate them. Thus no prescribed order between topics, and rich linking based on subject affinity to assist readers in selecting their own path.

  7. Pingback: Findability is a Content Problem, not a Search Problem - Every Page is Page One

  8. Pingback: Critical Analysis 6 – DDSN360 Fall'19

  9. Pingback: DDSN 360: Critical Analysis 6 – Designs by Day

  10. Pingback: Critical Analysis 6 – Marlena’s blog
