Sometimes, Readers Do Classify their Experience

By | 2013/04/02

In my last post I argued that navigation based on classification schemes does not work because readers don’t classify their experiences. But while that is generally true, it is important to note that sometimes readers do classify their experiences, and that when they do, it is important that we base our navigation on those classifications.

For instance, here is a used car site that I think works really well:

Autocatch screen capture.

Car buyers classify cars according to the features they are looking for.

Here, car buyers can select the cars they want to view based on the ways that buyers commonly classify the features of cars that are important to them: year, transmission, price, mileage, body style, exterior color, and whether they are sold by a private seller or a dealer.

This scheme works because the typical car buyer already classifies the cars they are interested in this way. That is, the buyer has typically decided that they are looking, for example, for a sedan with under 100,000 km, with an automatic transmission, and no earlier than a 2008 model. They bring the classification scheme with them to the site, and the site responds by letting them select cars to view that match the criteria they already have in mind.

Other examples include

  • movie rental sites that allow you to select movies by genre, star, rating, awards, live vs. animation, etc.
  • medical sites like WebMD that allow users to select content based on where they hurt

Again, these are classifications that the user brings with them to the site. To assume that similar designs would work equally well if they were based on classification schemas that the author made up out of whole cloth, or even on ones that represent some real truth about the product internals, would be a serious mistake. It is not the particulars of the designs that make them successful, but the fact that those particulars correspond to the particulars of the user’s natural classification scheme.

One thing that is notable about these classification schemas is that many of them are not hierarchical, but consist of multiple independent variables. In Autocatch, the user can choose to narrow down their selection based on any criteria that interest them, and can enter those criteria in any order they feel like. Autocatch would be much less usable if the criteria were arranged hierarchically:

year
   transmission
      price
         mileage
            body style
               exterior color
                  private seller or a dealer

With that kind of organization, a buyer who wanted to see all convertibles regardless of year, transmission, price, or mileage would have to look in hundreds of different places to see all the cars they were interested in. However you rearrange the levels of the hierarchy, you will still create a situation in which many buyers have to look in many places to find what they are looking for.
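The difference can be sketched in a few lines of Python. This is only an illustration: the inventory, the field names, and the `select` and `branches` helpers are made up for the example and say nothing about how Autocatch is actually built. `select` treats each criterion as an independent variable, while `branches` lists the places in a fixed year/transmission/body-style hierarchy that a buyer would have to visit to see the same cars.

```python
# Hypothetical inventory; the fields mirror the criteria named in the post.
CARS = [
    {"id": 1, "year": 2008, "transmission": "automatic", "body_style": "sedan"},
    {"id": 2, "year": 2010, "transmission": "manual", "body_style": "convertible"},
    {"id": 3, "year": 2008, "transmission": "automatic", "body_style": "convertible"},
    {"id": 4, "year": 2012, "transmission": "automatic", "body_style": "sedan"},
]


def select(cars, **criteria):
    """Keep the cars that match every criterion given; omitted criteria are ignored."""
    return [car for car in cars if all(car[key] == value for key, value in criteria.items())]


def branches(cars, levels, **criteria):
    """List the distinct branches of a fixed hierarchy that contain matching cars."""
    return sorted({tuple(car[level] for level in levels) for car in select(cars, **criteria)})


# Independent variables: one selection finds every convertible.
convertibles = select(CARS, body_style="convertible")
print([car["id"] for car in convertibles])

# The order in which criteria are applied does not change the result.
by_year_first = select(select(CARS, year=2008), body_style="convertible")
by_body_first = select(select(CARS, body_style="convertible"), year=2008)
print(by_year_first == by_body_first)

# Fixed hierarchy: the same convertibles are scattered across separate branches.
print(branches(CARS, ["year", "transmission", "body_style"], body_style="convertible"))
```

Even with four cars, the convertibles sit in two separate branches of the hierarchy. With a real inventory, the number of branches to visit multiplies with every level placed above body style.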

Where the subject matter itself is hierarchically organized, a hierarchy works. Autocatch’s home screen presents a hierarchical selection of locations for the user to choose from.

Autocatch's hierarchical location navigation.

Hierarchy works for navigation when the subject matter is naturally hierarchical.

This works for two reasons: locations have a natural hierarchy (Ottawa is physically inside Ontario), and used car buyers have a natural interest in the location of used cars. There are towns named Windsor and Woodstock in more than one province, but no one wants to look for used cars only in Windsor, Ontario and Windsor, Nova Scotia. The hierarchy works because it is natural, familiar, and useful. Autocatch uses hierarchy where hierarchy works, and independent variables where independent variables work.
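A small Python sketch shows why the hierarchy earns its place here. The listing counts are invented, and the nested dictionary is just one assumed way to model the province and town levels. The point is only that a town name on its own is ambiguous, while a path through the hierarchy is not.

```python
# Hypothetical listing counts, keyed by province and then by town.
LISTINGS = {
    "Ontario": {"Ottawa": 120, "Windsor": 45, "Woodstock": 12},
    "Nova Scotia": {"Windsor": 8},
    "New Brunswick": {"Woodstock": 5},
}


def provinces_with(town):
    """Return every province that has a town of this name."""
    return sorted(province for province, towns in LISTINGS.items() if town in towns)


def listings_in(province, town):
    """Follow the hierarchy down to exactly one place."""
    return LISTINGS[province][town]


# The town name alone matches more than one place.
print(provinces_with("Windsor"))

# The province-then-town path identifies a single place.
print(listings_in("Ontario", "Windsor"))
```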

WebMD is another example of a natural, familiar, and useful hierarchical approach, with their symptom checker:

The WebMD symptom checker

The WebMD symptom checker uses a natural, familiar, and useful hierarchical classification scheme to allow users to select appropriate content.

This works because it is the digital equivalent of a doctor asking a patient to point to where it hurts. (The fact that the selection is graphical is really important here, because it takes potentially unfamiliar terminology out of the mix.) A similar selection mechanism using a classification that was not natural or familiar to the user would not work. The user would be stumped by the first question.

Many examples of natural, familiar, useful classification schemes come from commercial sites. If anyone knows of good examples from tech comm sites, I’d be very grateful to hear about them. But commercial sites have a lot to teach us, because they live under a strict Darwinian selection process. Commercial sites show us which models work for customers because commercial sites that don’t work for customers quickly go out of business and disappear. (Tech comm’s unfortunate lack of exposure to the Darwinian forces of the marketplace is something I have commented on before.)

There is, I believe, just as much of a natural imperative in cases where the user does classify their experience as in those where they do not. If you attempt to impose a classification scheme on content where the user does not classify their experience, it is doomed to failure. But if the user does classify their experience in a particular field, it is pretty much essential that you provide navigation based on that classification scheme, and you are almost certainly doomed to failure if you don’t. (Imagine a used car site that did not organize cars by year, transmission, price, mileage, body style, exterior color, etc. How successful would it be?)

Part of the problem for tech writers in organizing content according to the classification schemas that users bring to the content, however, is that so many of those schemas consist not of a hierarchy but of multiple independent variables.

You can create a hierarchically organized content set with desktop authoring tools and present it as static web pages. But to enable the user to navigate according to multiple independent variables requires a real database on the back end. A CMS stuffed with content fragments is not going to cut it for this purpose. You need a genuine content database that can be reliably and swiftly queried at runtime in response to user inputs. That is not something that tech comm teams have a lot of experience designing and building. If you have built anything like this for your content, I would love to hear about it.
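As a rough sketch of what querying on independent variables looks like, here is a toy example using Python's built-in `sqlite3` module. The `topics` table, its facet columns (`product`, `version`, `doc_type`), and the sample rows are all assumptions made for illustration; a real content database would have its own schema and far more content. The query is assembled at runtime from whichever facets the reader happens to select, in any combination.

```python
import sqlite3

# Assumed facet columns; a real system would take these from its own schema.
FACETS = ("product", "version", "doc_type")


def build_query(selected):
    """Build a parameterized query from whichever facets were selected."""
    clauses, params = [], []
    for column, value in selected.items():
        if column not in FACETS:  # only known column names reach the SQL text
            raise ValueError(f"unknown facet: {column}")
        clauses.append(f"{column} = ?")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT title FROM topics WHERE {where} ORDER BY title", params


def find_topics(conn, **selected):
    """Run the facet query and return the matching topic titles."""
    sql, params = build_query(selected)
    return [row[0] for row in conn.execute(sql, params)]


# Hypothetical content set, held in an in-memory database for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE topics (title TEXT, product TEXT, version TEXT, doc_type TEXT)")
conn.executemany(
    "INSERT INTO topics VALUES (?, ?, ?, ?)",
    [
        ("Clearing a paper jam", "printer", "2.0", "troubleshooting"),
        ("Installing the driver", "printer", "2.0", "task"),
        ("Resetting the password", "router", "1.0", "task"),
        ("Error codes", "router", "1.0", "troubleshooting"),
    ],
)

# Any facet, or any combination of facets, can drive the selection.
print(find_topics(conn, doc_type="task"))
print(find_topics(conn, product="printer", doc_type="troubleshooting"))
```

No facet sits above any other in this design: the reader who starts from document type and the reader who starts from product both narrow the same set of topics.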

7 thoughts on “Sometimes, Readers Do Classify their Experience”

  1. Alex Knappe

    I’m not so sure you aren’t comparing apples with pears here, Mark. While users classify their search in a predetermined way when buying a car or searching for a movie, they don’t classify it consistently once you go into more detail.
    Actually I’m seeing three different levels of search behavior here.
    The first level is the level you described, it’s the level tech comm usually doesn’t get into contact with, as it simply doesn’t exist so much for us – with one exception.
    To clarify it, I’m talking about the level where users search for the outer circumstances: year, brand, version, color. For documents, all these criteria are already determined, except for the document type: operation manual, service manual, faq, troubleshooting guide…
    The level below that is comparable to your movie example: cast, general plot, awards
    For documents that is: TOC, overview, technical data
    Up to this point, we are talking about metadata, which users can predictably be expected to search on.
    The third level is the one you described in your last post. It is predictable only on a very vague basis, but it is the content that is the flesh of the subject.
    In terms of used cars this would be: was the car used on long or short trips in the past, did the driver smoke in the car, are there any hidden defects?
    In terms of movies: how good is the acting, is the plot consistent, does it match your preferences?

    If you want this information, you’re in the same position as with documentation: you cannot categorize it consistently. You don’t know if a car is comfortable or a movie is good by only looking at the metadata. You don’t know if a car is a wreck if you haven’t checked the engine compartment and taken a closer look. You don’t know if a movie is bad if you haven’t read some of the reviews with high and low ratings (actually most of the action movies I like have an imdb rating around 2.5).

    1. Mark Baker Post author

      Hi Alex,

      I’m not arguing that these forms of classification are particularly useful in tech comm. I’m simply acknowledging that indeed readers do sometimes classify their experience.

      The main point I am trying to establish is that one cannot extrapolate from the success of systems that exploit the classification that the user brings with them to conclude that the same system design based on the writer’s classification scheme will work just as well.

      “… the level where users search for the outer circumstances: year, brand, version, color. For documents, all these criteria are already determined, except for the document type: operation manual, service manual, faq, troubleshooting guide…”

      This is a perfect example of what I talked about in the last post, about fobbing the user off on the librarian. If the tech writer assumes that the reader is going to go through the kind of product selector you find on the sites of router or printer makers with hundreds of different models, and only gets to the content after selecting documents based on these criteria, then they can simply ignore the problem.

      But what if the reader simply Googles for an error message that has appeared on their printer and gets a hit on one of the documents in your system? How then do you establish if they are looking at the document for the right model?

      When you serve an audience that Googles when it gets stuck, you can’t fob off product selection on another function or system. Every Page is Page One.

      The implication here, to wander a little off topic, is that reusing texts like this in multiple documents is fundamentally the wrong approach. What you really want is to publish the text once and show that it applies to multiple models, and, where applicable, link it to the text that applies to other models.

  2. Jonatan Lundin

    You are discussing an interesting issue. Finding a classification scheme that works and feels “familiar” for users is our greatest challenge. SeSAM is a methodology that allows you to build classification schemes based on facets of a user search situation. The classification scheme can be seen as a content filter. As the user narrows down the search situation by doing facet selections, the possible answers are shown similar to a faceted navigation browser. We are building a search user interface built on these principles. I’ve touched upon aspects of working this way here: I can soon provide an updated white paper on SeSAM for those interested.

    But Mark, I believe that you are wrong (this applies to your previous post) when you equate a TOC with a classification scheme. A traditional TOC is often not a classification scheme, in the sense of a taxonomy, but a list of content themes.

    In a linear book fashion, each segment/fragment/block/page often has a thematic title that reveals and summarizes the content to help the user quickly understand what the segment/fragment/block/page is all about. A classical TOC aggregates all the thematic titles into a list. The title is often not a classification facet but just a summary of the content. Some TOCs are instead a meronomy, which reveals semantic part-whole relationships, such as product component structures.

    If you have a bunch of topics and build a taxonomical classification scheme based on the characteristics of the topic (size, author, publish date etc), that scheme can be seen as a TOC. But a traditional TOC is often not a classification scheme as you imply.

    Writers, information architects, and content strategists are often not clear whether what they call a TOC is a list of thematic titles, a meronomy, a taxonomy, or something else. They mix it up. You seem to mix it up. I sometimes do ;-). Many a seemingly nice TOC is a mix in which one node displays classification taxons and another node shows part-whole relationships, as in a meronomy.

    No wonder why readers have difficulties in finding content in what we call a TOC (or learning the TOC). A certain design pattern, used in a certain node, is often not repeated throughout other nodes in the TOC. Many readers do even have difficulties in finding content in a hierarchical list of semantic names, be it a taxonomy, a meronomy etc, even if the hierarchical list is unified and conforms to one design pattern principle. You provide an explanation in this post to why that is the case.

    A researcher called Taylor described in 1968 the search process a searcher travels through when trying to find the answers to an information need. He claimed that a searcher travels through a four step process and the last step is where the searcher has to compromise and express a query that suits the information system used, to get the information needed.

    When navigating in a TOC, the searcher must associate the expressed compromised label/query to a/each node to decide if the node has something to do with the label/query. Many searchers do have problems in expressing a query and associating it with a certain node due to a lack of knowledge of, for example the classification and how the label/query is semantically related to other terms.

    But a TOC, done as a classification scheme, may serve other purposes than provide a navigation interface. A TOC can enlighten the user once s/he has found the content they need to get a deeper semantic knowledge about a topic of interest (subject classifications). A TOC can signal the type of content a page contains to help the reader judge if the page contains the information sought for. I agree that there is a need to find better ways of providing the subject classification context for individual pages as well as how pages are related from different classification viewpoints.

    1. Mark Baker

      Hi Jonatan. Thanks for the comment.

      I’m not saying all TOCs are classification schemas. Most are not. What I am saying is that a TOC that is not a classification schema only works if it is short enough for the reader to scan quickly for the information they want (without any presumption about where in the TOC that information occurs). Any TOC that is too long to scan (which is what you get when you stuff multiple books into a help system) can only work if it is based on classification schema (meaning that the reader does not have to look through the whole thing, but can follow the hierarchy down to what they want).

      And I am further saying that that does not work either in the general case because the readers don’t classify their experience and so can’t use the classified TOC.

      In the follow-up post, Sometimes Readers Do Classify Their Experience (/2013/04/02/sometimes-readers-do-classify-their-experience/), I note that there are cases where readers do classify their experience, but that in many cases, these classifications involve multiple independent variables and can’t effectively be expressed hierarchically.

      Before the availability of database systems, organizing content based on multiple independent variables was very expensive, and so we tried to subdue classification under a hierarchy, a habit that we still seem to be overly attached to. But these hierarchies are accessible only to the cognoscenti, and the general public is now doing an end run around them with Google with great success.

      1. Jonatan Lundin

        Hi Mark,
        Thanks for the clarification. I fully agree with your statement that readers often are not capable of classifying their experience (though sometimes they are). Your statement “classify their experience” is another way of expressing a fact that has been known for almost 5 decades in the research community: humans have difficulty putting a label on their information need (state of uncertainty) and associating or relating a formulated query to semantically related subject areas.

        An information system (such as Google) requires you to put a label on your thoughts, and to formulate it to suit the information system at hand. This is sometimes to ask too much of the user, as s/he is using the information system to get clarification on an anomalous state of knowledge and eventually after using it, know the labels. But modern search engines do have algorithms that do associations and relate to your previous search habits, which improve the search experience such as finding the label from the given search results. The younger generation seems to be much more search literate than what we are in this respect.

        But I do not agree that a TOC “works if it is short enough for the reader to scan quickly for the information they want”. If you lack an ability to express your uncertainty, a short TOC is not going to work either. Furthermore, you seem to imply that readers who cannot classify their experience, and accordingly cannot use a long TOC, would have no problems finding what they want using a search engine. That interpretation (my interpretation?) is misleading, as they would probably have the same problems in expressing a proper query (as I previously mentioned) as they would navigating the TOC, since they lack a robust search vocabulary. But I agree that the reader would have somewhat better chances of finding what s/he is looking for using a search engine due to the nature of the engine.

        Finally, I would like to say that a TOC, done properly so that the same design pattern principle is followed throughout (be it a classification scheme, meronomy or just an aggregation of the thematic titles), works well for users who use the structure to navigate on a daily basis. A CMS user, for example, is such a user. Content can be organized around a TOC in a CMS, and if technical communicators have participated in the design discussion leading to the TOC or if they know the design pattern principles underlying the TOC, they should not have any problems in navigating such CMSs. My company has a CMS product that uses the TOC principle for content organization, and a unique way of navigating it through superfast expand and collapse principles.

        But, as we have concluded, a static arbitrary hierarchy of semantic labels is not working for end users who seldom use the information set and just want to find information fast (without learning the underlying principles), following the “principle of least effort”. Here a faceted navigation or faceted search approach is more suitable.

        1. Mark Baker Post author

          Jonatan, I didn’t say that readers are not capable of classifying their experience, I said that they don’t. To say they are not capable seems to suggest that classification is a proper or necessary thing to do and that to offer any other form of navigation is a concession to weakness. I don’t think that is the case at all.

          “If you lack an ability to express your uncertainty, a short TOC is not going to work either.”

          I disagree. In many cases we can recognize something even when we can’t express it. Of course, this is not going to work all the time, but if we define “work” as “work all the time” then there are no information systems that work. The only useful definition of “work” for information systems is that they attract repeat use. If users abandon them, they don’t work. If users come back to them, even if they are not always successful, they work.

          I do agree that people will learn the hierarchy of a system they use every day, provided their motivation is high enough (abandonment is not an option) and no workable alternatives are provided (search is ineffective, no system based on multiple independent variables is provided). But that is still like saying that I will dig myself out of prison with a spoon if no pick and shovel are provided.

  3. Audie

    Guess that’s why Autocatch is so successful in the business, with about 6 million visits a year!

