The other thing wrong with the DIKW pyramid

I took a side swipe at the DIKW (Data Information Knowledge Wisdom) pyramid the other day, and included a link to David Weinberger’s excellent debunking of it, which concludes:

The real problem with the DIKW pyramid is that it’s a pyramid. The image that knowledge (much less wisdom) results from applying finer-grained filters at each level, paints the wrong picture. That view is natural to the Information Age which has been all about filtering noise, reducing the flow to what is clean, clear and manageable. Knowledge is more creative, messier, harder won, and far more discontinuous.

That knowledge is “more creative, messier, harder won, and far more discontinuous” is very much at the heart of everything I have been saying in this blog, and in the book, about Every Page is Page One and Bottom-up Information Architecture. Hypertext does a far better job than hierarchy at making navigable the genuine messiness of content relationships.

That messiness is not a sign of bad content, but of content that accurately maps the interesting and essential messiness of the real world with which we must grapple every day. To tidy it up into neat rows and columns would be to make it false to the world it is supposed to describe. (None of which is to excuse content that is just plain messy, without reflecting the messiness of the world at all, except perhaps as an example of it.)

But there is another problem with the DIKW pyramid, and that is that it puts data at the bottom, as if data were the most basic stuff on which information, knowledge, and wisdom were built.

[Image: three turtles of varying sizes stacked on top of each other, with the largest at the bottom.]

Now I am not going to deal with information, knowledge, and wisdom here. I have said previously that it makes much more sense just to talk about stories. Given that, the question becomes: which comes first, the data or the stories?

Per the DIKW pyramid, data comes first and stories (information, knowledge, and wisdom) are built from data. But there is a huge problem with that. Without stories, there is no way to know what the data means, or even what it is.

For example:

34, 8, 94, sbkrt, 9.0003

There is some data. Can you derive a story from it? Of course not. You can’t derive a story from it because by itself it is just numbers and strings of characters. Until you know what those numbers and characters represent, what there are 8 or 94 units of for instance, the data is meaningless (and therefore, arguably, not data).

So you have to put the data in context before it can be interpreted in any way, before you can derive a story from it (before, arguably, you can call it data at all). And how do you do that? There is only one way: you have to tell a story that explains where the numbers and letters come from and what they represent. Story, not data, then, is the thing at the base of the pyramid, and also the thing at the top of the pyramid.

Which is to say, of course, that there is no pyramid. Information cannot be built out of data because without information on the source and meaning of the data — the story of the data — no information can be gleaned from it.

Data, then, is simply the extension, formalization, and tabularization of elements of a story, and it remains useful only in the context of the story from which it is extracted. Subtract the story, and the data becomes noise. Noise is data minus story.

This is why every database must have its data dictionary. Without the data dictionary to tell the story of the data, you are just storing noise, and cannot create a story from it.
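To make the point concrete, here is a minimal Python sketch that reads the same five values from the example above, once without a data dictionary and once with one. The field names and units are entirely hypothetical, invented for illustration; nothing in the values themselves says what they mean, which is exactly the point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """One data dictionary entry: the 'story' behind one column."""
    name: str
    unit: str


# The raw values from the example above.
RAW_ROW = [34, 8, 94, "sbkrt", 9.0003]

# HYPOTHETICAL data dictionary. These meanings are made up for this sketch;
# a different dictionary would turn the same values into a different story.
HYPOTHETICAL_DICTIONARY = [
    FieldDefinition("patient_age", "years"),
    FieldDefinition("clinic_visits", "visits"),
    FieldDefinition("body_weight", "kg"),
    FieldDefinition("clinic_code", ""),
    FieldDefinition("dose", "mg"),
]


def describe(row, dictionary=None):
    """Return one string per value, labeled only if a dictionary is given."""
    if dictionary is None:
        # No story: all we can do is echo the bare values back.
        return [repr(value) for value in row]
    if len(row) != len(dictionary):
        raise ValueError("row and data dictionary must have the same length")
    return [
        f"{field.name}: {value} {field.unit}".strip()
        for value, field in zip(row, dictionary)
    ]


if __name__ == "__main__":
    print(describe(RAW_ROW))                           # bare values, no meaning
    print(describe(RAW_ROW, HYPOTHETICAL_DICTIONARY))  # values with their story
```

Swap in a different dictionary (inventory counts, say, instead of clinic records) and the same row tells a completely different story, without a single value changing.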

One of the common tropes of the mystery genre is the reconstruction of the story behind a partial message, usually either a fragment recovered from a fire or an incomplete message written in the dying victim’s blood. The detectives try on different stories to make sense of the data, each leading them to the discovery of new stories, which eventually suggest the right story behind the unexplained clue. (G. K. Chesterton’s story The Honour of Israel Gow brilliantly parodies this trope.)

Once the true story behind the data is discovered, it is then possible to derive a new story about who committed the crime. But that new story does not come from the unvarnished data, but from the story behind the data. If there is a pyramid, it is stories at the bottom, in the middle, and at the top. It is stories, like turtles, all the way down.

Pulling data out of stories and tabularizing and sanitizing it can allow us to perform many very useful functions, and is foundational to the information age. But as soon as the data comes adrift from its data dictionary — from the story from which it arose and to which it belongs — it becomes mere noise and the machinery of the information age grinds to a halt.

 


4 Responses to The other thing wrong with the DIKW pyramid

  1. Alex Knappe 2015/07/28 at 11:16 #

    Hi Mark,
    I followed the last few posts and always had a strong feeling that you were not really getting down to the point.
    Stories, you say, are what make up data, taxonomy and terminology. I would say that this is only part of the truth.
    Stories, data and whatnot are based on emotions paired to them.
    These emotions mostly consist of pictures, but also feelings, smells, noises and everything else we are able to experience.
    This is how our brain works, and it is also what we make use of when our language fails to do its job.
    Let’s take the assembly instructions of IKEA, for example. They work (mostly) for everybody, without using any language at all.
    They are arranged to form some sort of story, but each picture will also work on its own.
    Even if you would say that each picture already includes its own story, it would still be the picture forming that story in your own mind.
    But this is not a one-way trip. Stories, too, will form some picture in your mind or trigger some emotion.
    That is because those stories already derive from pictures or emotions, which you combine into a new picture or emotion in your mind.
    I’ll give an example:
    When I say “red”, you won’t immediately associate a story with it – but you will have a certain nuance of the color red before your inner eye.
    That nuance varies individually (you do have a point there) – but in general you can say that it will be about the same for everyone.
    While this is a mono-colored picture, other words might already set up a whole story for you and a more diverse picture: “rainbow”.
    For most people this will bring a picture of a rainbow before their inner eye, plus an idealized surrounding landscape, and even bring up other feelings and emotions.
    Some will think about the myth of a golden treasure linked to it, others will smell the rain, and again others will have completely different associations.
    If you think about it for a moment, you could write a book about all the things that come to your mind in that single moment, when you read this word.
    This is where the saying “A picture tells more than a thousand tales” kicks in.
    Ambiguity of (written) language is best countered by a picture.
    Let’s say an Inuit were in a position to write some instructions about a specific type of snow in his mother tongue.
    This set of instructions would then have to be translated into other languages. For most languages you’d get some very ambiguous results, for some no result at all.
    It wouldn’t be much of a problem if those instructions embedded a set of pictures of the snow, so that the reader could form a picture of it himself.
    If you translate this back to the discussion, it is not so important to establish a story – it is much more important to draw the right picture for the reader.
    Be it by the means of telling a story, or by using pictures directly, or whatever means we can think of.
    It is not important how we achieve it; it is important that we achieve it.

    • Mark Baker 2015/07/31 at 16:32 #

      Thanks for the comment, Alex.

      Agreed, stories produce responses in the mind. That is their job. I’m not sure that I would grant that those responses are always emotions, but then I suspect we might get bogged down in what constitutes an emotion.

      Pictures can certainly resolve some kinds of ambiguity, though not all. They can also be ambiguous in their own right.

      You can certainly tell stories with pictures. And I think that the same is true of the symbols you use and the images you depict: they can have many connotations but form meaning when juxtaposed in a certain way in a picture or sequence of pictures.

    • Scott 2015/09/04 at 17:21 #

      Alex, I think what you’re driving at still leads us back to stories.

      When you speak of drawing pictures, you’re telling a story, but it suffers the same limitations as written language: images are just as steeped in cultural context as words are, and you’re hoping that signifier and signified line up.

      You mention “red”. The word “red”, with no context, has no meaning. The colour red, with no context, also has no meaning. But if you presented a red envelope to a thousand Chinese people and to a thousand Germans, you’d see the number of smiles on the Chinese side dramatically outweigh the number on the German side.

      This isn’t because the Chinese have an innate reaction to the colour red that the Germans don’t, but because they’ve created the story “red envelope = money gift”.

      Without that story, we don’t know how to approach almost any constructed sign. We have fight-or-flight reactions to a lot of natural stimuli, but human communication is by far dominated by our constructs, not our natural environment.

      Images are just another form of data that holds little value until we’ve attached our stories to them. Those stories aren’t universal, and we must respect the work required to craft visuals for our audiences just as we do written communication. This includes putting the story first.

Trackbacks/Pingbacks

  1. Story and data: yin and yang | Leading Technical Communication - 2015/07/31

    […] some background: Mark Baker recently wrote a piece in which he rightly derided the DIKW (data, information, knowledge, wisdom) pyramid as a model for communication. He pointed out that pure data, without a story to give it context, […]