I took a side swipe at the DIKW (Data Information Knowledge Wisdom) pyramid the other day, and included a link to David Weinberger’s excellent debunking of it, which concludes:
The real problem with the DIKW pyramid is that it’s a pyramid. The image that knowledge (much less wisdom) results from applying finer-grained filters at each level, paints the wrong picture. That view is natural to the Information Age which has been all about filtering noise, reducing the flow to what is clean, clear and manageable. Knowledge is more creative, messier, harder won, and far more discontinuous.
That knowledge is “more creative, messier, harder won, and far more discontinuous” is very much at the heart of everything I have been saying in this blog, and in the book, about Every Page is Page One and Bottom-up Information Architecture. Hypertext does a far better job than hierarchy at making navigable the genuine messiness of content relationships.
That messiness is not a sign of bad content, but of content that accurately maps the interesting and essential messiness of the real world with which we must grapple every day. To tidy it up into neat rows and columns would be to make it false to the world it is supposed to describe. (None of which is to excuse content that is just plain messy, without reflecting the messiness of the world at all, except perhaps as an example of it.)
But there is another problem with the DIKW pyramid, and that is that it puts data at the bottom, as if data were the most basic stuff on which information, knowledge, and wisdom were built.
Now I am not going to deal with information, knowledge, and wisdom here. I have said previously that it makes much more sense just to talk about stories. Given that, the question becomes: which comes first, the data or the stories?
Per the DIKW pyramid, data comes first and stories (information, knowledge, and wisdom) are built from data. But there is a huge problem with that. Without stories, there is no way to know what the data means, or even what it is.
34, 8, 94, sbkrt, 9.0003
There is some data. Can you derive a story from it? Of course not. You can’t derive a story from it because by itself it is just numbers and strings of characters. Until you know what those numbers and characters represent (what there are 8 or 94 units of, for instance), the data is meaningless, and therefore, arguably, not data.
So you have to put the data in context before it can be interpreted in any way, before you can derive a story from it (before, arguably, you can call it data at all). And how do you do that? There is only one way: you have to tell a story that explains where the numbers and letters come from and what they represent. Story, not data, then, is the thing at the base of the pyramid, and also the thing at the top of the pyramid.
Which is to say, of course, that there is no pyramid. Information cannot be built out of data because without information on the source and meaning of the data — the story of the data — no information can be gleaned from it.
Data, then, is simply the extension, formalization, and tabularization of elements of a story, and it remains useful only in the context of the story from which it is extracted. Subtract the story, and the data becomes noise. Noise is data minus story.
This is why every database must have its data dictionary. Without the data dictionary to tell the story of the data, you are just storing noise, and cannot create a story from it.
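To make that concrete, here is a minimal sketch in Python. It takes the sample values from above and reads them through two different data dictionaries. Both dictionaries, and every field name and meaning in them, are invented purely for illustration; I have no idea what those values "really" are, which is rather the point.

```python
# The raw values from the example above, with no story attached.
row = [34, 8, 94, "sbkrt", 9.0003]

# Two HYPOTHETICAL data dictionaries. Every field name and meaning
# here is made up for illustration; neither comes from a real system.
warehouse_dictionary = [
    ("aisle", "aisle number"),
    ("shelf", "shelf number"),
    ("quantity", "units in stock"),
    ("sku", "product code"),
    ("unit_weight_kg", "weight per unit, in kilograms"),
]

weather_dictionary = [
    ("temperature_c", "air temperature, in degrees Celsius"),
    ("wind_kmh", "wind speed, in kilometres per hour"),
    ("humidity_pct", "relative humidity, in percent"),
    ("station_id", "reporting station identifier"),
    ("rainfall_mm", "rainfall, in millimetres"),
]


def describe(values, dictionary):
    """Pair each raw value with the name and meaning the dictionary gives it."""
    return {
        name: (value, meaning)
        for value, (name, meaning) in zip(values, dictionary)
    }


# The same values tell two unrelated stories.
as_stock = describe(row, warehouse_dictionary)
as_weather = describe(row, weather_dictionary)

for reading in (as_stock, as_weather):
    for name, (value, meaning) in reading.items():
        print(f"{name} = {value!r} ({meaning})")
    print()
```

The values never change between the two readings. Only the dictionary changes, and with it everything the row appears to say. Take both dictionaries away and you are back to five unexplained tokens.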
One of the common tropes of the mystery genre is the reconstruction of the story behind a partial message, usually either a fragment recovered from a fire or an incomplete message written in the victim’s dying blood. The detectives try on different stories to make sense of the data, each leading them to the discovery of new stories, which eventually suggest the right story behind the unexplained clue. (GK Chesterton’s story The Honor of Israel Gow brilliantly parodies this trope.)
Once the true story behind the data is discovered, it is then possible to derive a new story about who committed the crime. That new story, however, comes not from the unvarnished data but from the story behind the data. If there is a pyramid, it is stories at the bottom, in the middle, and at the top. It is stories, like turtles, all the way down.
Pulling data out of stories and tabularizing and sanitizing it can allow us to perform many very useful functions, and is foundational to the information age. But as soon as the data comes adrift from its data dictionary — from the story from which it arose and to which it belongs — it becomes mere noise and the machinery of the information age grinds to a halt.