  1. cud 2015/09/28 at 14:40 #

    As always, interesting. Am I right in thinking you’re leaning toward a Cabalistic view of the Star of David, where two pyramids — inversions of each other — indicate the transcendence of matter into spirit, and the distillation of spirit into matter… A kind of bi-directional flow of mysterious forces? Not sure if that’s what you’re going for. But I’m very much open to considering knowledge and information in that light.

    I always like to scan for an inflection point in articles like this — one I found is this:
    “…data is precariously balanced on a pyramid of stories, doomed to become noise, or at least to be misinterpreted, by anyone who does not know the exact set of stories that explains it…”

    I’m afraid I can’t follow there because there is no exact set of stories. The point of creativity is unique combination of parts. The point of text (be it mathematical or linguistic) is unique combination of sub-texts. In your parlance I believe this would be the creation of unique stories, something that’s often possible without needing to add new texts (re-use). This property of text is responsible for our technological progress.

    • Mark Baker 2015/09/29 at 08:22 #

      Thanks for the comment, Cud.

      The star of David idea is interesting, but for the moment at least I’m not going there. I’m going for straight inversion of the pyramid. Data is a distillation from stories, and you have to be very wise to extract data from stories reliably, and also very wise to interpret data reliably. Stories come from experience, not from data.

      What I think the DIKW pyramid might be getting at is that wisdom disciplines experience. If we regard experience as producing data, wisdom recognizes that sometimes the experience is misleading and produces incorrect data. So raw data does not mean much until you run it through the process of turning it into information, knowledge, and wisdom.

      The problem with this view, though, is that if data is unreliable, the wisdom that detects that it is unreliable cannot be derived from the data. Garbage in, garbage out. Our suspicion that our data is unreliable does not come from anomalies in the data, because no data is anomalous in itself. It is anomalous with respect to a story. Our suspicion that data may be anomalous arises from inconsistencies between the stories we tell about experience. Data is something we pull out of stories to examine them more rigorously to try to resolve the inconsistencies.

      This, of course, gets us into deep epistemological waters. Whence our ability to detect inconsistencies between stories? Is reason prior to experience?

      But those issues are beyond the scope of this blog. My concern here is the primary role of stories in communication, and the limits of compatibility between the story world of the writer and the data world of the content manager.

      You have a point about there being no exact set of stories, but I’m not sure I see how that is consistent with what you say about creativity and reuse. If there is a finite set of texts and it is possible to be creative by combining existing texts in new ways (reuse) then there is an exact set of resulting texts that can be mathematically circumscribed. But does text = story?

      The story I have been telling about stories over the past several posts is that stories are told by tacit reference to other stories, that language itself is simply a collection of words and phrases that are tacitly understood to refer to stories. And the difficulty of this is that people in different domains tacitly associate different stories with the same words, making misunderstanding easy, but difficult to detect. This means that the same story has to be told differently in different domains, but also that the writer can never detect the domain of understanding of the reader inerrantly, and so the reader has to take part of the responsibility for detecting the domain mismatch between themselves and the author, and then either finding a different author or learning the stories of the author’s domain.

      All of which is to say that the same text does not tell the same story in all domains, and that therefore reuse of a text across domains is not the right way to tell the same story to different audiences. Which is something writers have always understood, but content managers seem reluctant to acknowledge.

      • cud 2015/09/29 at 10:05 #

        Well, assuming we know exactly what a story is, then I suppose for any set of text you could say there is an exact possible set of stories. Personally I don’t think we know that much about stories. And anyway, the set of possible stories that can be pieced together from a rich body of text is bound to be huge. Further, since story depends on cultural context, and since culture constantly drifts and changes, then the set of discoverable stories would tend to drift as well.

        Drifting into drivel myself, let me pull this back… I think there is value in managing content such that you can discover units of the text that can be combined into unique stories. Yes, this is what authors have always done. Adding content management, if done correctly, just makes it easier for authors to carry on.

        The dangerous tendency is to assume that by adding in content management, some magic will “occur” that suddenly improves quality and shortens the cycle. I’ve seen lots of departments make this mistake — I think we agree on this point.

        And sadly, I really can’t agree that story comes first. I think it’s possible that we have lived so long with stories that it no longer matters which came first. But if you enter a new domain the first thing you have to do is collect data, and then see if you can piece together a story. If it’s scientific method, you piece together a hypothesis. You then design a method to collect more data (an experiment), and test that data against your story. If the story stands, you have a theory. But without any data (say, the positions of the planets), you can’t make the hypothesis. We lived for centuries with the story of a universe that revolves around our world. The data was always there; it was reliable. What was faulty was the story… And people died trying to make that point.

        And what is reliable data anyway? If data == recorded observation, then reliability is a function of our instrumentation (or senses), and our honesty in recording the observations. Talk to engineers and you’ll find it’s not easy to prove reliability of data — I’ll grant you that. But engineers are willing to settle on a reasonable tolerance and then get work done. I’ll go out on a limb and assert that it’s easier to prove reliability of data than to prove reliability of a story. If for no other reason, you have to prove reliability of data AND story in order to prove a story. I’m guessing that is one foundation of the scientific method.

        • Mark Baker 2015/09/30 at 11:38 #

          Well, so much lies in “if done correctly” doesn’t it? My point is not that content management is bad — I trust I made that clear — my point is that there is a conflict between the writerly view of “done correctly” and the database view of “done correctly” that is very difficult to resolve.

          I think we must mean something different by “data”. Sense impressions are not data. Nor does the brain resolve sense impressions into data, but into stories. Thus when we see a mirage, the brain resolves the sense impression into a story it understands: water on the road ahead. It is only after repeatedly not finding water where we thought we saw it that we realize the need to reconcile conflicting stories and go looking for data on the refraction of light.

          Nor can I agree that hypotheses start with a bunch of data points. They start with experiences, which we interpret to ourselves as stories. When ancient astronomers observed the heavens, they discovered a story: certain stars wander through the heavens rather than staying in one place. That observation is a story, not a data point. A hypothesis is a story that attempts to explain the stories that we tell ourselves about our experiences. Data is something we purify out of stories to test the hypothesis. Having observed that the planets are wandering stars, astronomers began to measure their wanderings to attempt to explain them.

          As for proving the reliability of data, you are quite correct that no proof can ever be absolute. We have to settle for reasonable tolerance and get work done. But what is the proof of the reliability of the data? It is a story. It is a story of how the data was obtained, under what circumstances, and by what method. Data with zero reliability is not data, it is noise. Data without a story is not data, therefore, but noise. Stories, in other words, create data.

          The analysis of data can indeed lead to new hypotheses, and therefore to the formation of new stories, iteratively building up a system of knowledge. But it is a mistake to interpret this iterative process as originating in data. As the problem of demonstrating the reliability of data demonstrates, there is no data without story. The iterative process of knowledge building, therefore, begins with stories, not data.

          • cud 2015/09/30 at 12:21 #

            I think we mean something different by “data”. Our stories don’t agree. 🙂


