tl;dr: We can apply engineering methods to content development, but we do not have the body of proven algorithms or known-good data to justify formal certification of communication professionals the way we have for doctors and engineers.
We talk about content engineering. I call myself a content engineer sometimes. But can content really be engineered? Is content engineering engineering in the same way that engineering a bridge is engineering, or only engineering by analogy?
This post is prompted by a fascinating conversation with Rob Hanna and others at the monthly STC Toronto Networking Lunch. The conversation morphed into something I think I can fairly characterize as: is there a uniform methodology to technical communication, one that can form the basis of a curriculum, a certification, or a toolset, or is there a legitimate diversity of approaches, roles, methods, and tools?
Let’s clear one thing up first: When people talk about content engineering what they often mean is engineering publishing and content management. There is no doubt in my mind that publishing and content management can be engineered. The question is, can they be engineered without doing violence to content quality and burdening authors with tasks and requirements they should not have to bear? Ultimately I think the answer to that question is yes, though I don’t think many systems achieve that today. But actually engineering content — that is, making content itself better through an engineered process — is something different.
Engineering and imagination
We could make a very broad-brush division of human activity into works of engineering and works of imagination. Works of engineering would constitute all activities based on proven algorithms and known-good data, and works of imagination (whatever imagination may be) would constitute all the rest.
But we can immediately see that this is a false dichotomy. The ideal engineering solution certainly involves proven algorithms and known-good data, but these are hard to come by. And the way we get to them is not by a pure act of imagination, like Athena springing from the head of Zeus. Instead, we use imperfectly proven algorithms running on hoped-good data, and we study the results with the aim of systematically improving the quality of our algorithms and our data.
And that, indeed, would be a far better description of what engineering is: the process of systematically improving our algorithms and data. And that process itself is carried out in an engineering fashion, using the best available algorithms and the best available data to validate our algorithms and our data.
Which is not to say that imagination is absent from engineering, because the process of improvement requires some input and impetus which must come from some place other than the algorithms and data that we already have: imagination. In addition, each engineering problem has unique aspects to it for which we have no certain data or algorithm and which must be addressed by imagination, informed, of course, by the best data and algorithms that we do have. (But discerning how these apply to a unique situation is a matter for the imagination.)
Certifying proven practices
As certainty in our algorithms and our data grows, we can begin to codify knowledge and practices, and then to educate, certify, and license practitioners based on their grasp of this knowledge and these practices. This is the kind of education, certification, and licensing that we require of civil engineers and doctors. But we don’t require it of software developers, by and large, or of technical writers.
Is this mere omission, crying out to be rectified, or are there valid reasons to require certification and licensing in some fields and not in others?
There are commercial considerations, of course. Certification and licensing are attractive to some people employed in a field because they hope that erecting barriers to entry will raise salaries and improve job prospects for people with the certification. Others in these fields oppose certification because they do not want to be burdened with certification requirements or bound to standards of practice that they may not agree with. This is very much the case with the debate over certification in technical communication today.
When is certification in the public interest?
But certification that exists merely to raise the salaries of practitioners is not in the public interest, as it raises prices without raising quality or improving safety. The social justification for standardization of practice and for certification and licensing of practitioners is that it improves outcomes sufficiently to compensate for the raised costs. The high cost of engineers and doctors is a price society is willing to pay for buildings that don’t fall down and surgeries that don’t kill more people than they cure.
To be socially justified, standardization of practice and certification and licensing of practitioners in technical communication, or the content professions generally, would have to show a verifiable and consistent superiority of a given method compared to all others. And the simple fact of the matter is that there is no current method that can do that.
Algorithms and data
It comes down to algorithms and data. Engineering is a matter of continually refining algorithms and data through a process that is itself based on algorithms and data. At a certain stage in this process, it is better to let a thousand flowers bloom. We don’t yet have a sufficiently well-defined algorithm or sufficiently good data to select any one algorithm or data set as either the objectively most correct or the pragmatically most likely to produce the best results over time if we abandoned all the rest and focused everyone on refining it. Until we get to that point, prematurely cutting off the other flowers is more likely to doom us to an inferior path than it is to foster greater progress.
As things progress, certain paths may prove to be dead ends. At a certain point a few paths, or one path, will be so strongly supported by evidence that there is a very clear social benefit in standardizing on that path and in training and certifying practitioners who follow that path. We have very clearly reached that point in civil engineering and medicine.
This is not to say that there are no outliers in these fields. There are doctors and engineers who have been trained and certified and have then chosen to examine other paths. There are people who are not trained or certified but have struck out on their own. Most of these efforts will come to nothing, of course, but sometimes they can add something new. Very very occasionally, they can spark a revolution and put an entire profession on a new track.
Two models of a profession
This suggests two fundamental models for professions, based on where they are in this process of refinement. For those whose algorithms and data are sufficiently refined and proven, a broad certified core of practitioners and a small cadre of outliers returns the greatest social value and the greatest promise of continued growth. For those whose algorithms and data are less certain, though, greater social value and the greatest promise of continued growth will come from letting a thousand flowers bloom. Premature selection and concentration can only do harm. Premature certification is a social evil that limits growth.
Technical communication, and the content arts generally, seem to me to be very firmly in the second camp. They are not alone. Software development methodology and business process improvement are also in this camp, with new theories and approaches being developed and tried on a regular basis. (Check out the debate among tech entrepreneurs about the value of MBAs.)
It may well be that certain professions will graduate from the second camp to the first (as medicine and civil engineering did in the past) while others never will. I expect that technical communication and the content arts generally will never graduate, and I will explain why in a minute. But whether we expect to graduate or not, until we do, we still need to let a thousand flowers bloom and not try to prematurely close off alternative approaches until we can prove our algorithms and our data with much greater certainty than we do now.
Engineering and a thousand flowers
Letting a thousand flowers bloom does not mean abandoning engineering or engineering methods. It does not mean that we should stop developing algorithms, gathering data, and rigorously testing the results. These are the flowers. If we stop doing these things, we don’t have a thousand flowers, we have none. Not every method we try will work. Not every method that works for one situation will work for the next. Not every method that one team uses successfully will be used successfully by the next. But on balance, applying engineering techniques will make our individual projects better, even if they are not provably universal in application or globally superior in outcomes.
And of course, individual organizations cannot afford to let a thousand flowers bloom internally. There is an overhead to each flower that requires that the number of approaches within an organization be limited. This does not mean they can or should be limited to one, however. The fact is, few engineered approaches to content are universal in scope, and the more universal they attempt to become, the more overhead they tend to have. Letting different content groups with different aims use their own tools and processes is often the lesser of two evils compared to trying to centralize all content production in one complex training-heavy system.
But the issue here is not whether a thousand flowers should bloom in one organization. It is whether a thousand flowers should bloom across the content producing arts. And the fact here is that not only should they, but they certainly will, for there is no method or tool that has achieved anything resembling universal acceptance. Nor has any one of them shown itself to be universal in scope. Different tools and techniques are better at different things.
Certification in tech comm is often compared to certification in project management, where the PMP certification has received widespread industry acceptance. I don’t know enough about project management to judge if it has reached the level of engineering knowledge of civil engineering or medicine, but it is clear that project management is a much narrower field than content creation and that, crucially, it is well positioned to gather useful performance data against which to validate its methods.
Validating a body of knowledge
A profession has a body of knowledge. A body of knowledge consists, essentially, of its proven algorithms and known-good data. A collection of books and articles, however large or well regarded, is not a body of knowledge without proven algorithms and known-good data.
Trying to define a professional body of knowledge, or to certify practitioners based on that body of knowledge, has no societal value until your ability to validate it reaches the tipping point where the social value of standardization outweighs the social value of letting a thousand flowers bloom. None of the content arts is at this point. I doubt they ever will be.
The triumph of tacit knowledge
Here’s why: In the communication arts, some of the greatest practitioners in all fields have been people with no formal training or method. Some of the best novels are written by people who never took a creative writing course. Some of the best manuals and technical books were written by people who had never heard of technical writing, let alone taken a course in it, nor read a book on cognitive psychology, or even cracked a style guide. All of these things may make some people better, but it is still possible to practice that art at a high level without any formal knowledge, based on tacit knowledge acquired over years of teaching, reading, writing, and practicing the subject matter you write about.
The limits of tacit knowledge
This same kind of native practice is just not possible in civil engineering or in medicine. Most of us can build very basic structures and treat simple cuts and colds without formal training, but we can’t build great buildings or cure serious ailments without significant formal knowledge of the properties of materials and structures, or of the operations of the body and the effects of drugs on it. Such knowledge cannot be acquired tacitly. It requires careful measurement, exact recording, and formal methods of extrapolation and application. In these professions there is a demonstrable and impenetrable ceiling which no one can pass without formal study, and beyond which no one should be allowed to venture without professional certification.
This simply is not true for content. Whatever benefits content engineering may bring (and it brings many) it is not a prerequisite for peak performance. It may enable many people to perform far better than they otherwise would, but some people can and do produce great works without formal study or knowledge or method.
It may be that all the great works of communication of the past merely represent the ceiling of what tacit knowledge can achieve, and that in the future codified communication methods will produce far greater works. But getting there requires proven algorithms and known-good data, and that is tough.
The difficulty of proving communication algorithms
The biggest problem for formalizing methods in communication is that the algorithms are fiendishly hard to define and validate, and the data is fiendishly hard to gather, validate, and interpret. Unless we can make progress on these fronts, we can’t move closer to the point at which we can choose which of a thousand flowers will become our single tree.
Why is it so hard? First, the algorithm for communicating is very hard to define. What facts does one present in what order and with which words to convey an idea to an individual reader? It is impossible to pin down because every reader has a different background, and a different goal in mind as they read. And even the same reader coming back to the content a week later will be different because of what they have learned and forgotten in that week, and how their task has changed since the last time they read.
The hardness of concrete of a given formulation does not change from one building to another or one day to the next. The effect of a drug on a body system may vary within a range between individuals, but only within a range, and probably not from one day to the next.
And those properties are easier to measure. Does a particular block of concrete bear a particular load? You can measure that. Does a particular drug effect a particular cure? You can measure that. Does a particular article enable all users to complete a particular task? You will have a very hard time measuring that. And if you could, what would you be measuring? What quality of the article produced the result you measured?
Not only is it that much harder to measure, but there are many more users and many more tasks than there are types of building materials or drugs and organs. Indeed, the variety of users and tasks is so great that it is hard to get enough data points to validate an algorithm. A/B testing can tell you that one formulation works better than another for a particular application (if you have a way to measure successful task completion, which you usually don’t), but that does not prove either choice best, nor does it generalize the algorithm. Worse, in some cases what worked in the past may cease working in the future as the audience changes. (This is particularly true of content marketing, where people quickly get wise to the latest gimmick.) You can certainly generalize and recognize patterns in what works and what does not, but such generalizations are inherently less certain than a specific load-bearing number for a specific building material.
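The arithmetic of an A/B comparison, at least, is well understood. The sketch below, using invented task-completion counts purely for illustration, runs a standard two-proportion z-test on two versions of an article. Note what it does and does not tell you: it can say that version B did better in this sample, but nothing about why, and nothing about whether the result generalizes beyond this audience and this task.

```python
import math

def two_proportion_z(success_a, total_a, success_b, total_b):
    """Two-proportion z-test: is the difference in task-completion
    rates between version A and version B statistically significant?"""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled completion rate under the null hypothesis of no difference.
    p = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(p * (1 - p) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Invented numbers: 180 of 300 users completed the task with article A,
# 210 of 300 with article B.
z, p = two_proportion_z(180, 300, 210, 300)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Even when the test comes out significant, it only ranks two candidates you already imagined; it does not identify which quality of the winning article produced the result.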
Secondly, it is very hard to measure the actual effect of content. For technical content the measure is: did the user act correctly? But in the real world, not only are we seldom in a position to measure whether the act was correct, we are seldom in a position even to know what the intended act was. We can set up artificial lab conditions in which we define the act, but while this may tell us something useful, there are all kinds of uncertainties involved in this kind of measurement.
Third, it is hard to measure the effect of a content production rule or procedure on the writer who produces the content. No matter how much structure we wrap around the process, in the end a writer is a storyteller. How do we measure whether the structure and rules we enforced led them to tell the optimal story? And when they fail to tell the right story, how do we detect that they have failed or why they have failed?
If we can’t measure these things consistently and accurately, we can’t refine the algorithm. And in many cases, we simply can’t. We can observe that all sorts of things worked better under certain local conditions, but understanding exactly why and formulating it as an algorithm that will ensure equal success in all other circumstances simply eludes us.
Good actionable knowledge
This is not to say that there isn’t good actionable knowledge about technical communication, content strategy, and the content arts. There is lots of it. I hope I have been able to make a small contribution to it. The thousand flowers are growing and being well tended. It’s simply that we don’t have the data to prove any one of those flowers to be the one true tree (if, indeed, there actually is one true tree).
Certification in individual methods
This does not mean that there is no role for standardization and certification, or even licensing. Any one of the flowers is an engineering method, and you can standardize a method and certify people as practitioners in the current state of that method. For instance, Agile is one of several competing software development methodologies. Scrum is one of several competing methods for organizing work in an Agile environment. You can be certified as a Scrum Master, demonstrating that you know how to do Scrum in the present state of the art.
But what you can’t reasonably do is bundle up a dozen of the more popular flowers of the moment, wrap a certification around the bundle, and call it a certification for the profession as a whole. You can reasonably say that you need to be a Certified Scrum Master to be a Scrum Master, but not that you need to be a Certified Scrum Master to be a software developer. This has nothing to do with whether such a certification validly measures the current state of the art for a professional technique. It is simply a matter of not being able to demonstrate that the techniques in your bundle are objectively better than those outside of it.
Why bother trying to engineer content then?
If imagination still plays so large a role in content tasks, and if engineering methods are so hard to prove, and therefore to advance, why bother with content engineering at all?
For one thing, even if engineering methods could tell us nothing at all about effectiveness, they can still be used to improve the consistency of what imagination produces. If imagination determines that a certain pattern is the correct way to write a certain topic, then engineering can help ensure that that pattern is followed consistently.
Secondly, until we can achieve a measure of consistency in how we execute the work of imagination, we will not be able to measure anything usefully, and therefore will not be able to assess the value of any algorithm that our imaginations may devise.
The value of patterns
At this stage of development (which may be our permanent state) I believe patterns are tremendously important. A pattern is far from having the formality and rigor of an algorithm, but it can do a lot to improve quality and completeness and can form the basis of a repeatable measurement.
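A pattern can be enforced mechanically even where it falls short of a full algorithm. As a toy illustration, with the pattern and section names invented for the example, a script can check that a task topic contains its expected sections in the expected order:

```python
# Hypothetical pattern for a task topic: these section names are
# invented for illustration, not drawn from any real standard.
TASK_PATTERN = ["Context", "Prerequisites", "Steps", "Result"]

def check_pattern(section_headings, pattern=TASK_PATTERN):
    """Return a list of problems: missing or out-of-order sections."""
    problems = []
    # Record where each expected section actually appears.
    positions = {}
    for i, heading in enumerate(section_headings):
        if heading in pattern:
            positions[heading] = i
    for name in pattern:
        if name not in positions:
            problems.append(f"missing section: {name}")
    # The sections that are present must appear in pattern order.
    found = [name for name in pattern if name in positions]
    order = [positions[name] for name in found]
    if order != sorted(order):
        problems.append("sections out of order")
    return problems

print(check_pattern(["Context", "Steps", "Prerequisites", "Result"]))
```

A check like this says nothing about whether the prose in each section is any good; imagination decided the pattern, and imagination fills it. What the check buys us is consistency, which is itself the precondition for repeatable measurement.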
Hollywood does not have a particularly high success rate in developing new works of imagination. When it does get a hit, it tries to reproduce the pattern in as many ways as it can until all appetite for the pattern has been wrung out of the marketplace. Not all attempts to follow the pattern are equally successful, but it is very clear that attempting to define and follow the patterns of hit shows has a better success rate than trying to launch entirely new patterns.
This does not mean that we have found the story algorithm, the algorithm that would ensure that every movie, book, or TV show will be a hit, but what we have seen is that in a field in which a thousand flowers bloom, it is possible to identify and reproduce the better flowers, at least for a while. This is an absolutely valid application of content engineering, even if it does not produce, and may never produce, the grand unifying story algorithm.
Yes, we can apply engineering methods to the improvement of content. But we are not at the point, and may never reach the point, of having a rigorously proven set of algorithms and data that can form the basis of a socially justifiable certification of technical writers or other communication professionals.