Transclusion is the practice of pulling content dynamically from one page into another. Rather than cutting and pasting text from one page to another, you create a pointer to the page you are borrowing from. That pointer is resolved at run time, pulling content from the other page when your page is loaded. Transclusion was a fundamental part of Ted Nelson’s original concept of hypertext. It has never caught on, except in specific, confined circumstances. Despite continued interest, it isn’t going to catch on.
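To make the distinction concrete, here is a minimal sketch of the idea in markup. The transclude element and its attributes are invented purely for illustration; they do not correspond to any actual HTML proposal.

    <!-- Copy and paste: the borrowed text is frozen into this page forever. -->
    <p>Our widget ships with a two-year warranty.</p>

    <!-- Transclusion (hypothetical syntax): only a pointer is stored here.
         The warranty text is fetched from the source page each time this page
         is loaded, so whatever that page says now is what the reader sees. -->
    <transclude src="https://example.com/warranty.html#terms"></transclude>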
Rick Yagodich has an interesting post today, On third-party transclusion, in which he discusses some of the problems inherent in one current proposal for implementing transclusion in HTML. Rick’s analysis of that proposal strikes me as sound, but I think there are two even more fundamental reasons why not just the proposal he discusses, but all proposals for implementing transclusion are doomed to fail.
Transclusion violates the fundamental nature of the Web
David Weinberger described the fundamental nature of the Web as small pieces loosely joined. That is, the Web works because all of its billions of pages are joined together loosely. If they were more tightly joined, it would be too hard to insert a new page into the Web and there would not be billions of Web pages. If they were not joined at all, you could not move through the Web, and thus those billions of Web pages would not be found or read (and therefore, again, most of them would not exist). The specific level of looseness of the connections between pages is essential to making the Web work.
But transclusion creates a tighter join between pages. When one page transcludes another, it is tightly joined to that page, and that page (whether the author knows it or not) is tightly joined to all the pages that transclude from it. Transclusion, therefore, is contrary to the nature of the Web. While Web protocols could easily be adapted to implement transclusion, the very notion of transclusion breaks the model of the Web at a much more fundamental level.
The more formal name for the practice of creating small pieces loosely joined is the software design principle of tight cohesion and loose coupling. Tight cohesion and loose coupling are essential properties of an adaptable software system (and, not coincidentally, of Web-based software architectures). A software module has high cohesion if it is self-contained and self-sufficient. A system composed of such modules is loosely coupled if the individual modules do not depend on any knowledge of how the other modules are implemented.
Generally such architectures are implemented by having the modules communicate by passing messages to each other. You can then replace any module in the system without having to rewrite any other modules, or change the basic architecture, as long as the new module sends and receives the same message formats. This creates robust systems that are easy to change as new needs arise.
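On the Web itself, the messages in question are plain HTTP requests and responses. Here is a rough sketch of one such exchange, with the host and content invented for illustration:

    GET /warranty.html HTTP/1.1
    Host: www.example.com

    HTTP/1.1 200 OK
    Content-Type: text/html

    <p>Our widget ships with a two-year warranty.</p>

The browser neither knows nor cares whether that response came from a static file, a content management system, or a script written last night; any server that honours the same message format can be swapped in without anything else on the Web having to change.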
Transclusion, on the other hand, creates a content object that lacks cohesion in the most fundamental way. The implementation of the content object then relies on the internal implementation of another content object. It is therefore loosely cohesive and tightly coupled — the very antithesis of the principle that makes the Web work.
You do, of course, find transclusion in many current publishing systems, DITA being a notable example. But DITA, like most current publishing systems, is loosely cohesive and tightly coupled. This is what makes all such systems fragile, and what makes them fundamentally incompatible with each other, so that exchange between systems is difficult. Tight coupling is not without its advantages. It can make certain kinds of functionality, such as reuse by transclusion, easier to implement and understand. But it is fundamentally un-Web-like. (Examples of publishing systems that feature tight cohesion and loose coupling are Wikis and SPFE.)
Transclusion is based on an outdated model of publishing
The idea behind transclusion is to create a new publication by borrowing content dynamically from an existing publication. This rests on the idea that publication is a permanent state: that publication creates a stable object with a stable address. This too is fundamentally un-Web-like. The Web is a dynamic medium in which publication is a dynamic event that comes with no guarantees of permanence of content, place, or time.
With paper publishing, the writer controls the right to make copies (copyright) but once a copy is made and sold, the copy — not the work, but the copy of the work — becomes the property of the buyer. The sale is then irrevocable. The author has no right to demand the copy back. Once published, a work belongs to the public forever, or at least as long as a legible copy remains in existence.
Electronic publishing does not work this way. Works are not so much copied as cached, and the cache of a work on your Kindle, for instance, can be removed. You buy a licence to view the content, and that licence is (or certainly can be) revocable.
This is one of the fundamental objections that many have to the electronic publishing model, with some vowing for this very reason never to buy ebooks, but always to buy paper. They see the ebook model as violating the fundamental rights of the buyer. Really, though, it just reflects how a different technology makes a different kind of relationship possible.
There is nothing inherently wrong with the idea of renting content, after all. Libraries do it. Video services do it. Indeed, for a great deal of the content that we consume, there is very little likelihood of our wanting to consume it again. It makes more economic sense for us to rent a one-time use of the content rather than to purchase a long-term or permanent right to consume it many times.
There is nothing wrong, either, with the idea of being able to unpublish something. The recent European right-to-be-forgotten laws enact some version of this. It is a misguided version, because it grants the right to revoke a publication to its subject rather than to its author. But the idea that, if I have published something in the past that now embarrasses me, I should not be allowed to take it down makes no sense. It is equivalent to saying that once you have opened your curtains, you may never close them again. If I want to make a certain work public for a time, and then make it private again, why should I not be able to do so?
Yes, unpublishing breaks all kinds of reference and citation mechanisms that we have built up on the presumption that publishing is irrevocable, but that is not sufficient reason for denying the right to unpublish. (Spell check objects to the word unpublish — an indication that it has not been something we considered possible. It is a word we need to add to our lexicon.)
Equally important is the right to amend a publication as and when we see fit. This is so fundamental to the way the Web works that it has changed readers’ expectations about what a publication should be. Rather than expecting a URL to permanently point to the original content that was published there, we expect it to point to the latest version of that content. We expect it to be current, not permanent.
Such an expectation, though, is fundamentally incompatible with transclusion. The act of publication provides no guarantee that the material you transclude today will be the material that shows up in your transcluding page tomorrow.
Yes, in some sense you could deliberately transclude the current state of a page element, such as when you include a live stock ticker on your page. But this is really making a call to a web service, and relying on the specific promise that that Web service makes about the content it will deliver in response to a particular query. It is not transclusion as a general mechanism for ad hoc content reuse. An ordinary Web page provides no such guarantees about what it will include in the future or what address will fetch that content.
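In markup terms, the ticker case is simply an embed of an interface that was published for exactly this purpose. The endpoint below is invented for illustration:

    <!-- The provider explicitly offers this endpoint and documents what it will
         return for a given symbol. That promise, not the embedding mechanism,
         is what makes the dependency safe to rely on. -->
    <iframe src="https://ticker.example.com/widget?symbol=ACME"
            title="ACME stock price"></iframe>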
Transclusion, as a general mechanism, therefore, will never catch on.
Web services, as a specific mechanism, providing specific promises in specific formats, have, of course, caught on big time. There is nothing more that needs to be invented here. Rather, we need to change our view of how we create, manage, and deliver content to take greater advantage of the Web services model.
Mark,
You make some interesting points about the fragility of transclusion. I just wrote a post noting some of the same issues concerning transclusion as an approach to content reuse. (I even noted a comment you made in an earlier post; you can read it here: http://storyneedle.com/four-approaches-content-reuse/)
I am puzzled by your statement that transclusion is never going to catch on. In fact, it is used everywhere already. Embedding videos or Slideshares is a form of transclusion. Wikipedia uses transclusion extensively.
I believe it’s valuable to recognize the limitations of transclusion. But I don’t believe that loose linking is necessarily a better approach in all cases. People don’t want to chase after links to other places, even if that’s what’s easiest for authors to implement.
Thanks for the comment, Michael
Of course, this is one of these cases where it depends what you mean by transclusion. And with both technology and the uses and applications of that technology advancing so rapidly the definition of terms is necessarily elastic.
The issue here is whether the web service model is a form of transclusion. This is not really a technical debate; it is simply one about how we choose to use words.
A web service is deliberately providing a service with an explicit service promise. It allows you to embed content in your pages, so in that sense it is transclusion.
Yet the vision of transclusion is broader than that. Nelson intended it to be a universal mechanism allowing any content to be transcluded. In that sense, a web service is much more constrained than a general transclusion mechanism.
It is notable that web services do meet the definition of being tightly cohesive and loosely coupled. This comes with a bunch of restrictions, but it makes the use of such services more robust.
What I am arguing in this piece is that general transclusion will not catch on, but, as I noted at the end of the post, the web service model has already caught on big time, and we would be better served to figure out how to make more use of it, rather than chasing the more general idea of transclusion.
Tim Grantham writes (he was unable to post for some reason):
“Rather than expecting a URL to permanently point to the original content that was published there, we expect it to point to the latest version of that content. We expect it to be current, not permanent.”
There is another, very common way (in addition to web services) that content gets transcluded into a web page: through RESTful APIs, which use static URLs to access resources (including content) on the remote server. If, at some point, a remote resource changes its state (including updates or even ceasing to exist), the URL still works, only the returned results are different. So how does this transclusion method violate the fundamental nature of the Web?
DITA 1.3 will add generic container elements such as <div>, which can take RESTful URLs to transclude content the same way Web pages do today. This makes DITA 1.3 no longer “tightly coupled”.
Thanks for the comment, Tim.
Again, we are getting into the definition of transclusion. If every embed is a transclusion, then yes, though personally I would class a RESTful API (or any API) as a way of implementing a Web Service, not as something different in kind. Again, usage is necessarily fluid in these areas as technology and usage develop. My point is about the ability to transclude anything, not the ability to call a content API that is designed specifically to provide an embeddable feed.
The issue of promise is essential here. An API makes a promise. An ordinary web page does not — or, rather, does not make any promise that is compatible with arbitrarily selecting and displaying its content on another page.
Adding <div> is an interesting development, but it will take a lot more than that to make DITA no longer tightly coupled. Nor would it necessarily be desirable to make DITA loosely coupled. A lot of things that people value in DITA today would be impossible in a loosely coupled system. However <div> may make one specific form of loose coupling easier.
It seems like comments are down, though I can post comments through the admin interface. If you have something you want to say, please use the contact form to let me know and I will post it. I’m working on trying to get comments working again.
It’s true that transcreation isn’t viable as the primary mechanism for web publishing. But to say “it’ll never catch on” is to ignore the ways in which people are already using it successfully. The DITA transcreation model, for example, enhances consistency and reduces translation costs in situations where a piece of content needs to be republished in multiple contexts or for multiple audiences.
You might as well say that the bicycle will never catch on because we have cars for fast, efficient transportation.
Thanks for the comment, Larry.
Well I’m not sure that I would equate transclusion and transcreation — terminology is slippery once again. I’m also not sure that I would call DITA’s mechanisms transclusion in the full sense, since they are essentially private, not public.
But then again, the use of the term has definitely expanded in that direction. That in itself may be evidence of transclusion not catching on — the term has been adopted to describe related things that are in more common use.
On this note, I was reading an article just today by Roy Fielding, the inventor of the REST architecture (http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven) in which he complains about how many APIs are calling themselves RESTful despite not following the most fundamental principle of REST. “I am getting frustrated by the number of people calling any HTTP-based interface a REST API,” he writes. The term REST has become more widespread than the concept. The same might be said of “transclusion” — it now seems to mean any display of content that originates from more than one source — a definition by which simply including a graphic in a web page is a form of transclusion.
Not that I am in any way attempting to mount a defence of any definition of “transclusion”. I am simply explaining my use of the term in this post. All language is local.
On the subject of transclusion in DITA, I think it is notable that while DITA does have a general transclusion mechanism in ConRef — that is, it can reuse anything without the reused content, or its author, or its current editor, having any idea that it is being reused — virtually every statement of DITA best practices seems to urge people not to use it this way.
Reuse must be planned carefully, they say, and many advise that you transclude only out of content deliberately designed to be reused, and that such content should be stored separately. In other words, you should reuse out of a pool of reusable content elements that are not themselves part of the published content.
This strikes me as excellent advice, but also as essentially a rejection of the principle of transclusion in favor of a managed service approach. The mechanism may still be transclusion, but the principle is closer to that of a Web service.
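For readers unfamiliar with the mechanism, here is a minimal sketch of what that separation looks like in practice. The file names and IDs are invented, and DOCTYPE declarations are omitted for brevity, but the conref syntax itself is standard DITA:

    <!-- warnings.dita: a topic that exists only as a pool of reusable elements. -->
    <topic id="warnings">
      <title>Reusable warnings</title>
      <body>
        <note id="power_warning" type="warning">Disconnect power before servicing the unit.</note>
      </body>
    </topic>

    <!-- servicing.dita: pulls the warning in by reference at build time. -->
    <note conref="warnings.dita#warnings/power_warning"/>

The warning lives in one place, and every topic that references it picks up the current wording at build time, which is exactly the pool-of-reusable-elements pattern the best practices describe.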
Pingback: Adding code comments through a sliding jQuery Sidr panel (DITA) | I'd Rather Be Writing
I really agree with you, Mark. (Relish the moment–it may be the only time I ever say it!)
SGML formalized a way to define replacement strings for the parse-and-replace feature called ‘general entities.’ Major publishing applications sadly used this misguided technique for years in response to user requests for a way to pass run-time values into massive publishing jobs, and the trend was carried on in XML by standard XPath queries (I looked for Web Services, but this all comes with the standard on all platforms; damn them all).
Looking back on it all now, it’s uncanny how much general entities and conrefs resemble SQL queries that get placed into templates for billions of mail and online orders, bills, mass mailings and purchases every day. It’s almost as if the use of indirection for content replacement had a real business value. But of course this is all a house of cards; transclusion has yet to catch on, at least not in Kitchener, Canada.
Take our advice–use plain words and don’t be afraid of a little cut and paste now and then. If your product name ever changes, that’s what Programmers invented Search and Replace for. Don’t say Yes to CMS! Get out your grep and get to work. Regex was around long before Spacex–use it to find all the possible variants in how you or others spelled that product name. It’s amazingly powerful, and your reviewers won’t mind getting to re-read your prose end to end once again. Don’t worry about translation because once it is out of your hands, paying for repetitively redundant repetitions is not your problem. Remember, fully instantiated content is the smartest content you can have–Wikipedia just can’t be wrong. Throw off the shackles of XML and come on over… our motto is Work harder, not Smarter.
Whew. Thanks, Mark… I always knew I would see it your way someday.
Thanks for the comment, Don.
Refreshing to see such sustained sarcasm. Dissing Wikipedia is a nice added touch.
A couple of problems, though.
First, the things that you list did not catch on in any general way. They are used by specialists in closed systems. Most content creators don’t do these things.
Second, none of them are transclusion. They are inclusion. And of course inclusion has caught on. Every time you embed a graphic in a web page, you are doing inclusion.
The “trans” part of transclusion implies reuse across domains.
As I noted in the post, web services (to use the term in a broad sense) do provide a kind of inclusion across domains, but it is the inclusion of something explicitly created to be included, with a specific interface and a specific promise about what that interface will deliver.
Transclusion, of the kind that Rick was discussing in the post to which I referred, and on which I was commenting, is something else: the inclusion across domains of material not explicitly offered for inclusion, but merely addressable. Conref across domains, in other words.
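To put the distinction in markup terms (the URLs are invented, and the transclude element is the same hypothetical syntax sketched at the top of the post):

    <!-- Inclusion: embedding a resource of your own, or one explicitly offered
         for embedding, such as a graphic or a documented embed widget. -->
    <img src="/images/diagram.png" alt="Architecture diagram">
    <iframe src="https://ticker.example.com/widget?symbol=ACME"></iframe>

    <!-- Transclusion in the sense I am using here: pulling an arbitrary, merely
         addressable fragment out of someone else's page, which was never designed
         or promised for that use. -->
    <transclude src="https://example.com/pricing.html#enterprise-tier"></transclude>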
That has not caught on, and it isn’t going to.
Now, you could object (even without sarcasm) that the term transclusion has come to have a much broader meaning, that it has become a sexier term for plain old inclusion. Indeed, it has. Conref in DITA is not inclusion across domains, but people still like to call it transclusion.
But, as I noted above, insofar as transclusion has simply become the word inclusion with a suit and tie on, it demonstrates that transclusion has not caught on.
I used “replace-” in some stemmed form three times and “inclusion” not at all; your reply lights up like a Christmas tree when I search for “inclusion” in it. We’ll get to that.
So SGML/XML are a different beast than the Web at large. Content created solely as a Web resource depends on the Web as its content management system, and the RESTful verbs–Create/Read/Update/Replace–are the CMS tools by which that content is managed, with HTTP and its response codes as the state engine that handles transitions between requests and responses. SGML and XML do not live on the Web (for the most part, but the exceptions are not the rule), therefore their states and transitions by definition are not handled by HTTP but by applications (of which a Web Service is one sense of use; editors are another). Whether that schema-controlled content lives in a file system or a database, other tools necessarily provide the CRUD management principles for it. It is a nod to the general nature of this kind of managed data that it can be published both outside of the Web as well as to the Web. Any non-Web-native content (including Markdown) that is converted to HTML and published to the Web may be published outside the Web because it has none of the baggage that typically marries Web-as-CMS content to the Web.
Given that applications provide the content management services for off-Web content, they tend to mimic the Web in their tools: the ability to create, update, delete, or browse content in the storage system; search to find items of interest; content curation (selection for specific uses or workflow routing); and even transclusion in the broad sense of the word. I used it because it is not wrong, but rather you have danced around a pedantic distinction while missing my point:
The SGML/XML mechanism is NOT inclusion, it is replacement. Charles Goldfarb and the original XML coalition are clear on this distinction. An entity reference is a proper part of the syntax of the document. When it is parsed, its scope is replaced by the thing it references. This is not done by an application; it is built into the parser. An XML editor that reads in a document and does not automatically replace all the character and entity references with their target content is behaving as an application on the CMS side of things, not on the publishing side. A good editor will let you see that content in either mode, but the editor application must preserve the boundaries; the publishing application abolishes them.
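Concretely, a general entity is declared once and replaced wherever it is referenced; a minimal sketch, with the names and content invented:

    <!DOCTYPE manual [
      <!ENTITY product "Widget Pro">
    ]>
    <manual>
      <!-- The parser replaces &product; with "Widget Pro" as it reads the
           document; no application code is involved in the substitution. -->
      <p>Install &product; before connecting the device.</p>
    </manual>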
The Web architecture has a limited entity replacement feature (amp, lt, gt, apos, quot) and no declaration mechanism to indicate any other replacements. By this omission, any attempt to enable indirect content referencing on the Web is an application hack, not a feature of the stack. Not surprisingly, many of those hacks look like programming inclusion syntax, mainly via server-side parsing of comments or programming-language processing instructions.
Conref is not a hat trick of XSLT but is rather a direct capability of the XML processing architecture itself. It can be applied for any XML vocabulary, not just DITA. My first proof implementation of conref was based on a TEI discussion that talked about “XML snippets” in fact–one element being replaced by another element simply by switching the context, as it were. An unfulfilled conref, in fact, is still valid XML in the original vocabulary, so in that sense there are no “broken links” in the DITA implementation. The funny thing is that you can do this with any XHTML on the Web, but since no one likes validation on the Web, we’d rather use breakable and site-specific inclusion hacks.
So you can’t call XML content replacement “inclusion”–it’s all the difference between a designed behavior within a system and a hack from outside. I think Rick’s use of “inclusion” probably should have been “replacement” for that matter, but that’s a different blog and a slightly different argument. Is conref the same as transclusion? If you want to define some distinctions about types of transclusion, that’s fine by me–few Web techs are doing it, none are doing it right, and Ted Nelson changed his divine vision several times anyway, so that’s all pedantic to me. XML-based content replacement IS happening, it is transclusion by any general sense of the term, and my sarcastic point was that it is a pervasive and reliable mechanism–for content whose management system is behind the Web, not on it.
It is perhaps on that point that we should keep our messages distinct as well–EPPO speaks to the nature of the HTTP-based Web, but cannot work generally on the CMS- (or even file-) managed side of things due to the more fluid requirements for workflow management there. There is a middle ground, but it is small indeed, and you and I are working on different sides of it, I suspect.
“Replace” in my CRUD definition should be “Delete” to complete the acronym, and the actual RESTful verbs are more like POST/GET/PUT/DELETE, but the general services are the same. I had a brain replacement, evidently. It was not an inclusion.