Cohesion Without Coherence: Artificial Intelligence and Narrative Form
Comment on: “I was received by the city as I stepped into the world again”
TRANSIT vol. 14, no. 2
by Hannes Bajohr
translated by Kayla van Kooten
I.
The boundaries of the arts do not only exist between them, but also within these arts themselves. Therefore, the following will focus on the boundary between narration and its other—that is, between narrative and non-narrative forms. What interests me here is something that I would like to call surface narration: the mere form of storytelling. This seems to me to be a characteristic of texts created with AI.
There is every indication that AI-generated text will become an everyday phenomenon for recipients in the near future. It is assumed that such scriptural automation will be easier to perform for some genres than for others—especially if their artificial nature is not in question. Today, certain types of texts—from weather reports to product descriptions—are already underdetermined in a way that makes their origin seem negligible. Nevertheless, one usually presumes that they are written by humans (at least if one is willing to think about this question), because the technology for their sufficient automation has not yet been fully developed. This is currently changing, and we may soon get used to attributing the origin of certain written text to an AI. Especially with “unmarked” texts like the ones mentioned, the question of origin—whether a text is “natural” or “artificial”—may eventually become so unimportant that it will not even arise anymore. We would then be dealing with post-artificial texts.2
This is not only due to the fact that they are functional texts, but also to their specific structure and construction. This includes the relative absence of style, as in the data-driven weather report, or the pastiche-like predictability of the hyped-up rhetoric of marketing language, which, as a function with unsaturated arguments, can be used to promote a wide range of different products. However, much more important is the fact that these genres are “small” in such a way that they achieve a certain balance in the relationship between coherence and cohesion.
If cohesion refers to the way in which text elements are linked at the phonological, orthographic, and lexico-grammatical levels, coherence refers to their abstract meaning context. The separation is ideal-typical, so that in any concrete work there is always a mixture of cohesion and coherence, as Holger Schulze explains: “Pure coherence—as it would exist in immaterial, pure ideas—is as unthinkable as pure cohesion—as a completely context-free operation with content. Every concrete artifact consists of coherence patterns that are mediated by cohesion—and cohesion patterns that become recognizable only through coherence. […] Coherence and cohesion are inseparably linked, so that only a certain dominance of coherence or cohesion patterns can be determined.”3
In unmarked genres, I believe, we tend to have a roughly equal distribution of coherence and cohesion, which leads to both being more or less unmarked. While there are the differences in style I mentioned earlier, in such cases we are neither dealing with a radical cohesion text, nor with a radical coherence text. The former was the subject of investigation for modernist avant-gardes—think of Henri Chopin’s sound poetry—whereas the latter would lean more towards formalized arguments such as those found in analytic philosophy. A particularly extreme example would be Richard Montague’s version of categorial grammar, which begins with the assertion “I reject the contention that an important theoretical difference exists between formal and natural languages” and then goes on to exercise it in such formalization cascades as “<∂, φ, ψ> ∈ R4 → ψ ∈ C1” for the disappointingly simple sentence “John loves Jane.”4
Insofar as modern natural language processing (NLP) is at least in the tradition of such formalization attempts (if no longer in their technical implementation, which, since deep learning, no longer uses such rule-based transformations but relies on the “distributional hypothesis”5 of statistical signal processing), the hope that coherence will arise naturally through cohesion runs parallel to the idea that semantics can be conjured up solely through syntax. While the latter case is not so implausible—I address it elsewhere under the heading of “dumb meaning”6—the question of substantive coherence naturally poses problems for text-generating AI: where coherence is subordinated to cohesion, the meaning of the linkage is always only a secondary effect of the rule that organizes its elements.
Besides the equal distribution of cohesion and coherence in unmarked texts, as well as the extremes of pure sonority and mere logical linkage, there is something that reflects coherence on the level of cohesion and which is not insignificant in our cultural tradition of marked, namely literary, texts. Montague’s operator “→”—the material implication, the “if S, then P”—is elevated in these texts to the principle of organizing the material, namely as “therefore.” In literary texts, “therefore” is not only a logical consequence but also a causal one, and it becomes the biggest cohesion guarantee: it organizes their elements as a narrative. Narrative always asks the question of a meaningful sequence of events in space and time. And this meaning is usually conveyed not by the sequence “and… and… and…,” which is merely an aggregated conjunction,7 but by the sequence “therefore… therefore… therefore…,” which is to be understood as causal in the widest sense.
The reason for this is: Correlation does not imply causation.8 The truism of all empirical research also applies to the limitations of narration in NLP. Because, as computer scientist Judea Pearl tirelessly emphasizes, causality is not AI’s forte—neither of the old symbolic nor of the new subsymbolic or data-based AI.9 Computers can only process correlations, which say nothing about the causal coherence of their linkage. For this reason, literary scholar Angus Fletcher has expressed the belief that narration is impossible for a computer because it can only process reversible equations such as A = B and B = A, instead of causal and irreversible relationships of the form A → B.10 The computer could only correlate the data “fire” and “smoke,” so that “smoke, therefore fire” would appear just as plausible to the system as “fire, therefore smoke.” Thus, correlation is not only without cause, but also without time. Narration, on the other hand, converges in both and can therefore only be thought of causally.
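The symmetry that Fletcher's argument turns on can be illustrated with a minimal sketch. The observations below are invented for illustration (1 means present, 0 means absent), and the helper name `pearson` is my own. The sketch shows only that a correlation coefficient is indifferent to the order of its arguments; it makes no claim about how a language model actually operates.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (unnormalized; the factor 1/n cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical observations, invented for this example.
fire = [1, 1, 0, 0, 1, 0, 1, 0]
smoke = [1, 1, 0, 0, 1, 0, 0, 0]

r_fire_smoke = pearson(fire, smoke)
r_smoke_fire = pearson(smoke, fire)

# The measure is the same in both directions: nothing in the number
# says whether fire precedes smoke or smoke precedes fire.
print(r_fire_smoke == r_smoke_fire)
```

Because the coefficient is identical in both directions, the data alone offer no basis for preferring “fire, therefore smoke” over “smoke, therefore fire”; any direction has to be supplied from outside the correlation.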
There are several objections to this interpretation, not least that it reduces literature entirely to the function of narration and questionably equates mere temporal sequences with cause-and-effect relations. Hume had already raised an objection against the aforementioned interpretation when he fundamentally questioned causality as a “necessary connexion”11 (whereas Kant postulated it as the “analogy of experience” indispensable for the inner structure of knowledge).12 But despite everything, Fletcher’s thesis is interesting insofar as it can become the starting point for empirical experiments: if causes and effects as narrative engines are fundamentally unteachable to AI, one can still observe what productive errors arise in the attempt to do so.13
Instead of evenly distributing coherence and cohesion or projecting one onto the other, AI texts would strive to simulate coherence by way of cohesion; the result would be correlation texts in which causality only appears as a surface effect and the narration of “therefore” emerges as an almost, but never fully successful, chaining of “ands.” Analogous to “dumb meaning,” there would then be something like “dumb narrative.” Coherence effects are certainly partly due to attributions from the reception side—we want to establish a connection between the facts that are expressed in two sentences14—but the fact that this offer can be accepted in the first place presupposes a certain predisposition of the output. This effect is what I call surface narration.
II.
Kieferling (jawling) and Teichenkopf (pondhead) emerged during the generation of such a surface narrative. For this purpose, I used the language model GPT-J, which was created as an open-source alternative to GPT-3 and is relatively small with 6 billion parameters, but can be “fine-tuned” for specific corpora. Unlike training a language model from scratch, fine-tuning refers to the practice of adjusting an already pre-trained model—which already “speaks” German—to the stylistic and content-specific peculiarities of a particular corpus. It would be difficult, for example, to get GPT-J to output grammatically correct sentences based solely on the writings of Georg Büchner. Since language AIs model statistical dependencies over word distributions, Büchner’s work would simply be too small to reach the critical mass required for sufficiently good output text. On the other hand, if you already use a model that has learned from a wide selection of German documents and has examples of syntactic correctness and semantic regularities, it can be steered in a certain, merely stylistic direction; essentially, a certain “voice” is placed on the foundation of German language competence.
In this case, I pursued a voice based on four contemporary novels that Elias Kreuzmair examined in an essay titled “Die Zukunft der Gegenwart (Berlin, Miami)”: Berit Glanz’s “Pixeltänzer” (2018), Joshua Groß’s “Flexen in Miami” (2020), Julia Zange’s “Realitätsgewitter” (2016), and Juan S. Guse’s “Miami Punk” (2019).15 According to Kreuzmair, these works exemplify a form of writing that is not generative—that is, not produced by classical codes or current AI models’ neural networks—but still able to describe a social situation as “literature of the ‘digital society’” in which the digital has become commonplace. In other words, these texts are “conventional” novels; although they use postmodern self-reflexivity to consider their own standpoint in many respects, they are unquestionably classically narrative literature. As such, they seemed particularly attractive to me for AI training. Because, in terms of both their subject matter and their structure, they tell the story of the digital, I was interested in what remains of both aspects when they are taken as a dataset for a language model, specifically GPT-J.
Already the first text introduced jawling and pondhead. This casting is not yet a narrative, rather an example of the remarkable propensity for neologisms of large language models, which are quickly able to learn morphological regularities—such as the suffix “-ling” or the fact that German forms compound words. At the same time, it says “das Teichenkopf,” not “der,” which interestingly indicates that the model has not yet absorbed the rule that the gender of a compound word is determined by the gender of its “head”—deep learning begins without a fixed set of rules, but extracts the “rules” from the probability distribution of the data itself.16 But even in the first sentence, we already have characters who, along with the first-person narrator, form the basis of the narration and both ground and advance it.
The title alone—“I was received by the city as I stepped into the world again”—implies a temporal connection, which is not yet realized causally here, but already has narrative character. It appears almost like the announcement of a Heimkehrergeschichte (story of return), which has characterized German post-war literature. Looking at just the first paragraph, one also notices the contemplative first-person narrative voice: The reflection seems to take the memory of jawling and pondhead as a starting point for a recollection from childhood, introducing a mystery about a fourth character as the actual core of the story: “what had happened to my father.”
The greatest fascination of this text—at least for me—comes from these characters. The jawling appears to me to be a kind of Odradek, that is, despite its description, it is hard to visualize, somehow anthropomorphic or a kind of subject. In Kafka, it is at least said that Odradek looks like “a flat star-shaped spool for thread,”17 while the description of the jawling simply says that it is “rooted in strength,” which is only slightly vaguer than the description of the pondhead, which is “firmly anchored by an extremely slim base”; that there is an “unconventional division” between the two, indicated by the narrator’s teeth, does not make things any clearer.
However, like Odradek, jawling and pondhead are not just objects, but quasi-persons, animistic talismans or magical creatures that exist on the border between life and non-life. Holding the jawling or the pondhead in one’s hand—it is only clear towards the middle of the text that the narrator probably has both of them with him—evokes feelings of affection, beauty, and security, which are again undermined by the fact that the jawling has the task of “taking down the pondhead.” The role of jawling and pondhead in the relationship between the father and the narrator figure remains as enigmatic as the relationship between both is indeterminate—at times it seems tense, when the father takes away the jawling; at times it expresses affection, when the paternal kiss initiates the journey to the outlook over the city.
Perhaps characters are already close to storytelling by virtue of their appearance as actors. From the beginning, the introduction of the characters is so closely linked to the organization of the temporal levels, such as the analepsis of childhood and the prolepsis of the father’s story in the first paragraph, that it is initially not noticeable how the temporal structure of the text is constantly and casually shifting. The phrase “in such times” in the third paragraph clearly indicates a different past than the one from the narrated childhood, and the “while” it takes “to reach the outskirts of this city” is again different from the pivotal moment when, only after about halfway through the text, the concrete deployment of the story takes place, identifying it in time as a “morning in the first week of June” and locating it in the father’s house.
Perhaps it is not only the striking characters but also the many remarkable twists that distract from these inconsistencies. They obviously come from a prose diction, which may mitigate the pressure for narrative coherence solely through the display of literariness. Some sentences are constructed almost virtuosically, ranging from the laconic discipline expressed in the statement, “Life in the city is not the same as life in the region,” to the excess of the lengthy hypotaxis in the first paragraph: “I suspected that the jawling’s legs had bitten into the indifferent pondhead, expressing a future of community by this immediate gesture, not just in the sense of possibility—since there were already many people there—but also through the deployment of all the other means: through the rasterizations of its own layers, through an uncomplicated combination of bowels and gutters of sweat, through the counter-condensation from hundreds of valuable experiences, through the reproduction of progress.”
From the path “that leads a human body shape towards suicide,” to the description “The city under us lay frighteningly quiet on aggravated feet,” to the point where “the feeling in the shoes had somewhat solidified,” these formulations could most likely compete with the more complex examples of contemporary literature. The cohesion is high, which may enhance the impression of narrative coherence even more, as the result invites the projection of coherence. (At this point, one can begin to ask about the origin of these turns of phrase, whether, for example, the “future of community” comes from Guse, or the “uncomplicated combination of bowels and gutters of sweat” from Glanz—only to doubt immediately whether this “origin” still makes sense when all four novels are only present as a statistical dependency in the latent space of a single language model.)
Nevertheless, these formulations are islands of meaning that are more convincing when they remain self-contained than as elements of a narrative chain. For this reason, something like a narrative flow only occurs periodically, at least according to the schema of conventionally narrated literature. The coherence markers of character constellation and temporal arrangement are opposed by a centrifugal movement, which makes it difficult to say what the story is that is being told here. Father and child meet, view the city—which one?—from an elevated position only accessible by stairs, while the two mysterious “heads” play an important but indeterminate role. Why this undertaking? The father “wanted to show me my admiration,” “Only so that you can see the people in your happiness”—the swapping of action and being acted upon that shifts one’s own characteristics and feelings onto others frustrates the answer to this question. Such confusions are as much a narrative glitch as the irritation that arises in the reading flow when the jawling, which is firmly in the father’s hand, suddenly appears “silk-like, […] in my [the narrator’s] fist.” When the hike ends with the recapitulating sentence, “So our excursion was only a small part of developed feelings, a small part of the world, how I knew it from my heads,” the impression of surface, of the shape of a story without interrogable substance, is confirmed despite all narrative effects.
At the same time, I hesitate when I make this judgment. If I didn’t know how the text came about, would I come to the same conclusion? After all, every text is a surface for those who read it. As long as cohesion is present, coherence is a question of the tolerance for its absence, and perhaps the difference between correlation and causality is not really important when only the reception decides what connection should exist between two events. Moreover, if a text seems to correspond to literary diction, the need for such a connection may further diminish. Likewise, recurring characters—even if their actions are erratic and their motivations mysterious—may already have a binding function in themselves: jawling and pondhead would possibly be inventions that already in their own right tell a story, regardless of whether it is actually told; their appearance introduces a potential that does not have to be executed in order to unfold on the reception side. And in the end, it is only this reception side that matters, when it is no longer clear how and in what way a text came into being, whether through AI or humans—when it has become post-artificial.
1 Kreuzmair, Elias. “Die Zukunft der Gegenwart (Berlin, Miami).” Digitale Literatur II, edited by Hannes Bajohr and Annette Gilbert, Edition Text + Kritik, 2021, pp. 35-46.
2 I discuss the difference between natural, artificial, and post-artificial texts in: Bajohr, Hannes. “On Artificial and Post-Artificial Texts: Machine Learning and the Reader’s Expectations of Literary and Non-Literary Writing.” Poetics Today 45, no. 2, 2024, 331–61.
3 Schulze, Holger. Das aleatorische Spiel, Fink, 2000, pp. 23.
4 Montague, Richard. “English as a Formal Language,” in: Richard Montague, Formal Philosophy. Selected Papers of Richard Montague, ed. by R. H. Thomason, Yale University Press, 1974, pp. 188-221, here p. 196. The example is only partially appropriate in that, of course, the “ideas” are still conveyed through the cohesion of logical connections. Nevertheless, it seems to me to come closer to the idea of “pure” communication of thought than natural language. Montague’s goal is to formalize the relationship between the meaning of sentences and their syntactic structure, i.e., to derive the meaning of a complex sentence from the meanings of its parts and the way they are linked together. Since the meaning of the sentence is read from the relationship between its components in their logical structure, and not from the mere sequence of words or the grammatical correctness of the sentence, as would be the case in a cohesion-heavy example, I therefore assume a tendency towards coherence here.
5 Harris, Zellig S. “Distributional Structure,” Word 10, No. 2-3, 1954, pp. 146-162.
6 Bajohr, Hannes. “Dumb Meaning: Machine Learning and Artificial Semantics,” IMAGE 37, no. 1, 2023, pp. 58–70.
7 Precisely for this reason, the scheme of conjunction has been recommended as an alternative to classical narration—prominently in the interpretation of Deleuze and Guattari, who oppose the tree to the rhizome and, accordingly, the root-book to the book as a war machine, see Deleuze, Gilles and Félix Guattari. A Thousand Plateaus, translated by Brian Massumi, University of Minnesota Press, 1987, pp. 5. There, the positive invocation of conjunction can also be found: “The tree imposes the verb ‘to be,’ but the fabric of the rhizome is the conjunction, ‘and… and… and…’ This conjunction carries enough force to shake and uproot the verb ‘to be.’” Ibid., p. 25. That ‘to be’ implies causality is not stated here, but seems plausible. In practice, the nouveau roman in particular made the absence of causality as a dissolution of narrativity its structural feature, compare Kermode, Frank. The Sense of an Ending: Studies in the Theory of Fiction, Oxford University Press, 1967, pp. 19. Additionally helpful as an illustration of current popular writing practice: Juan S. Guse pointed me to a writing workshop on YouTube by Matt Stone and Trey Parker (the creators of South Park), who use the idea of causality as a rule of thumb in their own writing: “If the words ‘and then’ belong between those beats [what the authors call self-contained scenes], you’re fucked–you have something pretty boring. What should happen between every beat you have written down is either the word ‘therefore’ or ‘but’” because the “but” is also (subverted) causality. “Writing Advice from Matt Stone & Trey Parker @ NYU” (2017). https://www.youtube.com/watch?v=vGUNqq3jVLg (June 30, 2023).
8 Illustrative of this rule is the website Spurious Correlations, which collects such doubtful correlations as “Per capita consumption of mozzarella” and “Yearly engineering degrees awarded,” overlaying two causally unrelated but similar-looking statistics, see https://www.tylervigen.com/spurious-correlations. Another example of confusing correlation and causation are forms of magical thinking like the “cargo cult”—the alleged hope of some Pacific island tribes to attract American airplanes and their cargo by constructing symbolic runways; there is a lively debate about whether this description is based on a misunderstanding by western observers, see Otto, Ton. “What Happened to Cargo Cults? Material Religions in Melanesia and the West.” Social Analysis, vol. 53, no. 1, 2009, pp. 82-102.
9 See Pearl, Judea and Dana MacKenzie. The Book of Why: The New Science of Cause and Effect, Basic Books, 2018.
10 See Fletcher, Angus. “Why Computers Will Never Read (or Write) Literature: A Logical Proof and a Narrative.” Narrative, vol. 29, no. 1, 2021, pp. 1-28. One voice that chimes in on this battle for the monopoly of human storytelling is Schönthaler, Philipp. Die Automatisierung des Schreibens und Gegenprogramme der Literatur, Matthes & Seitz, 2022.
11 Hume, David. An Enquiry Concerning Human Understanding, edited by Stephen Buckle, Cambridge University Press, 2007, p. 59.
12 Kant, Immanuel. Critique of Pure Reason, edited by Paul Guyer and Allen W. Wood, Cambridge University Press, 1998. Between Hume and Kant lies the reality of statistics, which both acknowledges that causality can never be established with absolute certainty and uses a probability calculus that is quite reliable when combined with randomized studies, see Spiegelhalter, David. The Art of Statistics. Learning From Data, Pelican, 2019, chap. 4.
13 This approach is based on the assumption that errors—both glitches and bugs—are able to reveal the fractures of a system analytically, and that this access favors artistic experiments in particular, see Bajohr, Hannes. Schreiben in Distanz. Hildesheimer Poetikvorlesung, Universitätsverlag Hildesheim, 2023, p. 55.
14 Practically, I was able to test this projection effect in my novel Durchschnitt, which was created through classical coding at the time: the mere alphabetical arrangement of average-length sentences from the corpus of Marcel Reich-Ranicki’s novel canon produces coherence effects that make unrelated elements sound like a narrative for a moment. Here is an example from Chapter D that creates narration solely through temporal adverbs like “then” and “afterwards”: “She put the box back in her desk and left the key in its usual place. Afterwards, one could see the starry sky in all its purity up to the edges of the heavy clouds in the northwest. But then Frau Permaneder-Buddenbrook began to cry loudly in the middle of the street and in front of so many people.” Bajohr, Hannes. Durchschnitt, Frohmann, 2016, p. 42. According to the only person I know who has read the book in its entirety—my publisher Christiane Frohmann—reading it causes headaches.
15 Kreuzmair, Elias. “Die Zukunft der Gegenwart (Berlin, Miami).” Digitale Literatur II, edited by Hannes Bajohr and Annette Gilbert, Edition Text + Kritik, 2021, pp. 35-46.
16 After completing the transcripts of the first version of this text, I was made aware that the German word for “jawling” is one of the many names of the Slippery Jack mushroom (which grows in symbiosis with pine trees) and that there is a 499-meter-high elevation somewhere in the upper Lahn area called “pondhead.” I cannot say with certainty whether “jawling” and “pondhead” are true neologisms, which language models regularly produce, or the effect of a mere context shift of known tokens. The incorrect gender of “pondhead” and the textual and semantic proximity of “Jaw” and “dentition” in the first sentence of the text suggest neologisms at the very least.
17 Kafka, Franz, and Nahum Norbert Glatzer. “The Cares of a Family Man.” The Complete Stories, translated by Willa and Edwin Muir, Schocken Books, 1983, pp. 469.