I'm Breaking Up With LLMs. Here's Why.
Friendship ended with large language models. Now large concept models are my best friend.
Key Takeaways
Large Concept Models Represent a Paradigm Shift. Unlike Large Language Models (LLMs), which generate text probabilistically word by word, Large Concept Models (LCMs) operate at a higher semantic level, potentially mirroring human cognition more closely.
Generative Grammar vs. Cognitive Linguistics. The emergence of LCMs aligns more with Cognitive Linguistics, emphasizing meaning and context over strict syntactic rules, though both paradigms may still complement each other.
AI Evolution vs. Human Evolution. While human cognition preceded language development, AI is being developed in reverse—starting with language (LLMs) and progressing toward conceptual cognition (LCMs), raising philosophical questions about intelligence and communication.
The Breakup
Since ChatGPT’s public debut in 2022, Large Language Models (LLMs) have been the dominant tool in the artificial intelligence space. The novelty, perceived efficacy, and ever-accelerating rate of improvement of LLMs have made it such that one can hardly roll out of bed without being bombarded by AI-generated or AI-related content.
At R: On Everything, for example, we’ve explored what ChatGPT 4o knew about my work as an author: what it got right, what it got wrong, and how. We’ll soon even rig up a whole experiment that challenges an LLM, our audience, and yours truly to test our abilities to compete with and identify AI-generated content.
But LLMs have also been a bit of a drag:
They’ve gaslit us about the number of Rs in the word “strawberry.”[1]
They refuse to acknowledge David Mayer. An ex, maybe? Seems sus.
They’ve spent endless hours hallucinating, and they refuse to see a doctor.
They’ve also never—not once!—offered to do the dishes after dinner.
So, yeah: I think a breakup is in order.[2]
Conveniently, Meta recently published the results of their initial foray into architecting artificial intelligence driven by what they’ve dubbed a Large Concept Model, or LCM.
According to Meta, Large Concept Models distinguish themselves from LLMs by mirroring more closely the inner workings of the human mind, operating on “explicit higher level semantic representations.”
In other words, they’re smart. They’re cool. They’ve got that look in their eye that says, “You want to get to know me, don’t you?”
And I do. I really do.
The true contrast between our ex and our new best friend is that rather than generate responses token-by-token like LLMs do, LCMs do so at the “concept” level, which Meta defines as “language- and modality-agnostic” representations of a “higher-level idea or action in a flow.”
What they mean, basically, is sentences. Rather than generate their outputs word by word[3], LCMs first construct a high-level, conceptual representation of meaning that is then realized one sentence at a time[4].
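For the code-inclined, here’s a minimal sketch of that difference, assuming only the broad strokes of Meta’s paper. Every name in it (llm, encoder, lcm, decoder, and their methods) is a hypothetical stand-in of my own invention; the real pipeline builds on Meta’s SONAR sentence encoder and is considerably more involved.

```python
# Illustrative contrast only; all objects and methods here are hypothetical.

def generate_with_llm(llm, prompt_tokens, n_tokens):
    """Token by token: each step samples one token from a probability
    distribution conditioned on everything generated so far."""
    tokens = list(prompt_tokens)
    for _ in range(n_tokens):
        next_token = llm.sample_next_token(tokens)  # a probabilistic guess
        tokens.append(next_token)
    return tokens

def generate_with_lcm(encoder, lcm, decoder, prompt_sentences, n_sentences):
    """Concept by concept: each step predicts one sentence-level embedding;
    words only enter the picture at the very end."""
    concepts = [encoder.embed(s) for s in prompt_sentences]  # sentences -> vectors
    for _ in range(n_sentences):
        next_concept = lcm.predict_next_concept(concepts)  # reason in embedding space
        concepts.append(next_concept)
    return [decoder.realize(c) for c in concepts]  # vectors -> sentences
```

The loops look similar, but the unit of reasoning differs: the first never leaves the vocabulary, while the second plans in a language-agnostic embedding space and only translates back into words as a final step.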
This is a fascinating approach with potentially widespread spillover effects for other fields, including linguistics and our understanding of human cognition. To tease out these prospective consequences, I’ll offer two assertions, both of which we’ll explore in further detail below.
The advent of LCMs creates further tension between longstanding linguistic paradigms, namely Generative Grammar and Cognitive Linguistics.
The evolution of artificial intelligence is mirroring that of humankind’s capacity for thought and language—but in reverse.
Questions to Consider
As you read, consider the following and answer in the comments.
Do you think the shift from LLMs to LCMs will fundamentally change how we interact with AI, or is it just another incremental improvement?
If AI had been cognition-first instead of language-first, how might that have shaped its role in our society?
Meet the Parents: Generative Grammar vs. Cognitive Linguistics
Generative Grammar and Rules-Based, Syntax-Over-Semantics Outputs
At some point in any relationship, it’s time to meet the parents.
In the case of LLMs, there’s a lot to love about their parentage—at least with respect to the field of linguistics, where I see a great deal of overlap between Noam Chomsky’s Generative Grammar and the inner workings of LLMs.
Generative Grammar has been the dominant force in the field of linguistics going back to the 1950s, when Chomsky made the initial arguments that would later define decades of research. The core tenets of his Generative Grammar relate to the innateness, modularity, and rule-based structures of human language, which manifest in the following ways[5]:
Language as an Autonomous Module
Language is distinct from general cognition and, as a faculty, is an innate mental structure.
Universal Grammar (UG)
All human languages share an underlying grammatical structure, no matter how unique they might appear on the surface.
Syntax Sets Direction
Syntax operates independently of meaning. Deep linguistic structures are transformed into surface structures via the formal rules one comes to understand when acquiring a language.
A number of interesting parallels immediately emerge between Generative Grammar and the inner workings of LLMs.
When an LLM (or, hey, generative AI) delivers human-consumable outputs, it does so in a way that is distinct from general cognition; it’s just making probabilistic guesses about what should come next in a string of text.
In the linguistic sense, then, LLMs appear to be productive, but they only mimic human language generation: their outputs resemble grammatical, rule-following behavior, yet nothing one could call thinking is happening under the surface.
This decoupling of language from thought has direct tie-ins to the three tenets of Generative Grammar described above. For an LLM, language (or, truthfully, token generation) is a modality unto itself, informed by scraping the internet for books, blog posts, comment sections, and message boards. In doing this, it develops the appearance of understanding humankind’s underlying grammatical structures and their surface-level manifestations.
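To make “probabilistic guesses” concrete, here’s a toy version of the one decision an LLM makes at every step. The vocabulary and scores below are invented for illustration; a real model computes its scores from billions of parameters, but the sampling math at the end is the same.

```python
import math
import random

# Invented scores ("logits") over a tiny five-token vocabulary.
vocab = ["straw", "berry", "rs", "three", "two"]
logits = [2.1, 1.7, 0.3, 0.9, -0.5]

# Softmax: turn raw scores into a probability distribution.
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

for token, p in zip(vocab, probs):
    print(f"{token!r}: {p:.2f}")

# The "guess": draw the next token in proportion to its probability.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print("sampled next token:", next_token)
```

That draw, repeated thousands of times, is the whole show. There is no inner model of strawberries or their spelling, which goes a long way toward explaining footnote 1.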
What I’m saying is LLMs are, in a way, sophisticated, try-hard content generators.
They’re so last summer, and we are never, ever getting back together.
Cognitive Linguistics: Meaning, Usage, and Abstraction
Forged in the 1970s, Cognitive Linguistics is essentially a rebel’s response to the formal, rules-focused approaches of the 1950s, when Generative Grammar first emerged.
In other words, if Generative Grammar came back from the war to settle in the suburbs and have five kids, Cognitive Linguistics is those kids.
And those kids? Well, they became the counterculture their parents feared. In our primary metaphor, they’re also the parents to the Large Concept Model—our nouvelle belle, as it were.
To demonstrate the divergence in these paradigms, consider that where Generative Grammar holds that meaning is a byproduct of underlying syntactic rules, Cognitive Linguistics treats meaning as central to language. For Cognitive Linguistics, it’s semantics, not syntax, that makes language worth studying, which has a significant impact on its defining principles, described below.[6]
Language as a Cognitive Phenomenon
Unlike in Generative Grammar, language is not distinct from general cognition and is instead closely tied to—and an extension of—one’s experience of the world.
Usage, Not Universality
Rather than accept that language is governed by innate rules, Cognitive Linguistics argues our understanding and production of language is driven by conceptual metaphors and exposure-dependent statistical generalization.
Schema as Semantic Structures
Rather than treating syntax and semantics as distinct modalities, Cognitive Linguistics views the two as intertwined. Conceptual metaphors and mental representations of our experiences become part of internalized schemas that are used to derive, create, and share meaning.
In a way, Cognitive Linguistics and, by extension, Large Concept Models “zoom out” relative to Generative Grammar. By emphasizing structured conceptual representations instead of tokenized, language-oriented associations, LCMs should theoretically encode meaning at a more semantic and functional level instead of a modularized, rules-based, probabilistic one.
This zooming out is also what would unlock an LCM’s ability to embrace conceptual metaphor. The very notion of context—which I’ll argue is necessary to create and sustain meaningful, metaphorical thought—is far more available to an LCM by virtue of its operation at the higher level of abstraction its creators aim to instill within it.
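If you want to see this semantic-over-surface behavior for yourself, the sketch below compares two paraphrases both ways. The sentence-transformers model here is just a convenient stand-in (Meta’s work uses their own SONAR encoder), and the sentences are my own; any sentence-embedding model should show the same pattern.

```python
# Demo of surface token overlap vs. embedding-space similarity.
from sentence_transformers import SentenceTransformer
import numpy as np

a = "The cat knocked the vase off the shelf."
b = "A feline sent the ornament tumbling to the floor."

# Token view: the two sentences share almost no surface material.
tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
jaccard = len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
print(f"token overlap (Jaccard): {jaccard:.2f}")  # ~0.08

# Concept view: their sentence embeddings land close together.
model = SentenceTransformer("all-MiniLM-L6-v2")
va, vb = model.encode([a, b])
cosine = float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))
print(f"embedding similarity (cosine): {cosine:.2f}")  # well above unrelated pairs
```

A model that reasons over those vectors, rather than over the tokens, is in a much better position to notice that the two sentences are saying the same thing.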
The consequences of this prospective pivot to LCMs aren’t exclusively theoretical, either. Practically speaking, a shift to response generation at the concept level should reduce hallucinations and make AI tooling more reliable for research, writing, and coding.
Their improved capacity for maintaining context over long conversations or longer bodies of text should also make their responses more likely to remain internally consistent, minimizing the frequency with which an LCM contradicts itself or “forgets” an instruction or a vital bit of its own response history.
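Some loose, back-of-the-envelope arithmetic suggests why. Both figures below are my own assumptions rather than anything from Meta’s paper, but they illustrate the compression at stake: if each position in a model’s context holds a sentence rather than a token, the same budget covers roughly twenty times more text.

```python
# All numbers assumed for illustration; not from Meta's paper.
context_slots = 4096          # positions a model can attend over at once
tokens_per_sentence = 20      # rough average for English prose

llm_sentences = context_slots / tokens_per_sentence  # each slot holds one token
lcm_sentences = context_slots                        # each slot holds one sentence

print(f"LLM: ~{llm_sentences:.0f} sentences of context")        # ~205
print(f"LCM: ~{lcm_sentences} sentences of context")            # 4096
print(f"roughly {lcm_sentences / llm_sentences:.0f}x the text") # ~20x
```

More text in view means fewer chances to lose the thread.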
In keeping with our breakup metaphor, what I’m trying to say is LCMs are smart. They’re cool. They just get it.
They just get me.
LCMs and Linguistic Lore: A Penchant for Tension
What LCMs don’t get, however, is that their advent would seem to suggest Cognitive Linguistics should win the day.
As AI moves toward more meaning-driven, usage-based models, the principles of Cognitive Linguistics become increasingly relevant, potentially drowning out the perceived need for or utility of the Generative Grammar paradigm.
That said, Generative Grammar has survived as long as it has for a reason. Its formal structures have had an immense impact on the field of computational linguistics, and its deductive power—its ability to explain how diverse linguistic phenomena derive from a core set of principles—has made it indispensable in understanding how humans acquire language.
Ultimately, then, we may see both Generative Grammar and Cognitive Linguistics playing well together, with the former providing the foundations of AI-generated language at a deep, structural level, rather than in a merely probabilistic capacity. The latter, meanwhile, would subsequently improve outputs by incorporating context-dependent, conceptual models.
Huh. Maybe one’s ex and one’s new fling can be friends.
Maybe.
Digital Evolution !== Human Evolution
This is where we get glonky, by which I mean speculative. It’s the philosophical jazz of the LLM-LCM interplay, if you will.
Anyway—
What struck me about the notion of LCMs, and how they are positioned to succeed LLMs, is that the shift is an attempt to go from a language-focused model to one that more closely resembles thought, or at least what we believe to be the stuff of thought.
This is the opposite of how we humans went about it evolutionarily.[7]
Think about it: what need would we have had for language if there weren’t anything going on between our ears worth sharing? By this, I mean for humans, thought preceded language. Humans’ machines, however, will have first had language, then thought, to the extent that they ever truly have either.
I’m of the mind that this demonstrates the extraordinary influence language has on how we navigate and shape our world. One wonders, for example, at the form these tools might have taken if humans communicated by directly sharing our thoughts as they exist in the abstract.[8]
Might this have led us to first develop a model that one could engage with by non-linguistic means? Might we have then secondarily pivoted to a model that relied on language in order to communicate over longer distances? Or would the need for long-distance communication have necessitated an LLM equivalent first anyway?
As with any thought experiment, there’s truly no way to know, and the answers to these questions are likely to rely on the parameters one establishes.
All of this does make me curious, though, so let’s keep up the conversation in the comments.
Tell Me in the Comments
Do you think the shift from LLMs to LCMs will fundamentally change how we interact with AI, or is it just another incremental improvement?
If AI had been cognition-first instead of language-first, how might that have shaped its role in our society?
[1] This article explains why ChatGPT and other LLMs refused to accept that there were three Rs in “strawberry.”
[2] For the record, I’m not really breaking up with LLMs. LCMs as Meta describes them don’t yet exist as a consumer product. Discarding the current paradigm would therefore be imprudent.
[3] “Word by word” is a simplification. In this post, I detailed how LLMs probabilistically generate the next token—or part of a word—in their responses.
[4] Note that Meta’s abstract states that, for now, they are assuming a concept corresponds to a sentence in the interest of short-term feasibility. The notion of a concept could easily be expanded to include more than a single sentence in future research.
[5] The three points laid out in this post represent only a subset of Generative Grammar’s foundational principles, but they capture the paradigm well enough for a post of this length and focus.
[6] Much like the presentation of Generative Grammar that preceded it, these principles account for only a few of the pillars of Cognitive Linguistics.
[7] Yes, this sentence is written as if humans played an agentive role in their own evolution. I know that’s not how evolution works; it just works as a turn of phrase.
[8] Telepathy, basically.