Kevin Smith
3 min read • 25 February 2025
🔗 Originally published on LinkedIn
There’s a common refrain when people talk about large language models (LLMs): "They just predict the next word."
Technically, that’s true. But it’s also a massive oversimplification—one that ignores the sheer depth of computation and reasoning happening under the hood.
The key to understanding why LLMs do more than blindly guess words lies in something called latent space—a hidden, high-dimensional landscape where meaning, relationships, and reasoning come to life.
Saying "LLMs just predict the next word" is like saying: "A chess grandmaster just picks the next move."
Let’s break it down.
Imagine a library. Not an ordinary one, but a vast, ever-expanding, magical library. Instead of sorting books alphabetically or by genre, this library organises them by meaning and relationships.
This is what latent space is. It’s not a collection of facts stored in neat little boxes. Instead, it’s a landscape of interconnected ideas where concepts live in proximity based on their semantic relationships.
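You can get a feel for this geometry with off-the-shelf embedding models. The sketch below is a minimal, illustrative example, assuming the sentence-transformers library and the all-MiniLM-L6-v2 model (one convenient choice among many, not something named in the original article): it embeds a handful of words into the same high-dimensional space and measures how close they sit to one another.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# A small, general-purpose embedding model (an illustrative choice).
# It maps each input string to a 384-dimensional vector.
model = SentenceTransformer("all-MiniLM-L6-v2")

concepts = ["dog", "puppy", "wolf", "banana", "smartphone"]
embeddings = model.encode(concepts)

# Cosine similarity approximates "how close two concepts sit in latent space".
sims = util.cos_sim(embeddings, embeddings)
for i in range(len(concepts)):
    for j in range(i + 1, len(concepts)):
        print(f"{concepts[i]:>10} vs {concepts[j]:<10} {sims[i][j].item():.2f}")
```

Related words such as "dog" and "puppy" land far closer together than unrelated ones such as "dog" and "smartphone", even though nothing in the model explicitly stores the fact that a puppy is a young dog. That closeness is the library’s shelving system.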
When an LLM generates text, it isn’t just picking the next word randomly—it’s navigating this latent space, tracing paths between concepts, and reasoning through relationships that emerge from patterns in data.
If LLMs were merely predicting words one by one without deeper context, their outputs would feel disjointed, like a bad autocomplete function. Instead, they produce coherent, structured thoughts—sometimes better than humans do.
This happens because the "prediction" step is actually a reasoning process:
When an LLM chooses the next word, it isn’t just pulling from a list of common phrases. It’s traversing a complex semantic landscape where words and ideas are positioned based on their meaning.
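You can peek at that landscape with an open model. The sketch below is illustrative only: it uses GPT-2 via the Hugging Face transformers library as a small stand-in (the article doesn’t name a specific model) to show that the next-word scores are computed from a dense vector summarising the whole context, not looked up from a table of stored phrases.

```python
# pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tok("The chess grandmaster paused before choosing her next",
             return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# One hidden-state tensor per layer (plus the input embeddings), each shaped
# (batch, tokens, hidden_size): the model's evolving internal representation.
print(len(out.hidden_states), out.hidden_states[-1].shape)

# The "prediction" projects the final position's hidden state onto the whole
# vocabulary: one dense context vector in, a score for every token out.
context_vector = out.hidden_states[-1][0, -1]
print(context_vector.shape)      # 768 dimensions for GPT-2 small
print(out.logits[0, -1].shape)   # 50,257 scores, one per vocabulary token
```

Every one of those 50,257 scores is derived from the same context vector, which is why the chosen word reflects the whole prompt rather than just the last few words.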
For example, if you ask: "What happens when you mix vinegar and baking soda?"
The model doesn’t just recall the phrase "It creates a reaction." It understands that:
- Vinegar is an acid (acetic acid) and baking soda is a base (sodium bicarbonate).
- When an acid meets a base, they react.
- This particular reaction releases carbon dioxide gas, which is what produces the familiar fizzing.
The next word isn’t just predicted based on syntax—it’s chosen based on the model’s latent understanding of chemistry.
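To make that concrete, here is a continuation of the same illustrative sketch (again, GPT-2 and the exact prompt wording are stand-ins, not something from the original article): it asks the model which words are most likely to come next and prints the top candidates with their probabilities.

```python
# pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "When you mix vinegar and baking soda, the mixture starts to"
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # (batch, tokens, vocab_size)

# Convert the scores for the final position into a probability distribution
# over the entire vocabulary, then inspect the top-ranked continuations.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)

for prob, idx in zip(top.values, top.indices):
    print(f"{tok.decode(int(idx))!r:>10}  p = {prob.item():.3f}")
```

The continuations the model ranks highly should be reaction-related words rather than arbitrary vocabulary: the ranking falls out of where "vinegar" and "baking soda" sit in latent space, not out of a memorised sentence.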
If you ask about a completely new idea—like “What would happen if the Roman Empire had access to AI?”—there’s no direct example in the training data. But the model doesn’t need an exact precedent.
Instead, it can combine what it already encodes about the Roman Empire (governance, legions, roads, engineering) with what it encodes about AI (automation, prediction, rapid communication), and reason by analogy to produce a coherent, plausible answer.
This is why LLMs feel like they can “think” in a way that simple auto-complete systems cannot.
Since latent space encodes relationships between concepts, LLMs can do things that resemble reasoning, such as:
- Drawing analogies between ideas from very different domains
- Answering counterfactual or hypothetical questions that have no direct precedent
- Chaining related facts together to reach a conclusion that was never stated outright in the training data
These are not memorised responses but inferences drawn from traversing latent space. The model isn’t just predicting words—it’s choosing words based on conceptual coherence.
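A much older and simpler cousin of this idea makes the geometry easy to see: static word embeddings such as GloVe, which predate LLMs but also encode relationships as directions in a vector space. The sketch below assumes gensim and its pre-trained GloVe vectors (an illustrative choice of tooling, not the article’s) and "solves" analogies with nothing but vector arithmetic.

```python
# pip install gensim
import gensim.downloader as api

# Pre-trained GloVe word vectors (a sizeable download on first use).
vectors = api.load("glove-wiki-gigaword-100")

# "king" - "man" + "woman" should land near "queen": the relationship
# between the words is encoded as a direction in the vector space.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# The same trick with capitals: paris - france + italy should land near rome.
print(vectors.most_similar(positive=["paris", "italy"], negative=["france"], topn=3))
```

An LLM’s latent space is far richer and more context-dependent than these static vectors, but the principle is the same: relationships are encoded as geometry, and following that geometry lets the model reach conclusions it never saw written down.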
Saying "LLMs just predict the next word" is like saying: "A chess grandmaster just picks the next move."
Yes, technically true. But the move isn’t random—it’s chosen through deep understanding, strategic foresight, and pattern recognition. The same applies to AI models.
As LLMs evolve, their ability to reason through latent space will become even more sophisticated, and future advances are likely to make that navigation richer and more reliable.
The bottom line? AI isn’t just regurgitating words. It’s navigating a hidden world of meaning, where every word is chosen based on deep, structured relationships.
And that’s a whole lot more than just "predicting the next word."
This article was originally written and published on LinkedIn by Kevin Smith, CTO and founder of Dootrix.