Some thoughts on LLMs and artificial intelligence in general, and, at the end, on neuromorphic processors and Intel Loihi.
As you all know, LLMs fundamentally operate on the principle of “propose the likely next word given the previous N words of context.” The chosen word is then added to the context, and the process repeats for the next word. The context is not treated uniformly, either: each word is weighted by how important it is for the prediction (this is what the attention mechanism does).
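The predict, append, repeat cycle can be sketched in a few lines of Python. This is only a toy under loud assumptions: the “model” is a table of word-pair counts built from a made-up three-sentence corpus, it looks at just the last word rather than the whole weighted context, and the names train_bigrams and generate are invented for this illustration. A real LLM replaces the table with a neural network, but the outer loop has the same shape.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows which: a toy stand-in for a trained model."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        following[prev][nxt] += 1
    return following


def generate(following, start, max_words=10, seed=0):
    """Autoregressive loop: pick a next word, append it to the context, repeat."""
    rng = random.Random(seed)
    context = [start]
    for _ in range(max_words - 1):
        candidates = following.get(context[-1])
        if not candidates:
            break  # no known continuation for this word
        words, counts = zip(*candidates.items())
        # Sample the next word in proportion to how often it followed the last one.
        context.append(rng.choices(words, weights=counts, k=1)[0])
    return context


# Hypothetical miniature corpus, made up for the example.
corpus = "the apple is red . the apple is sweet . the cup is hot ."
model = train_bigrams(corpus)
print(" ".join(generate(model, "the")))
```

Every word in the output is chosen only because it was seen after the previous word in the corpus; nothing in the table “knows” what an apple or a cup is.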
Now let’s think about how children learned languages in primitive societies. There were no alphabets and no grammar books. Yet the grammar itself, judging by the languages of small peoples that have been studied, was quite complex. Simple grammar is a modern phenomenon: it appears once a language has spread to millions or billions of speakers.
That is, a child’s brain had to reconstruct the grammar in its neurons simply from the flow of speech around it and from checking whether what was said had been understood. The child was probably corrected when they spoke incorrectly, but somehow the grammar and the ability to pick out sounds had to settle in the brain. Here the same mechanism as in LLMs is at work: which words or sounds come next in which context is determined by latent, uninterpretable rules that each person builds in their own brain, in their own way, in childhood. Roughly speaking, every child trains an ML model from scratch on the speech of the people around them. A child does not know what a “case” is, but feels which ending is statistically more likely in a given context.
In fact, an influential line of modern cognitive science (predictive processing, associated with Karl Friston’s free energy principle) holds that the brain is literally a “prediction machine.” We constantly generate hypotheses about the next sound or word and correct them when they don’t match what arrives (the mismatch is called prediction error).
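The “predict, compare, correct” idea can be illustrated with the simplest possible error-driven update. To be clear, this is not Friston’s model, just a toy delta rule; the function name update_prediction and the learning rate of 0.5 are assumptions made for the example.

```python
def update_prediction(prediction, observation, learning_rate=0.5):
    """Move the prediction toward the observation in proportion to the error."""
    error = observation - prediction  # prediction error
    return prediction + learning_rate * error, error


prediction = 0.0
for observation in [1.0, 1.0, 1.0, 1.0]:
    prediction, error = update_prediction(prediction, observation)
    print(f"error={error:.4f}  new prediction={prediction:.4f}")
```

With the same signal repeated, the error halves at every step (1.0, 0.5, 0.25, 0.125) and the prediction approaches the observed value: a surprise is large the first time and shrinks as the expectation adapts.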
The peculiarity of LLMs is that their teachers are texts and images, while for a child’s brain the teacher is the living world around them. If all the speech a child hears were digitized, its volume wouldn’t be enough to train even a very weak model. An LLM sees the word “apple” next to the word “red.” A child sees an apple, feels its smell, taste, and weight, and simultaneously hears the sound. This “stitching” of different sensory channels presumably allows neural connections to be built far faster (possibly thousands of times faster) than from plain text. In other words, modern LLMs take a brute-force approach: they observe the speech of billions of people rather than of an immediate environment. A good question is how the human brain manages to learn from a relatively small dataset. It is also an open question whether that dataset is really small: lip movements, facial expressions, and context, for example, contribute a great deal to building the network in the biological brain.
About the context: unlike an LLM, a child understands the speaker’s intention. If mom looks at a cup and says “hot,” the child’s brain narrows the search space of meanings down to that one cup. And if the child didn’t understand, they’ll get burned and remember.
One might assume, of course, that the brain already has a ready-made network at birth. That is partly true, but science can’t yet explain it properly. Our entire genetic program has about 20,000 protein-coding genes, and these 20,000 are responsible for everything: where and how the lungs, heart, bones, and blood should be built, all of which are of mind-boggling complexity in themselves. Somewhere among 3 billion nucleotides and 20,000 genes, the information about the brain’s network must be recorded as well.
Apparently, genes encode not a map but an algorithm of self-assembly. Essentially, the architecture of the neural network is built dynamically, and this process begins long before birth. Then it is calibrated by all the signals received by the unborn child, and by the time of birth, there is already a somewhat tuned network in the brain.
It’s likely that the child’s brain consists of millions of neural networks of different “architectures” that evolve and merge in the course of learning. Unlike in LLMs, where training and usage are strictly separated in time, here learning and usage happen at the same time. But most importantly, the brain, although it is the most energy-hungry organ in the body, consumes very little energy in absolute terms, especially compared to the current hardware candidates to replace it.
Over the last few years, neuromorphic systems have been under active development (for example, the older IBM TrueNorth processor and the actively developed Intel Loihi). In conventional AI, neurons pass numbers to each other (0.15, 0.88…). In neuromorphic systems, they pass “spikes” (impulses), as in the living brain; this architecture is called a Spiking Neural Network (SNN). A few years ago, Intel released Loihi 2. It is fully programmable. Neurons on Loihi can change their connections (synapses) during operation: the chip supports plasticity, the biological mechanism in which the connection between neurons is strengthened if they often “fire” together. But the main thing is that it consumes very little power.
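To make “spikes” and “firing together” concrete, here is a minimal sketch in plain Python. It is not Loihi’s actual neuron model and does not use the Lava API; the leaky integrate-and-fire neuron, the simple coincidence-counting Hebbian rule, and all names and constants (simulate_lif, hebbian_update, threshold, leak, rate) are assumptions chosen for illustration.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire neuron: returns one 0/1 spike per time step."""
    potential = 0.0
    spikes = []
    for current in input_current:
        potential = leak * potential + current  # old charge leaks, new input adds up
        if potential >= threshold:
            spikes.append(1)   # the neuron fires...
            potential = 0.0    # ...and resets
        else:
            spikes.append(0)
    return spikes


def hebbian_update(weight, pre_spikes, post_spikes, rate=0.25):
    """Strengthen the synapse once for every step in which both neurons spiked."""
    coincidences = sum(a and b for a, b in zip(pre_spikes, post_spikes))
    return weight + rate * coincidences


post = simulate_lif([0.5] * 6)   # weak constant input
pre = [0, 0, 1, 0, 1, 1]         # hypothetical spike train of an upstream neuron
print(post)                      # [0, 0, 1, 0, 0, 1]
print(hebbian_update(0.5, pre, post))
```

Note that the neuron outputs nothing at all on most steps. That sparsity, where work is done only when a spike occurs, is a large part of where the energy savings of spiking hardware are expected to come from.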
In this architecture, a model can, in principle, continue learning “on the fly” during operation, ideally without forgetting what it learned earlier (continual learning). On top of that comes extreme energy efficiency.
Loihi 2 cannot multiply matrices the way modern GPUs do, so completely new software has to be written for it (and this is moving very slowly). There is no PyTorch or TensorFlow: for Loihi, the only framework available today is Lava. And the roughly 1 million neurons of a single Loihi 2 chip are very few for an LLM. That is why Intel builds systems like Hala Point, an array of 1152 Loihi 2 processors containing up to 1.15 billion neurons. Theoretically, in terms of performance per watt, such a system can surpass traditional GPUs by 10–50 times when running AI models.
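The Hala Point figure follows directly from the per-chip number. A quick check, assuming the round value of 1 million neurons per Loihi 2 chip quoted above (the constant names are made up for the example):

```python
CHIPS_IN_HALA_POINT = 1152
NEURONS_PER_CHIP = 1_000_000  # assumption: the round per-chip figure from the text

total_neurons = CHIPS_IN_HALA_POINT * NEURONS_PER_CHIP
print(f"{total_neurons:,} neurons (about {total_neurons / 1e9:.2f} billion)")
# prints: 1,152,000,000 neurons (about 1.15 billion)
```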
Experimental LLMs are already being run on Loihi 2 (for example, models with 370 million parameters). They won’t replace ChatGPT in the cloud any time soon, but theoretically they are the future for “smart” robots and gadgets that need to understand human speech while running off a small battery.
We’ll observe. It might turn out to be a dud, or it could be another major revolution.

