Exploring LLMs and AI: Connecting Neural Processors to Natural Language Learning | February 15 2026, 15:41

Some thoughts on LLMs and artificial intelligence in general, and at the end a bit about neuromorphic processors and Intel Loihi.

As you all know, LLMs fundamentally operate on the principle of “propose the likely next word using the context from the previous N words”: the proposed word then enters the context, and the process repeats for the next word. The context is also processed with the importance of each word taken into account (this is what the attention mechanism does).
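To make the “propose a word, append it, repeat” loop concrete, here is a minimal sketch in Python. It is a toy and not how a real LLM is built: it assumes a context of just one previous word (N = 1), counts word pairs instead of training a neural network, and has no notion of word importance. The corpus and the function names are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in a toy corpus."""
    words = text.split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def generate(follows, start, max_words=5):
    """Repeatedly propose the most likely next word and append it to the context."""
    context = [start]
    for _ in range(max_words):
        candidates = follows.get(context[-1])
        if not candidates:
            break  # this word was never followed by anything in the corpus
        next_word = candidates.most_common(1)[0][0]
        context.append(next_word)  # the new word becomes part of the context
    return context

corpus = "the cat sat on the mat the cat sat on the sofa"
model = train_bigrams(corpus)
print(" ".join(generate(model, "the", max_words=3)))  # prints "the cat sat on"
```

A real model replaces the pair counts with a learned probability distribution over a long context, but the outer loop has the same shape.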

Now let’s think about how children were taught languages in primitive societies. There were no alphabets and no written grammar. Yet the grammar itself, judging by observations of the small languages of small peoples, was quite complex. Simple grammar is a modern phenomenon that appears once a language has spread to millions or billions of speakers.

That is, a child’s brain had to reconstruct the grammar in its neurons simply from the flow of speech around it and by testing its understanding of what was said. The child was likely corrected when they spoke incorrectly, but somehow this extraction of grammar and sounds had to settle in the brain, and here the same mechanism as in LLMs is at work: which words or sounds come next in which context is determined by latent, uninterpretable rules that each person builds in their own brain, in their own way, in childhood. Roughly speaking, every child trains an ML model from scratch on the speech of those around them. A child does not know what a “case” is, but feels which ending is statistically more likely in a given context.

Actually, modern cognitive science (Karl Friston’s theory) asserts that the brain is literally a “prediction machine.” We constantly generate hypotheses about the next sound or word and correct them when they don’t match (prediction error).

The peculiarity of LLMs is that their teachers are texts and images, while for a child’s brain the teacher is the living world around them. If everything a child hears were digitized, the volume wouldn’t be enough to train even a very weak model. An LLM sees the word “apple” next to the word “red.” A child sees an apple, feels its smell, taste, and weight, and simultaneously hears the sound. This “stitching” of different sensory channels allows building neural connections thousands of times faster than from plain text. Modern LLMs, by contrast, take a brute-force approach: they observe the speech of billions of people, not just of an immediate environment. A good question is how the human brain manages to learn from a relatively small dataset. It is also an open question whether that dataset really is small: lip movements, facial expressions, and context, for example, contribute a lot to building this neural network in the biological brain.

About the context: unlike LLMs, a child understands the speaker’s intention. If mom looks at a cup and says “hot,” the child’s brain limits the search space of meanings to one cup. And if he didn’t understand, he’ll get burned and remember.

One might assume, of course, that the brain already has a ready network at birth. It’s true, but science can’t yet explain it properly. Our entire genetic program has about 20,000 genes encoding proteins, and these 20,000 are responsible for everything—where and how the lungs, heart, bones, blood should be built, and they themselves are of mind-boggling complexity, and somewhere among 3 billion nucleotides and 20,000 genes this information must be recorded.

Apparently, genes encode not a map but an algorithm of self-assembly. Essentially, the architecture of the neural network is built dynamically, and this process begins long before birth. Then it is calibrated by all the signals received by the unborn child, and by the time of birth, there is already a somewhat tuned network in the brain.

It’s likely that the child’s brain consists of millions of neural networks of different “architectures” that evolve and merge in the learning process. Unlike in LLMs, where learning and usage are strictly separated in time, here they are not. But most importantly, the brain, although it is the most energy-hungry organ in the body, consumes very little energy in absolute terms, especially compared to the current “candidates for replacements in hardware.”

In the last few years, there has been active development in the field of neuromorphic systems (for example, the older IBM TrueNorth processor and the actively developed Intel Loihi). In conventional AI, neurons transmit numbers (0.15, 0.88…). In neuromorphic systems, they transmit “spikes” (impulses), as in the living brain; this architecture is called a Spiking Neural Network (SNN). A few years ago, Intel released Loihi 2. It is fully programmable. Neurons on Loihi can change their connections (synapses) during operation: the chip supports plasticity, the biological mechanism by which the connection between neurons is strengthened if they often “fire” together. But the main thing is that it consumes very little power.
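To illustrate the difference between “transmitting numbers” and “transmitting spikes,” here is a minimal sketch of a leaky integrate-and-fire neuron and a crude “fire together, wire together” weight update. This is plain Python and a textbook simplification; it is not the Loihi programming model or the Lava API, and the parameter values (threshold, leak, learning rate) are arbitrary assumptions.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.5):
    """Leaky integrate-and-fire neuron: returns one 0/1 spike per time step."""
    potential = 0.0
    spikes = []
    for current in input_current:
        # The membrane potential decays (leaks) and accumulates new input.
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)   # emit a spike...
            potential = 0.0    # ...and reset
        else:
            spikes.append(0)
    return spikes

def hebbian_update(weight, pre_spikes, post_spikes, rate=0.1):
    """Strengthen a synapse by `rate` for every step where both neurons fired."""
    together = sum(1 for pre, post in zip(pre_spikes, post_spikes) if pre and post)
    return weight + rate * together

print(simulate_lif([0.6] * 6))  # prints [0, 0, 1, 0, 0, 1]
```

The output of the neuron is a sparse train of events rather than a dense vector of floats, which is a large part of where the energy savings are expected to come from: when nothing fires, nothing needs to be computed.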

In this architecture, the model can continue learning “on the fly” right during operation, without forgetting old data (Continual Learning). Besides that—extreme energy efficiency.

Loihi 2 cannot multiply matrices the way modern GPUs do, so completely new software has to be written for it (and this is moving very slowly). There is no PyTorch or TensorFlow: the only framework available for Loihi today is Lava. And the 1 million neurons of a single Loihi 2 chip is very little for LLMs. Therefore, Intel builds systems like Hala Point, an array of 1152 Loihi 2 processors that contains up to 1.15 billion neurons. Theoretically, in terms of performance per watt, such a system can surpass traditional GPUs by 10–50 times when working with AI models.

Experimental LLMs are already being launched on Loihi 2 (for example, models with 370 million parameters). They are not yet going to replace ChatGPT in the cloud, but theoretically, they are the future for “smart” robots and gadgets that need to understand human speech while running off a small battery.

We’ll observe. It might turn out to be a dud, or it could be another major revolution.

From Camels to Bishops: The Fascinating Evolution of Chess Pieces | February 14 2026, 16:24

It all started with a question – why does the elephant ♗ have this notch? And in general, where is the elephant, and where is the bishop, and is this notch about the elephant or the bishop? Anyway, listen to what I dug up, there’s a lot of interesting stuff here.

Chess originates from India. There, this piece was initially called a camel. And their elephant was the piece we call the rook. The name we use for the rook basically means a boat, while the English “rook” comes from a Persian word meaning chariot.

The name “Tura”, which we often hear in colloquial speech, is a pure import from Europe. In French – tour. In Italian – torre. In Latin – turris. All of these mean the same thing: tower. When chess arrived in Europe, knights and monks didn’t really understand what a “battle chariot” was (they were out of fashion by then), but they knew very well what a siege tower was.

So, returning to the elephant and the notch.

The short answer – to distinguish it from a pawn. But there’s a long answer.

When chess came to Europe, the Indian camel was replaced by the Catholic bishop, which is precisely why the piece is called a bishop in English. The notch supposedly symbolizes a miter, the tall headgear of clergymen. Though to me, it’s just a mouth from the Muppet Show.

Interestingly, in French, it’s le fou – the jester. In German, it’s Läufer – runner. In Greek – officer (Αξιωματικός). Why officer? I don’t know, but I dug up that in Chinese chess, xiangqi (象棋), the “elephant” piece is indicated and pronounced as xiàng (象). This character indeed means “elephant.” However, in Chinese history, there was a high state office called xiàng (相), usually translated as “chancellor,” “prime minister,” or “chief minister.” This is a different character, although the pronunciation coincides. Probably, the officer comes from here too.

The chess knight is a horse in almost all languages; only in English and a few others is it a knight (although in German, for example, it’s Springer, the jumper, and in Sicily a donkey).

So, in German, there is a jumper and a runner. And what sounds like “little horse” to our ear, König, is actually the king in German.

I also learned that there are ready-made solutions for ANY chess endgame with seven or fewer pieces on the board, regardless of the position or the composition of the remaining pieces. This data, known as endgame tablebases, currently occupies 18.4 terabytes.

from the comments: “The most interesting thing is that this week a multi-year work was completed, and there is now a ready solution for any position with 8 pieces or fewer (7 pieces was already about 12 years ago, but there’s a very big difference)”

Chris Pratt’s Race Against AI in “Mercy”: A Cinematic Journey | February 10 2026, 16:24

We went to see the movie Mercy with Chris Pratt yesterday. Bekmambetov! His “screenlife” format has finally been expanded into a $50 million blockbuster and stuffed into IMAX. The guy really did well. First, he made six Yolki movies, and then, bam, he broke out and even started to produce something decent. (We were alone in the theater in super comfy motorized chairs. Empty halls have been pretty much the norm for many years now. I don’t know how cinemas even break even. Even the bar was closed; it’s only open on weekends, when more than two people show up to a hall.)

So, the plot. The near future. The justice system is maximally optimized: instead of jurors and years of appeals — an impartial AI. The main character (Chris Pratt) is accused of brutally murdering his own wife. The evidence against him is significant, and society demands blood.

He is placed in a high-tech chair and given 90 minutes. This is the “window” for his defense: the time in which he must convince the algorithm of his innocence. If after an hour and a half the “guilt probability” scale doesn’t drop below a critical threshold, he will be executed right there. Everything happens in real time; the movie runs for 90 minutes.

In the era of neural networks, this seems very timely. Screenlife here is ideal: we see the evidence and the world through the system’s eyes via cameras and browsers. Chris Pratt and Rebecca Ferguson on screen — always a plus.

However, what causes doubt is the attempt to crossbreed a hedgehog with a snake. Screenlife is good for its chamber feel, but here they sell us IMAX 3D, explosions, and chases, although 95% of the time the hero just sits in a chair.

Classic cinema for streaming. Not bad. On the couch with pizza on a Friday night — it’ll be great, there’s a solid detective story. Your brain might explode from the overload of details. Big question whether it’s worth paying for an IMAX ticket to watch Pratt watching a monitor… Who knows. There are some action scenes here and there, and they’re pretty good, but only occasionally.

Overall, detective fans should like it. From the plot, it’s clear they won’t fry the guy in the chair at the end of the movie, the question is how he’ll manage to wriggle out of it.

Exploring Algorithmic Stylization in Plotter Art: A CMYK Fractal Journey | February 01 2026, 04:18

Now that I have a plotter, I am experimenting in earnest with ways of algorithmic image stylization. To achieve what is attached, a Minimum Spanning Tree algorithm was used. Essentially, the image is first converted into a stochastic raster (that is, where it’s darker, there are more dots), and then the dots are connected with lines so that all points form a single network, the total length of all lines is minimal, and there are no closed loops (meaning it’s precisely a “tree” with branches, not a “web”).
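The two steps described above (scatter dots more densely where the image is darker, then join them into a minimum spanning tree) can be sketched roughly like this. This is an illustration under my own assumptions, not the exact code behind the attached picture: the image is a tiny nested list of gray values (0 = black, 255 = white), the dots come from simple rejection sampling, and the tree is built with a straightforward O(n²) version of Prim’s algorithm. All function names are made up.

```python
import math
import random

def sample_points(gray, n_points, seed=0):
    """Rejection-sample dots so that darker pixels receive more of them.
    Assumes the image is not entirely white, otherwise this never finishes."""
    rng = random.Random(seed)
    height, width = len(gray), len(gray[0])
    points = []
    while len(points) < n_points:
        x, y = rng.random() * width, rng.random() * height
        darkness = 1.0 - gray[int(y)][int(x)] / 255.0
        if rng.random() < darkness:
            points.append((x, y))
    return points

def minimum_spanning_tree(points):
    """Prim's algorithm: returns n-1 edges as pairs of point indices."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each point
    best_parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Check whether the newly added point offers shorter links to the rest.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges

gray = [[0, 255], [0, 255]]          # left half black, right half white
points = sample_points(gray, n_points=20)
edges = minimum_spanning_tree(points)
print(len(points), "dots,", len(edges), "line segments")
```

Each edge is one straight pen stroke for the plotter. For tens of thousands of dots the quadratic loop gets slow, and a spatial index or a library implementation would be the more practical choice.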

I do this for each of the CMYK channels and then combine the results into a color picture. The picture seems to contain no colors other than these four CMYK ones, but in reality there are a few more, because some smoothing has crept in.
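Splitting a picture into the four channels could look roughly like the sketch below. It assumes the naive textbook RGB-to-CMYK formula with no color profiles, which may well differ from what an image editor or library produces, and it represents the image as rows of (r, g, b) tuples. The function names are hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB (0-255) to CMYK (0.0-1.0) conversion, ignoring color profiles."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r, g, b)
    if k == 1.0:                      # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return c, m, y, k

def split_channels(pixels):
    """Turn rows of RGB pixels into four grayscale planes, one per ink.
    Values are 0-255 with 0 meaning full ink, so darker means more dots."""
    planes = {name: [] for name in "CMYK"}
    for row in pixels:
        inks = [rgb_to_cmyk(*px) for px in row]
        for i, name in enumerate("CMYK"):
            planes[name].append([round(255 * (1.0 - ink[i])) for ink in inks])
    return planes

planes = split_channels([[(255, 0, 0), (0, 0, 0)]])  # one red and one black pixel
print(planes["M"])  # prints [[0, 255]]: magenta ink on the red pixel only
```

Each plane can then be treated as an ordinary grayscale image, stippled and connected on its own, and drawn with a pen of the matching color.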

Printing something like this on a plotter is, of course, difficult and takes forever, but I am getting the hang of it: I have already printed the first color picture (it turned out so-so. Well, the first pancake is always lumpy. Comments below).

Building a Plotter from Scratch: My DIY Journey | January 30 2026, 05:43

I assembled a plotter from a kit. It’s practically a Lego set – you spill out the parts from the box and then read the manual. It worked right away. I have some ideas about what to do with this thing, I’ll tell you sometime.

Decoding Naval Terms: From “Eskadrenny Minonosets” to “Destroyer” | January 28 2026, 21:57

It turns out that our word for a destroyer, “esminets,” is an abbreviation of “eskadrenny minonosets” (“squadron mine carrier”), and that in English these ships are simply called destroyers.

Jet Trails as Weather Predictors: A Phenomenon of High Altitude Humidity | January 24 2026, 02:34

Walking with Yuki, I see a very distinct, narrow streak across the sky (apparently an airplane had passed by). Usually a contrail disappears quite quickly, but today it is unusually sharp and long.

I started to investigate, and it turns out this is a reliable indicator of changing weather, specifically the arrival of snow or rain; indeed, we are expecting a sudden knee-deep snowfall tomorrow. In short: the airplane trail acts as an indicator of humidity at high altitudes.

Here’s how it works:

For a contrail not to evaporate but to start “smearing”, the air at an altitude of 8–10 kilometers must be very humid (saturated with moisture). If the air is dry, the ice crystals from the engine quickly turn into invisible vapor (sublimate). If the air is moist, the crystals have nowhere to evaporate. Instead, they start attracting extra moisture from the surrounding environment and grow. High humidity at high altitudes is a sure sign of an approaching warm atmospheric front.

My Ambitious 2026 Plan: From Galapagos Travel to Academic Achievements and Creative Pursuits | January 20 2026, 04:44

My plan for 2026:

– Travel to the Galápagos Islands, Ecuador for a week (summer)

– Finish and release a book on Information Retrieval (also summer, progressing slowly, first couple of chapters are already written. Already spent about 50-100 hours on this, the easy part)

– Release at least one scientific paper, probably on Data Mining (spring). Ideally, submit it somewhere to a journal (challenging). Already spent about 30 hours on this topic, a lot left to do.

– Make a step towards a PhD. Find professors, visit universities, understand the cost and assess my capabilities and resources.

– Continue studying fundamental mathematics and not die (linear algebra, calculus, probability theory, statistics, classical ML). In 2025, I spent about 200-400 hours on this topic.

– Continue studying Deep Learning and reach the “can teach” level. In 2025, I spent about 100-200 hours on this topic.

– Continue studying Data Mining/NLP.

– Update my book on RecSys, releasing version 2.0 with updates and corrections (autumn 2026)

– Make noticeable progress in painting and playing the piano. Specifically, learn Schubert’s serenade (Ständchen, D 889) completely and create at least one canvas that I wouldn’t be ashamed to give as a gift.

Unveiling Scientific Misnomers: A Cross-Cultural Exploration | January 14 2026, 04:46

Today I was surprised to learn that the Coriolis force is pronounced as CoriolIs force, not coriOlis force as we were taught in school. I started to investigate what else was wrong, and discovered something amazing.

It turns out what we called Gay-Lussac’s law is known as Charles’s Law in the rest of the world, and what we called Charles’s Law is known throughout the world as Gay-Lussac’s Law.

What we call the Descartes coordinate system is Cartesian here. Cartesius is just the Latinized name of René Descartes.

In our textbooks, the law of conservation of mass is called the Lomonosov-Lavoisier Law (the mass of the substances entering a chemical reaction equals the mass of the substances formed). In the rest of the world, it is exclusively Lavoisier’s Law. Lomonosov got included here only because of his “whatever is taken from one body is added to another”.

Also, it turns out that when someone says “Pythagoras’ theorem” in English, it’s absolutely impossible to guess without a hint that they mean Pythagoras. Greek names are generally a mess. Thales here is pronounced roughly as “THAY-leez.”

For some reason, in our physics Röntgen is pronounced RentgEn, with the stress on the second syllable, although it’s Röntgen with the emphasis on ö.

In Russia, a trapezoid is a quadrilateral with two sides parallel and two not. In the USA, our trapezoid is known as Trapezoid, and the word Trapezium here refers to a quadrilateral with no parallel sides at all. In the UK, it’s the opposite. Our trapezoid is Trapezium, and the “skewed” quadrilateral is Trapezoid.