Exploring Word Clusters in Religious Texts from Gutenberg’s Library | May 02 2026, 03:28

It’s interesting: take 8,000 books from the Gutenberg library and build a graph for each one from its word connections to see how “friendly” the words are. If word A often appears with B, and B with C, how often does A appear with C? There’s a metric for this, the average clustering coefficient. If you then sort the books by this coefficient in decreasing order, about 70 percent of the top of the list turns out to be religious books: Bibles, the Book of Mormon, the Quran. Well, some of them are duplicates in a sense, because a Bible in a different format is still the Bible. But its different parts clearly end up grouped together, which means they definitely have something in common in these word triangles.
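For the curious, here is roughly how such a coefficient can be computed. This is only a minimal sketch: the post does not say how the word graph is built, so linking words that fall within a small sliding window (the `window` parameter) is my assumption, and both function names are made up for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(text, window=2):
    """Link every word to the words that follow it within `window` positions.

    The windowing rule is an assumption made for this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window + 1]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient over all nodes of an undirected graph."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue  # a node with fewer than two neighbours contributes 0
        # Count pairs of neighbours that are themselves connected (closed triangles).
        links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


sample = "in the beginning god created the heaven and the earth"
print(average_clustering(cooccurrence_graph(sample)))
```

For a whole library, a graph package would be the more practical choice; the pure-Python version just makes the "A with B, B with C, so A with C?" idea explicit.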

What unites all the books at the top, though, is that they were written many years ago or, as in the case of The Night Land, written relatively recently in the style of many years ago.

By the way, one book that shines among these is An Introductorie for to Lerne to Read, To Pronounce, and to Speke French Trewly. It is a French language textbook written in English in Tudor times (around the 1530s). “Soverayn lorde kyng Henry the Eight.” It was written by Gilles Du Guez, a French teacher at the English court. This particular textbook was compiled for Princess Mary (the future Queen Mary I, known as “Bloody Mary”), the daughter of Henry VIII. Check out a page from the textbook. Very cool English 🙂 “…ye must pronounce it letyng your lippes jointe close, so that there be but a lyttell hole in the middes.”

So, I delved into this textbook. It mentions a fruit called “openarses.” As you can guess, this is “open arses” in modern English. In Tudor England, a medlar was called an openarse. If you Google what a medlar looks like, you’ll have no questions about why 😉

In the anatomical section (MEMBRES LONGYNG TO MANNES BODY), the author lists, alongside the eyes and ears, “the nether beerde” (literally, “the lower beard”).

Peripheral Vision: Unveiling Optical Illusions in News Apps | April 29 2026, 17:56

I’m trying to figure out whether it’s just me or whether other people experience this too 🙂 If you look anywhere except at the word “Omurbekova”, the line highlighted in red in the second screenshot (it is actually white) is distinctly visible in your peripheral vision. But as soon as you shift your gaze directly to it, the line disappears. That is, it’s only visible peripherally. Share your experiences 🙂

Navigating the Depths of High-Dimensional Spaces | April 13 2026, 23:17

I am now working a lot with high-dimensional vectors, and some things that I hadn’t fully realized before are really starting to tickle my brain. Our 3D intuition doesn’t just not work there—it lies.

It turns out that any two random vectors in high-dimensional space are almost certainly nearly perpendicular to each other. Almost all the space is one continuous “equator”.

Much of machine learning is built on exactly this. If your embeddings suddenly show high cosine similarity (for example, 0.8), this is not a statistical fluke but a powerful signal. It’s almost impossible to end up that close by chance in a 1000-dimensional world.
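This is easy to poke at numerically. A minimal sketch, assuming "random vectors" means independent standard Gaussian coordinates (my choice, the claim itself does not depend on it); the function names are just for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sample_cosines(dim, pairs=200, seed=0):
    """Cosine similarities for `pairs` independent pairs of random vectors."""
    rng = random.Random(seed)
    result = []
    for _ in range(pairs):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        result.append(cosine(u, v))
    return result


for dim in (3, 1000):
    cosines = sample_cosines(dim)
    # Largest absolute cosine seen among the random pairs in this dimension.
    print(dim, max(abs(c) for c in cosines))
```

In 3D some pairs come out almost parallel, while in 1000D the typical cosine is on the order of 1/√1000 ≈ 0.03, so a value like 0.8 practically never shows up by chance.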

In such spaces, almost all the mass of data is concentrated in an extremely thin surface layer. The “insides” of objects are mathematically empty.

This is easy to verify with a thought experiment. Take the “skin” of a multidimensional sphere with a thickness of just 1% of the radius. The volume of a sphere is proportional to its radius raised to the power of the dimensionality.

• In three-dimensional space, the pulp (the inner 0.99 of the radius) occupies 97% of the volume: you raise 0.99 to the third power.

• In 1000D, the pulp occupies just 0.0043% (0.99 to the 1000th power is about 0.000043).
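The arithmetic fits in a few lines. A minimal sketch; `core_fraction` is a name I made up, and the only fact used is that volume scales as radius to the power of the dimensionality.

```python
def core_fraction(dim, shell=0.01):
    """Fraction of a ball's volume lying deeper than `shell` (as a share of the radius)."""
    return (1.0 - shell) ** dim


for dim in (3, 10, 100, 1000):
    pulp = core_fraction(dim)
    # Print the pulp and the skin as percentages of the total volume.
    print(f"{dim:>5}D  pulp {pulp * 100:.4f}%  skin {(1 - pulp) * 100:.4f}%")
```

Already at 100D the pulp is down to roughly a third of the volume, and by 1000D it is about 4.3 × 10⁻⁵ of it.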

You can also look at it differently. For a point to be close to the origin, its coordinates along all axes have to be close to the origin. If even one axis has a large value, that’s it, the point is gone. If you pick points at random, the probability that all coordinates at once fall below some value decreases as the dimensionality grows, and decreases quickly.

All the “meat” of the data always ends up in the skin. Any sample in High-D is essentially a set of boundary values.

For white noise in high dimensions, the distances to the closest and the farthest neighbor become almost the same. The concept of “closeness” simply degrades.
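Here is a small experiment for this one as well. It is a sketch under my own assumptions: "white noise" is modeled as independent standard Gaussian coordinates, distances are Euclidean, and the function name and sample size are arbitrary.

```python
import math
import random


def nearest_to_farthest_ratio(dim, points=200, seed=0):
    """Ratio of the smallest to the largest distance from one random query point."""
    rng = random.Random(seed)
    query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    distances = []
    for _ in range(points):
        p = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        distances.append(math.dist(query, p))
    return min(distances) / max(distances)


for dim in (2, 10, 100, 1000):
    # A ratio near 0 means "near" and "far" differ a lot; near 1 means they barely differ.
    print(dim, round(nearest_to_farthest_ratio(dim), 3))
```

The ratio creeps toward 1 as the dimensionality grows, which is exactly the degradation of "closeness" described above.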

Navigating the Lexical Complexity of Nabokov’s “Lolita” | April 02 2026, 15:56

I’ve finished the first version of a dictionary-style book on Nabokov’s “Lolita”. The chart shows how the complexity of the vocabulary is distributed across the pages of the book. The lower chart averages over 25 sentences and shows the number of complex words on the vertical axis, with colors indicating their complexity/rarity (purple: the most complex, red: less complex, yellow: even less so). But I have already removed two levels, and overall, for a foreigner, all five levels are challenging. In the book, level 3 is marked with a dashed line, level 4 with a simple frame, and level 5 with a double frame. Currently, there are 5794 words, of which 541 are fifth level, 1070 are fourth, 1883 are third, 1393 are second, and 54 are first (the simplest ones). Since the first version came out at 1148 pages, the dictionary will need to be trimmed significantly by removing whatever can be dispensed with. This mainly applies to the first and second levels, plus some words from the third and fourth. The rarity of words is calculated in three ways: through an LLM and through two word-frequency lists for the English language corpus (300K words).

Not all words are complex. For instance, take the sentence “With the ebb of lust, an ashen sense of awfulness, abetted by the realistic drabness of a gray neuralgic day, crept over me and hummed within my temples.” Someone well acquainted with English might not know the words ebb, abet, and drabness, while everything else is familiar. But if I lower the bar for the reader, the dictionary might not be very useful for such readers.

Or consider the sentence:

Homo pollex of science, with all its many sub-species and forms; the modest soldier, spic and span, quietly waiting, quietly conscious of khaki’s viatric appeal; the schoolboy wishing to go two blocks; the killer wishing to go two thousand miles; the mysterious, nervous, elderly gent, with brand-new suitcase and clipped mustache; a trio of optimistic Mexicans; the college student displaying the grime of vacational outdoor work as proudly as the name of the famous college arching across the front of his sweatshirt; the desperate lady whose battery has just died on her; the clean-cut, glossy-haired, shifty-eyed, white-faced young beasts in loud shirts and coats, vigorously, almost priapically thrusting out tense thumbs to tempt lone women or sadsack salesmen with fancy cravings.

My browser even highlights four words here.

I have definitions of the words in English, German, French, and Russian. I’ve run into the problem that different words from the text count as complex in different languages, yet I keep a single shared list for all of them. So I’ll have to mark, for example, French words in the English text separately, so that they are not included in the French version, since the reader there knows what, for instance, quel mot means.

Overall, this weekend I’ll be manually removing about half, and then I can make the cover and list it on Amazon.

Evolution of Understanding: Brain as a Predictive Model | March 18 2026, 13:29

An interesting philosophical thought came to my mind. What if evolution takes place not in us (not in biological life) but in our system of understanding the laws of the world 🙂 That is, the system of understanding adapts itself so that everything more or less matches up: the brain constructs an internal hallucination and constantly corrects it in order to minimize prediction error. And there’s a big question: does our understanding system strive for truth (absolute correspondence to the world) or just for comfort (so that the picture in the head does not fall apart)?

This approach has a problem: if you don’t look ahead, then at each iteration the understanding system adjusts its model so that the prediction works, but in doing so it creates problems for the next iteration, which now has to account for those adjustments. As a result, this layer cake accumulates so many contradictions and constraints that each subsequent theory becomes more and more complex and overgrown with unexplained gaps. Dark matter, black hole radiation, gravitational waves, and so forth look like attempts to stretch the owl over the globe, that is, to force the facts to fit the model.

But yes, this is related to the question of whether mathematics was discovered or invented.

The Curious Etymology of the Turkey: Naming Perceptions Across Languages | March 09 2026, 21:36

I wondered why the turkey is called turkey here and what it’s called in Turkey. In Turkey, it’s called hindi, that is, “Indian”! Decided to see what it’s called in India. Haha, in Hindi it’s called Turkish (टर्की). Let’s look at other languages. Portuguese: peru. That means, for them, it’s Peruvian. In Spanish it’s pavo, which goes back to the word for peacock 🙂 (compare “pavone” in Italian, peacock). In French it’s dinde, because the bird came from the West Indies (America): the word comes from poule d’Inde, “hen from India/West Indies”. Greek: “γαλοπούλα”, “French bird”.

Gravitational Mastery: Semikhatov’s Cinematic Triumph | March 09 2026, 14:56

Semikhatov’s movie about gravity turned out to be really cool. Of course, it’s pitched at a popular level, but understandably so: they didn’t want to scare off the audience. It’s very cool and professionally made.

I have Semikhatov’s book on my shelf (“Everything That Moves”). It’s also popular science, but a bit more serious in its presentation, with formulas in places and plenty of illustrations. Later, my opinion of him soured slightly because of his particular way of hosting podcasts: constantly interrupting guests and answering his own questions in a way that pointedly upstages the guest. But in the movie, he looks absolutely great. I recommend it.

The link is in the first comment.

Exploring Redundancy in Toponymy: From European Rivers to the Hill of Hills | March 08 2026, 02:54

Reading Nabokov, I came across “…with the dash of the Danube in his veins…”. Turns out, the Danube is Дунай (Dunay) in Russian. But that’s trivial; the interesting thing is something else: Don, Danube, Dniester, Dnieper, Donets, Dvina, and Disna all mean more or less the same thing, “river”. Apparently, ancient people were not always rich in imagination when it came to toponymy. If you live by the water, you simply call it “River”. Over time, others came, heard this word, took it for a proper name, and altered it slightly to fit their accent. This way “River” (Danu) turned into a dozen different names across the map of Europe.

The river Volga is essentially also just “river”. Okay, slightly different: “Volga” comes from the Proto-Slavic *Vòlga, which literally means “moisture” or “water”.

Also, it turned out that the Sahara desert is so named because Sahara (الصحراء) means “desert” in Arabic. And the Gobi desert is called Gobi because gobi means “desert” in Mongolian.

While googling, I stumbled upon another fun thing. There’s a place in England, Torpenhow Hill. The name is composed of four different linguistic layers: Tor — in Old English “hill”, Pen — in Cumbric “hill”, How — in Old Norse “hill”, Hill — in modern English “hill”. Result: “Hill-hill-hill-hill”. Likely, each new people arriving in this area didn’t understand that Tor, Pen, and How were already names for the hill, and added their variant of the word “hill”.

Exploring the Mystical Connection Between π² and g in Defining a Meter | March 01 2026, 17:11

It turns out that π² ≈ g is not some mystical coincidence. When the first scientists contemplated the definition of the meter, there was one elegant proposal: to make the meter equal to the length of a pendulum that takes exactly one second to swing from one side to the other.

For a simple (mathematical) pendulum, the period of oscillation is given by the formula T = 2π √(L / g). If we take the length L = 1 meter and set the full period T = 2 seconds (so that each swing from one side to the other takes exactly one second), then g = 4π²L / T² = π² (m/s²).

The definition of the meter was later changed: it was tied to one ten-millionth of the distance from the equator to the North Pole along the meridian passing through Paris. However, this geodetic definition was inspired by the earlier idea with the pendulum. Notably, the two definitions agree to within about 1%. Essentially, since the old “pendulum” definition was the main candidate for a long time, values were adjusted so that the new meter was convenient and close to the measurements customary at that time.
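A quick numerical check of the pendulum story. This is a minimal sketch: the value 9.80665 m/s² is the conventional standard gravity, which I am plugging in as an assumption (real g varies with latitude), and the function names are mine.

```python
import math

STANDARD_G = 9.80665  # m/s^2, conventional standard gravity (assumed value)


def pendulum_length(period_s, g):
    """Length of a simple pendulum with full period T: L = g * (T / (2*pi))**2."""
    return g * (period_s / (2 * math.pi)) ** 2


def implied_g(length_m, period_s):
    """Gravity implied by a simple pendulum: g = 4 * pi**2 * L / T**2."""
    return 4 * math.pi ** 2 * length_m / period_s ** 2


# Length of the "seconds pendulum" (one second per swing, so T = 2 s).
seconds_pendulum = pendulum_length(2.0, STANDARD_G)
print(f"seconds pendulum: {seconds_pendulum:.4f} m")

# With L = 1 m and T = 2 s, the formula gives exactly pi squared.
print(implied_g(1.0, 2.0), math.pi ** 2)
```

The seconds pendulum comes out a bit over 99 cm, so the pendulum meter and the geodetic meter differ by well under 1%.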

It is also interesting that the number of seconds in a year is roughly π × 10⁷. Earth’s orbital speed is about v = 30 km/s, and the distance from the Sun to Earth is approximately r = 150,000,000 km. Over a year, Earth travels a path of about d = 2πr, so the orbital period is T = d / v = 2πr / v = π × 10⁷ seconds.
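And the same kind of sanity check for the year. A minimal sketch using the rounded figures from the text plus a 365.25-day year, which is my assumption for "a year".

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assuming a 365.25-day year


def orbital_period_s(radius_km, speed_km_s):
    """Time to go once around a circular orbit: T = 2 * pi * r / v."""
    return 2 * math.pi * radius_km / speed_km_s


estimate = orbital_period_s(150_000_000, 30)  # rounded r and v from the text
print(f"seconds in a year:   {SECONDS_PER_YEAR:.0f}")
print(f"2*pi*r / v estimate: {estimate:.0f}")
print(f"pi * 10^7:           {math.pi * 1e7:.0f}")
```

With these round numbers the estimate is exactly π × 10⁷, and the real year is only about half a percent longer.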