Exploring the Intriguing Origins of Words | April 09 2025, 03:51

Well, shall we continue with the fascinating etymology? I’ve been writing scripts for processing an etymological dictionary, and I’m finding all sorts of interesting stuff.

It turns out that the word “ciao” comes from the word “slave”. It derives from the Venetian expression s-ciào vostro or s-ciào su, which literally means “(I am) your slave”. The Venetian word for “slave” — s-ciào [ˈstʃao] or s-ciàvo — comes from the medieval Latin sclavus, which, in turn, was borrowed from medieval Greek Σκλάβος (“sklavos”), itself related to the ethnonym “Slavs”, as most of the slaves during that time came from the Balkans.

Also, it was a revelation to me that the words Kubernetes, governor, and cybernetics are etymologically related. They all derive from κυβερνήτης (kubernḗtēs), “helmsman, one who steers a ship”. Governor came through Latin and the Romance languages, cybernetics as a scientific loan through French, and Kubernetes as a direct borrowing from Ancient Greek, via Latin transliteration.

The words fuel and focus originate from the same Latin word focus (“hearth”). The geometric sense of focus was introduced by Johannes Kepler, who used it as a term for ellipses: “the point where rays converge”.

The words Madeira, mata, mater, matrix, matter, and mother are related and all trace back to the same Proto-Indo-European root *méh₂tēr — “mother”.

The words madam and madonna come from the Latin mea domina — “my lady”.

It’s hard to imagine, but the words merry (cheerful) and brief (short) originate from the same Proto-Indo-European root *mréǵʰus, which means “short”.

The words lobby and leaf also have a common origin: both stem from the ancient Germanic *laubą or its derivatives, related to foliage, leafy shelters, and coverings. In old buildings, a laubia/lobby was a covered gallery or arbor, literally a shelter made of leaves. Thus, “lobby” originally meant “leafy shelter” or “leafy arbor”.

Common origins or roots also link names like Yuri and George, Étienne and Stephen/Steven, William and Guillermo, Zeus and Jupiter, Zhenya and Yana, Joel and Elijah, Hansel and John, as well as Agnes, Nancy, and Inez, Diego and Jacob, Dorothy and Theodore, and Isabel, Elizabeth, and Lisa, Iskander and Alexander, Patroclus and Cleopatra. Many of these essentially denote the same thing, just modified differently across cultures.

Read more of such good stuff by clicking here –> #RaufLikesEtymology

Exploring Linguistic Connections with #RaufLikesEtymology | April 08 2025, 16:22

I continue with etymological curiosities. This is my third consecutive post, #RaufLikesEtymology. It all started when I stumbled upon an etymological dictionary and began processing it programmatically, extracting all sorts of things.

It turns out that the words “жёлтый” (“yellow”), “зелёный” (“green”), and “золото” (“gold”) share a common Indo-European root related to brightness and luster, reconstructed as *ǵʰelh₃- (the early Slavic form is given as *gьltъ), which in English, for instance, became the basis for both gold and yellow. The German “gelb” (yellow) comes from there too. In Russian, “жёлтый” has been attested since the 13th century as a nickname, and as an adjective in written sources only since the 14th century.

It turned out that “известь” (“lime”) and “асбест” (“asbestos”) come from the same word, the Greek ἄσβεστος.

It turns out that the words шифр (“cipher”), цифра (“digit”), and zero all come from the same word — the Arabic صِفْر (ṣifr, “nothing, zero”), which itself is a calque from Sanskrit शून्य (śūnya, “emptiness, nothing”).

Pushkin wrote in “Poltava”: “In the night’s darkness they, like thieves… // Craft the ciphers of universals…” In the Ukrainian usage of those days, the Hetman’s edicts were called “universals”, and “цифр” back then meant what we now call a cipher: “secret writing”.

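Interestingly, the word “кантон” has two unrelated sources: the Swiss кантон (Switzerland consists of 26 cantons) goes back through French canton to a word for “corner, district”, while Кантон as the name of the Chinese city originates from Chinese, from Guangdong.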

It turned out that grotto and crypt come from the same word, Latin grupta/crypta. And about Saturday (суббота) and sabbath, well, everyone knows that they are one word by origin.

The Russian word “колесо” (wheel) and the Indian “чакра” (chakra) are linked by origin: both come from the same Proto-Indo-European root *kʷékʷlos, “circle”, “wheel”, “rotating”. “Колесо” came through the Slavic branch, while “чакра” came through the Indian (Vedic-Sanskrit) branch.

The words cloak and clock both derive from medieval Latin clocca, “bell”, but entered English by different routes. Cloak arrived in the 13th century through French cloque, which meant both “cloak” and “bell”, because of the shape of the garment. Clock appeared later through Dutch clocke, denoting a church bell that marks the time; subsequently, it came to mean a timepiece. The word bell already existed in English as a designation for a metallic ringing object, so there was no need to introduce another word for it.

The apricot has had a very interesting journey. Here, look at the attached picture. The Russian word абрикос was borrowed in the early 18th century from Dutch, which itself had borrowed it from the Romance languages (for example, French abricot). It’s interesting to trace this word further: it turns out that French took it from Arabic, and Arabic from Latin. Latin praecox meant “early-ripening”. Thus, praecox became abricot.

My little script churned out about 2 thousand examples from Wiktionary. I pick the most interesting ones, but I think there’s enough material for about five more posts like this 🙂 Plus, I have more ideas on how to process the data to uncover even more interesting things.

Exploring Words with Distant Meanings Through Their Common Roots | April 07 2025, 16:32

I wrote a script that finds pairs of words that share a common origin but have diverged significantly in modern meaning.

I actually came up with this project an hour and a half ago. Between meetings, I threw something together using Python and ChatGPT, and here are the first results. Importantly, these results come not from ChatGPT, but from the script working with dictionaries.

For example, grammar – glamour. The word glamour originates from the Scottish pronunciation of the word grammar (meaning “knowledge,” especially magical). The early association of grammar with secret knowledge transformed into “glamour” as “magical enchantment.”

It turns out that Jack is a diminutive form of John, evolved through Jankin.

It turns out that espresso and sprain share a common root—the Latin exprimere, meaning “to press out, extract.”

Debut and butt share a common root: Old French but, “goal.” Debut comes from French débuter, “to start a game,” literally “to make the first strike at the goal.” Butt, in the sense of “target” (e.g. the butt of a joke), also comes from but, “goal, target.”

Technical details: What does the script do?

1. First, it downloads a large amount of data from the English Wiktionary (Kaikki) and FastText, a word-embedding model that knows the “meaning” of words in the form of vectors.

2. Then it analyzes the etymology (origin) of words, finding their common “ancestors”—ancient words (etymons) from which the modern ones derive.

3. It then selects only those words that are full dictionary entries in Wiktionary and are commonly found in modern English (filtering out very rare or archaic words).

4. Then it measures the “distance” between meanings using word vectors (word embeddings) from FastText. By comparing these vectors, the script calculates how far the meanings of words with a common root have diverged. Low similarity in vectors indicates a significant difference in meaning.

5. Finally, it finds “distant relatives”: it searches for and displays pairs of common words that were once “relatives” but whose meanings today are as distant from each other as possible.
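
The core of steps 2, 4, and 5 can be sketched in a few lines of Python. This is only an illustration, not the actual script: the etymon labels, the tiny three-dimensional vectors, the function names, and the 0.5 threshold are all made up for the example, whereas the real pipeline would read etymologies from the Wiktionary dump and take vectors from FastText.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: word -> set of ancestor words (etymons).
ETYMONS = {
    "espresso": {"la:exprimere"},
    "sprain": {"la:exprimere"},
    "express": {"la:exprimere"},
    "debut": {"fro:but"},
    "butt": {"fro:but"},
}

# Made-up 3-dimensional vectors standing in for real word embeddings.
VECTORS = {
    "espresso": [0.9, 0.1, 0.0],
    "sprain": [0.0, 0.2, 0.9],
    "express": [0.7, 0.6, 0.1],
    "debut": [0.1, 0.9, 0.2],
    "butt": [0.8, 0.0, 0.5],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def distant_relatives(etymons, vectors, max_similarity=0.5):
    """Return (similarity, word1, word2) for pairs that share an etymon
    but whose vectors are dissimilar, most distant pairs first."""
    pairs = []
    for w1, w2 in combinations(sorted(etymons), 2):
        if etymons[w1] & etymons[w2]:  # at least one common ancestor
            sim = cosine_similarity(vectors[w1], vectors[w2])
            if sim <= max_similarity:
                pairs.append((sim, w1, w2))
    return sorted(pairs)


if __name__ == "__main__":
    for sim, w1, w2 in distant_relatives(ETYMONS, VECTORS):
        print(f"{w1} / {w2}: similarity {sim:.2f}")
```

With the toy data, espresso/express is filtered out as too similar, while espresso/sprain comes out as the most distant pair. Comparing every pair is quadratic in the number of words, so on a full dictionary it would make sense to compare only words inside each etymon group.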

The script still generates quite a lot of “noise,” but I have a clear idea of how to clean it up.

The Unsung Contributors of Early Microsoft: The Lives of Monte Davidoff and Bob O’Rear | April 05 2025, 16:22

It’s intriguing how differently people’s destinies unfold. Gates’ blog has published the source code for the original Altair Basic. Besides the well-known Gates (worth >$100 billion) and Allen (who has passed away, but was worth around $20 billion), the name Monte Davidoff also appears there, and very little is known about him.

Monte wrote all the floating-point “mathematics” for Microsoft Basic. That code lasted only until version 4.0; about a decade later the IEEE 754 standard came along, and things changed slightly.

Since 2000, he has owned his consulting company, and its website (built in PHP) seems not to have changed since 2000 (though he did update the year to 2025 in the footer).

There are no photos of him online and almost no information about what he does, but there are two interviews: one in text, and another as a podcast on Floppy Days. Apparently, he just quietly “tends to his own stove”.

Among the employees of the first Microsoft team (remember the iconic photo?) there is Bob O’Rear, who held the position of chief mathematician. He played a key role in developing MS-DOS for the IBM PC. O’Rear left the company in 1993 and returned to Texas, where he took up cattle ranching on his own farm.

Global Names for the Same Melody | April 05 2025, 14:01

To my surprise, I discovered that our “Dog Waltz” is widely referred to here as “Shave and a haircut,” although in reality, Shave and a haircut is very well known as “knock! knockity-knock-knock… KNOCK-KNOCK!”.

I started digging. In Germany, Belgium, the Netherlands, and Norway, it’s known as the “Flea Waltz” (Flohwalzer). In Bulgaria, it’s called “Cat March” (Bulg. Котешки марш), in Finland — “Cat Polka” (Fin. Kissanpolkka), in Korea — “Cat Dance” (Kor. 고양이 춤 Koyangi Chum), in Japan — “I Stepped on a Cat” (Jpn. 猫踏んじゃった Neko-funjatta), in Mexico — “Little Monkeys” (Spa. Los Changuitos), in Hungary — “Donkey March” (Hun. Szamárinduló), in Majorca — “Polka of Fools” (Spa. Polca de los Tontos), in China — “March of Thieves” (Chi. simpl. 小偷进行曲, pinyin. Xiǎotōu jìnxíngqǔ), in Spain — “The Chocolate Pot” (Spa. La Chocolatera), in France and Poland — “Cutlets (Chops)” (Fr. Côtelettes, Pol. Kotlety), in Switzerland — “Cutlet Waltz” (Ger. Kotelett-Walzer), in Denmark — “Meatballs Escape Over the Fence” (Dan. Frikadellens flugt over plankeværket), in Sweden — “Kalle Johansson” (Swe. Kalle Johansson), and so forth.

The piece is in 4/4 time, by the way. So it is something like a polka or galop. However, in the movie “Gentlemen of Fortune,” it is just the triple meter version found here and here.

Revolutionary Surface Scanning Device Transforms Text and Texture into 3D Images | March 31 2025, 14:53

I’ve devised a new device that might become part of a future phone, or before that, a niche industrial and scientific tool. It works like this: you place it on any surface, say a paper with text, move it like a mouse, and end up with a 3D scan of the surface displayed on your screen. If there’s text, for example, it can be recognized, even if it’s inside an envelope. However, there probably are better industrial applications for such a device.

Technically: it uses a high-frequency ultrasonic sensor array (100–300 MHz) capable of distinguishing paper microreliefs and ink with up to 20-micron resolution—similar to what’s currently done in fingerprint scanners. A typical Qualcomm 3D Sonic Gen 2 piezo scanner measures 8×8 mm. The sensors have a resolution of up to 500 dpi. Motion data is collected from an IMU and an optical encoder (like in a mouse), to accurately stitch scan fragments into a unified image. It will work in darkness, with poor contrast, on semi-transparent paper, with zero dependence on lighting. It can detect hidden writings, fingerprints, or cleaned areas. Essentially, it will perform an in-depth analysis, down to detecting traces of pencil pressure.
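
To make the numbers and the stitching idea concrete, here is a rough Python sketch. It is only an illustration under simplifying assumptions: the function names are invented, the offsets are taken to be already converted from IMU/encoder readings into whole pixels, and overlapping samples are simply averaged. A real device would also need sub-pixel alignment and drift correction.

```python
UM_PER_INCH = 25400.0  # microns in one inch


def pixel_pitch_um(dpi):
    """Spacing between neighbouring samples, in microns, at a given dpi."""
    return UM_PER_INCH / dpi


def pixels_per_side(sensor_mm, dpi):
    """Approximate number of samples along one side of a square sensor."""
    return round(sensor_mm * 1000.0 / UM_PER_INCH * dpi)


def stitch(tiles, canvas_w, canvas_h):
    """Place scan fragments on a shared canvas and average the overlaps.

    tiles: list of (x_offset, y_offset, tile), where tile is a 2D list of
    height samples and the offsets (in pixels) come from the motion data.
    Cells that no fragment covered stay None.
    """
    total = [[0.0] * canvas_w for _ in range(canvas_h)]
    count = [[0] * canvas_w for _ in range(canvas_h)]
    for x0, y0, tile in tiles:
        for dy, row in enumerate(tile):
            for dx, value in enumerate(row):
                x, y = x0 + dx, y0 + dy
                if 0 <= x < canvas_w and 0 <= y < canvas_h:
                    total[y][x] += value
                    count[y][x] += 1
    return [
        [total[y][x] / count[y][x] if count[y][x] else None
         for x in range(canvas_w)]
        for y in range(canvas_h)
    ]


if __name__ == "__main__":
    print(pixel_pitch_um(500))       # sample spacing at 500 dpi
    print(pixels_per_side(8, 500))   # samples per side of an 8 mm sensor
    print(UM_PER_INCH / 20)          # dpi needed for a 20-micron pitch
```

By this arithmetic, 500 dpi corresponds to a sample spacing of about 51 microns, so an 8×8 mm sensor gives roughly 157×157 samples per frame, and a 20-micron pitch would need about 1270 dpi. The averaging in `stitch` is the simplest possible way to merge overlaps, chosen here only to keep the sketch short.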

Mysteries of Fungi: From House Invaders to Mind Controllers | March 30 2025, 13:45

A very substantive, wide-ranging, and captivating episode with Vishnevsky about mushrooms.

Three stories to whet your appetite. The first one is about the house fungus (Serpula lacrymans). It usually starts with a shed, a bathhouse, bridges, or a foundation, especially if it’s partially over water. The house fungus releases tough black mycelial cords (1-2 mm), which spread throughout the house within just a few days. Across the floors, walls, and ceilings: it’s like something out of a sinister sci-fi movie. These cords reach any source of wood. The fungus begins to break down lignin and other components of the wood, and one of the by-products of this process is water. That is, the fungus only needs water at the beginning, and then, once it finds wood, it extracts water on its own, feeding and hydrating itself. Therefore, it is practically impossible to get rid of it. It is tenacious, fast-growing, and extremely destructive. It is capable of turning up to 50% of the wood volume it settles on to dust within a year. That’s why sleepers and footbridges at stations are made not from wood, but from concrete, even where wood is cheaper and despite the fact that wooden sleepers are superior in other properties to concrete ones.

The second story is about “witch’s circles” (fairy rings). Surely you’ve noticed that mushrooms often grow in rings on lawns or at the edges of forests, sometimes tens of meters in diameter. It turns out that the mycelium spreads out from the point where it originated and turns into a “donut”: the inner part dies off because it has already consumed everything there, while the outer edge keeps expanding because there is still food ahead of it. The mushrooms (the fruiting bodies) grow along this donut. Since the rate of spread is more or less the same in every direction, the result looks like a perfect circle, unless, of course, it runs into something along the way.

The third is about cordyceps, which infects simple crawling organisms and controls them. Apart from being an interesting fungus on its own, the most expensive mushroom in the world is also a cordyceps (the Chinese variety). But now, about the one that parasitizes ants—you’ve probably heard of it.

It all starts with the fungus penetrating an ant’s body and gradually taking control over its nervous system. When the time comes, cordyceps “tells” the ant that it is time to leave its native anthill. If it resists, the fungus employs chemistry: it not only biochemically influences the behavior, but literally “owns” the ant. Moreover, it does so not bluntly, but very intricately, with precision down to the details.

It entwines the muscles and nerve nodes, blocking any alternate movement. The ant begins to move along a specific trajectory—it climbs a plant, selects a suitable leaf, often one that hangs right above the anthill. It climbs to the underside of the leaf to prevent the sun from drying out its body and the future fungus. Then it moves strictly along the central vein of the leaf—as if along a highway.

When it reaches the middle of this vein, the fungus gives two last commands: 1) clench the vein with its limbs as tightly as possible and 2) bite through the vein with its jaws, securing itself definitively.

After this comes rapid mycelial growth, and the ant dies. From its head, now hanging downwards, the fruiting body of the fungus begins to sprout: a thin “needle” directed straight down over the anthill. When it matures, spores start to pour out of it, like from a shower, directly onto the ants passing below. Everything is calculated perfectly.

Scientists have spent decades trying to understand the “combat chemistry” of Cordyceps. It seemed something incredibly complex must be at work. But it turns out to be the opposite. Everything is simple: relatively primitive hydrocarbons are at work, structurally very similar to… gasoline.

If you take, for example, a bucket of gasoline, come to a forest anthill (especially a large one of red forest ants), stir it up a bit—you will see how the ants start to massively leave the dwelling, climb up the tree, cling to the bark, freezing in strange poses. Then they are released. But with Cordyceps, it’s the same, just with an additive: its hydrocarbons are slightly more complex, and “releasing” is no longer possible.

This is the bug in the ant’s firmware. It’s not some kind of remote control, not a command center. Just a chemical, and the ant “knows” what to do. These aren’t random actions, but strictly defined reactions programmed into it. Exposed to certain substances, it behaves in a strictly defined way.

I recommend listening to it; Vishnevsky is very good on this topic, and the topic seems inexhaustible.

https://www.youtube.com/watch?v=ulQyUHsBaa4

Soviet Space Satire: Rescue at Mars and Beyond | March 28 2025, 01:14

I finally got around to a Soviet movie from 1959 showing a rocket landing on a floating platform at the end. The film is quite amusing. It features valiant Soviet cosmonauts rescuing hapless and vile American astronauts who got lost on their way to Mars. By the way, the cosmonauts are dressed in jackets and ties.

The plot goes like this. A two-man crew, under the mandate of science and the communist party, is sent to Mars for strictly scientific purposes. In orbit, the “space shuttle” docks at the station (at the beginning, the chief developer says it hangs above the Earth at tens of thousands of kilometers) to prepare for the “final jump” to Mars. Suddenly, a request comes from the American colleagues to accept the “Typhoon” shuttle at the station. Could our most humane and friendly cosmonauts deny their colleagues, even if they are damned capitalists?

During a friendly banquet, the “dumb Yankee”, apparently having had one too many, blurts out the goals of his project. Much to the surprise of the gracious hosts, who did not expect such audacity from their guests, it turns out the goal is Mars, of course, but purely for commercial, acquisitive reasons, such as trading Martian plots. The head of the Soviet expedition, obviously caught off-guard… and also having had one too many, responds by admitting similar plans, but exclusively in the name of science. The crafty Yankee, after taking some Alka-Seltzer, rats them out to his leadership. The American leadership, driven by predatory bourgeois interests, orders an immediate start to Mars, despite the unfavorable astrophysical weather conditions, thereby endangering the most valuable thing – the lives of cosmonauts. Covertly, “under the cover of night”, while the hosts are knocked out, the treacherous Americans weigh anchor.

Consequences soon follow: they run out of fuel and are blown towards the Sun, with the expected outcome. SOS! The foolish “Yankee” frantically signals, bathed in snot and tears. Calm and strong Soviet guys in their powerful rocket “Rodina” rush to the rescue and indeed tow the doomed spacecraft, but precious fuel is spent maneuvering, so the Americans abandon their junk and transfer to “Rodina”. There’s Mars, its seas and canals already visible, but they are catastrophically short on fuel.

Fortunately, an asteroid named Icarus is passing by, and our brave cosmonauts hitch a ride on it. An emergency launch of a cargo spacecraft with fuel follows, but it crashes on approach. It is decided to send another piloted ship, because what’s most valuable is human life and friendship. This time all goes well: the rescued crew lands directly on the floating platform near Yalta, anticipating the pathetic plagiarism with “Falcon”. A crowd with flowers and red banners and pioneers in red scarves warmly welcome the international comical collective (I could not have written that, it’s all pasha_popolam).

Three years later, this propaganda flick caught attention in the USA and was re-edited under the name “Battle Beyond the Sun”. The team was producer Roger Corman, assistant producer Jack Hill, and a young student, Francis Coppola – that’s the kind of films he grew up on! The budding director re-edited and redubbed the film, removing all “anti-American propaganda” and Cyrillic inscriptions, and filmed an additional scene of a battle between two Martian monsters – how could he not. The timeline in the film was shifted to the future, after Earth had suffered a nuclear conflict and was divided into two superpowers, “Northern Hemis” and “Southern Hemis”, located on their respective hemispheres. Coppola shot the monster scenes, with one monster symbolizing a phallus and the other a vagina, in a Hollywood studio and inserted them into the Soviet material. Coppola and Hill also filmed scenes from the Rose Parade in Pasadena.

The names of not only Soviet characters but also actors, as well as names in the credits were changed to American ones to mask the film’s origins. For example, Alexander Shvorin and Ivan Pereverzev became “Andy Stuart” and “Edd Perry”, and the directors Mikhail Karyukov and Alexander Kozyr became “Maurice Kaplan” and “Arthur Corwin” – and were demoted to assistant directors. The director of the film in promotional materials and the final version is listed as a certain Thomas Colchart; sources differ on who actually hides behind this name (Karyukov, Kozyr, Coppola, or an American dubbing director).

The entire episode from “The Heavens Call” about the flight from Earth to the orbital station with minimal changes was included in Stanley Kubrick’s “2001: A Space Odyssey”. Kubrick’s film also included a scene with a video phone call to Earth. The orbital station in Kubrick’s film was copied almost exactly from “The Heavens Call”.

Separately amusing: the USSR named the American spacecraft Typhoon (Тайфун). In the USA such a storm is called a hurricane, while “typhoon” is the name for hurricanes that occur around Japan, so understandably in 1959 maybe one American out of a hundred knew the word 😉

Links to the original and the pale American copy — in the comments

Navigating Job Interviews in the LLM and ML Industry | March 22 2025, 14:05

Mimansa Jaiswal shared her experience of interviewing for researcher/engineer positions in the LLM/machine learning (ML) field last fall. After more than 200 applications, 100 interviews, numerous rejections, and several offers, she decided to outline the entire process, as well as the resources she used. It’s extremely useful material, especially for those looking for a job in this field.

Link in the comments.

Summary (TLDR):

Startups:

Interview processes are unique and depend on the company’s development stage. Candidates may face 5–6 stages, including programming tasks (often from Leetcode), ML coding, testing fundamental ML knowledge, and cultural fit interviews. Startups may also require face-to-face interviews, multi-day work assignments, or extensive presentations. Processes are less standardized, and roles often include a wide range of responsibilities.

“Unicorns” (e.g., Anthropic, OpenAI, Scale AI):

More structured processes, but still vary from company to company. Candidates face interviews on programming (not always Leetcode-based), ML design, discussions related to LLM, and presentations. The number of stages can be substantial, especially when applying to multiple teams simultaneously.

Large tech companies (e.g., Meta, Amazon, Apple, Google, Microsoft):

Rigid and structured processes, often lasting from 1.5 to 2.5 months. Expect Leetcode-style interviews, ML system design, LLM research design, presentations, and behavioral interviews. Questions can be both general and role-specific.

Main interview components:

Programming tasks: knowledge of data structures and algorithms is tested, so practice on Leetcode is necessary.

ML system design: evaluates understanding of system architecture and ability to develop solutions.

Presentations: candidates may present their previous work or research, demonstrating professionalism and communication skills.

Behavioral interviews: assess compatibility with corporate culture and approach to problem-solving.

Key differences by company type:

Startups are less predictable and may prefer candidates ready to take on diverse tasks. “Unicorns” look for specialists with narrow and current skills. Large tech companies adhere to formalized multi-stage processes and assess a broad spectrum of technical and soft skills. Each type of company has its unique demands and offers different opportunities, so it’s crucial to tailor preparation to the specific format.

Expected timelines:

The process can take from several weeks to several months, with possible delays during holidays or peak hiring seasons. Offers often require a quick response—usually within 7 days—requiring the ability to make swift decisions or negotiate a delay. It’s important to strategically plan overlapping processes and manage multiple timelines simultaneously.