Navigating the Lexical Complexity of Nabokov’s “Lolita” | April 02 2026, 15:56

I’ve finished the first version of a dictionary-style book on Nabokov’s “Lolita”. The chart shows how vocabulary complexity is distributed across the pages of the book. The lower chart is averaged over a window of 25 sentences: the vertical axis shows the number of complex words, and the colors indicate their complexity/rarity (purple for the most complex, red for less complex, yellow for even less so). I have already removed two levels, though overall, for a foreigner, all five levels are challenging. In the book, level 3 is marked with a dashed line, level 4 with a simple frame, and level 5 with a double frame. Currently, there are 5794 words, of which 541 are fifth level, 1070 are fourth, 1883 are third, 1393 are second, and 54 are first (the simplest ones). Since the first version ended up being 1148 pages, the dictionary will need to be cut down significantly by removing whatever can be dispensed with. This mainly applies to the first and second levels, plus some words from the third and fourth. The rarity of words is estimated in three ways: with an LLM, and with two word-frequency lists for the English-language corpus (300K words).
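A minimal sketch of how three such rarity signals could be merged into a single level from 1 to 5. The rank cut-offs, the median rule, and the function names are all assumptions made for illustration; the post does not say how the three estimates are actually combined.

```python
from statistics import median

# Hypothetical rank cut-offs: each cut-off a word's rank exceeds
# raises its level by one (1 = simplest, 5 = rarest).
RANK_CUTOFFS = (5_000, 15_000, 40_000, 100_000)


def level_from_rank(rank):
    """Map a frequency rank (1 = most common word) to a level from 1 to 5.

    A word missing from the list (rank is None) is treated as maximally rare.
    """
    if rank is None:
        return 5
    return 1 + sum(rank > cutoff for cutoff in RANK_CUTOFFS)


def combined_level(llm_level, rank_a, rank_b):
    """Combine one LLM-assigned level with ranks from two frequency lists.

    The median is used so that a single outlier source cannot decide alone.
    """
    votes = [llm_level, level_from_rank(rank_a), level_from_rank(rank_b)]
    return int(median(votes))


if __name__ == "__main__":
    # Made-up ranks, for illustration only.
    print(combined_level(llm_level=3, rank_a=22_000, rank_b=60_000))
```

With a median, a word that the LLM considers rare but both frequency lists consider common stays at a low level, which is one possible way to damp disagreement between the sources.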

Not all words are complex. For instance, take the sentence “With the ebb of lust, an ashen sense of awfulness, abetted by the realistic drabness of a gray neuralgic day, crept over me and hummed within my temples.” Someone well acquainted with English might not know the words ebb, abet, and drabness, while everything else is familiar. But if I lower the requirements for the reader, the dictionary might not be very useful in such cases.

Or consider the sentence:

Homo pollex of science, with all its many sub-species and forms; the modest soldier, spic and span, quietly waiting, quietly conscious of khaki’s viatric appeal; the schoolboy wishing to go two blocks; the killer wishing to go two thousand miles; the mysterious, nervous, elderly gent, with brand-new suitcase and clipped mustache; a trio of optimistic Mexicans; the college student displaying the grime of vacational outdoor work as proudly as the name of the famous college arching across the front of his sweatshirt; the desperate lady whose battery has just died on her; the clean-cut, glossy-haired, shifty-eyed, white-faced young beasts in loud shirts and coats, vigorously, almost priapically thrusting out tense thumbs to tempt lone women or sadsack salesmen with fancy cravings.

My browser even highlights four words here.

I have definitions of the words in English, German, French, and Russian. I’ve run into a problem: which words from the text count as complex differs from language to language, yet I keep them in one unified list. So I’ll have to mark, for example, French words in the English text separately, so that they are not included in the French version, since that reader already knows what quel mot means.

Overall, this weekend I’ll be manually removing about half, and then I can make the cover and list it on Amazon.

When the Night Lit Up: Unraveling the Mystery of a Superbolt Storm | March 21 2026, 12:55

We had a thunderstorm last night. The whole county is buzzing because everyone thinks that something exploded just before midnight. Several posts in a row on social media. In short, it was thunder, just rarer than usual. It was caused by a 401 kA lightning strike, dubbed the Wild House Shaker. A typical lightning strike is 30 kA, so this one was about 13 times stronger. If the numbers are to be believed, 401 kA is really a damn lot. They will likely say we haven’t had such lightning here for decades.

Attaching an interesting map.

The points on the map show superbolts: lightning strikes with an energy of at least 1 MJ. Red points are particularly powerful superbolts with an energy of more than 2 MJ. As the map shows, superbolts mostly occur in the northeastern Atlantic and in the Mediterranean Sea, and less frequently in the Andes, off the coast of Japan, and near South Africa.

This is what the page from which I took the map says (translation):

New work shows that superbolts most often occur over the Mediterranean Sea, the northeastern Atlantic, and over the Andes, as well as in smaller amounts to the east of Japan, in tropical oceans, and near the southern tip of Africa. Unlike regular lightning, superbolts often strike over water.

“Ninety percent of lightning occurs over land,” said Holzworth (that’s the main guy on lightning at the University of Washington).

“But superbolts mostly arise over water, right up to the coastline. For example, in the northeastern Atlantic, the distribution maps of superbolts clearly show the outlines of the coasts of Spain and England.”

“The average energy of a discharge over water is higher than over land—that we knew,” he said. “But we did not expect such a stark difference.”

The season for superbolts also does not match the usual patterns of lightning. Regular lightning most often occurs in the summer—the three main so-called “lightning chimneys” coincide with summer thunderstorms over America, Africa south of the Sahara, and Southeast Asia. However, superbolts, which are more common in the Northern Hemisphere, occur in both hemispheres from November to February.

The reason for such a distribution remains a mystery. In some years, there are significantly more superbolts than in others: the end of 2013 was record-breaking, and the end of 2014 was the second largest, while in other years such events were much less frequent.

“We speculate that this may be related to sunspots or cosmic rays, but we will leave that for future research,” said Holzworth.

“For now, we are just demonstrating that there is a previously unknown pattern.”

Smartfolio.me: Revolutionizing Knowledge Organization with Advanced Features | March 19 2026, 04:01

My creation – the knowledge organization tool Smartfolio.me – has gained new features. I’m attaching a five-minute video overview.

It’s like Google Docs, but you can embed documents within each other, creating a network of connected knowledge, and these documents can be PDFs and regular texts.

Upload a PDF, and the program converts it into images; you can then highlight any section right on the pages to leave a comment or ask a question.

If something in the text is unclear, you highlight the area and press “elaborate” — the LLM will detail everything thoroughly, taking into account the context of the entire document, and the explanation will stay linked to the highlighted fragment.

You can simply cut out a piece from a PDF, and the LLM extracts clean text or a ready-made formula from it.

In the PDF window, there is now a small panel — all comments and explanations are immediately visible there, so you can quickly jump to the necessary parts.

You can cut out a diagram or graph from a PDF, copy it as a picture, and paste it into your text. It is cropped on the fly and saved in the database not as a copy but as a link to the page together with the crop parameters.
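To illustrate the idea of storing a crop as a reference rather than as a copied image, here is a minimal sketch. The class name, the field names, and the pixel-based coordinates are hypothetical; the post does not describe Smartfolio’s actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropRef:
    """A pointer to a rectangular region of a stored page image.

    All names here are hypothetical, chosen only for illustration.
    """
    page_id: str
    x: int        # left edge, in pixels
    y: int        # top edge, in pixels
    width: int
    height: int

    def box(self, page_width, page_height):
        """Return (left, top, right, bottom), clamped to the page bounds."""
        left = max(0, min(self.x, page_width))
        top = max(0, min(self.y, page_height))
        right = max(left, min(self.x + self.width, page_width))
        bottom = max(top, min(self.y + self.height, page_height))
        return (left, top, right, bottom)


if __name__ == "__main__":
    ref = CropRef(page_id="doc42/page-3", x=100, y=200, width=300, height=150)
    print(ref.box(page_width=1000, page_height=1400))
```

Because only the page identifier and four numbers are stored, the same page image can back any number of crops, and the crop is computed at display time.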

If you delete the page link in the text, it won’t disappear completely but will go into a special list, from where you can reattach it somewhere else or delete it permanently. The same document can be inserted in several places. If you add a comment to it, the comment appears everywhere the document is linked.

Mathematics is fully supported — LaTeX formulas can be not only viewed but also clicked to adjust them in the editor.

You can generate formulas by description. Just write in words what formula you need (for example, “binomial distribution”), and the system itself outputs the ready formula code.

Now there is a system of plugins – essentially isolated experimental functions separate from the main program. For instance, there is a plugin that recursively collects all subpages into one long document — convenient if you need to read or print everything at once.
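The subpage-collecting plugin can be pictured as a depth-first walk over the page tree. This is a sketch under assumed data shapes: the `pages` dictionary, its `title`, `text`, and `children` keys, and both function names are made up for illustration and are not the plugin’s real interface.

```python
def flatten(page_id, pages, depth=0, seen=None):
    """Yield (depth, title, text) for a page and all of its subpages.

    `seen` guards against cycles, since a document may be linked
    from several places (including from its own descendants).
    """
    if seen is None:
        seen = set()
    if page_id in seen:
        return
    seen.add(page_id)
    page = pages[page_id]
    yield depth, page["title"], page["text"]
    for child_id in page.get("children", []):
        yield from flatten(child_id, pages, depth + 1, seen)


def to_document(root_id, pages):
    """Join the flattened pages into one long plain-text document."""
    return "\n\n".join(
        f"{title}\n{text}" for _depth, title, text in flatten(root_id, pages)
    )


if __name__ == "__main__":
    demo = {
        "root": {"title": "Root", "text": "Intro.", "children": ["a", "b"]},
        "a": {"title": "A", "text": "First.", "children": ["root"]},  # cycle
        "b": {"title": "B", "text": "Second."},
    }
    print(to_document("root", demo))
```

One consequence of the `seen` set is that a document linked in several places is emitted only once; a real plugin might prefer to repeat it.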

Or consider the “YouTube Transcript Cleaning” plugin. If you have a messy lecture transcript from YouTube, the plugin will add punctuation, split it into paragraphs, and create neat headings.

If you insert a link to a website, it opens in a column next to your text, so you can read the source and take notes at the same time. However, some websites do not allow embedding in third-party pages. The system recognizes such sites and opens them in a new tab instead.
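The post does not say how such sites are recognized. One common mechanism by which a site forbids framing is the `X-Frame-Options` response header or the `frame-ancestors` directive of `Content-Security-Policy`, so a check could look roughly like the sketch below. The function name is hypothetical, and the rule is deliberately conservative: any restriction is treated as “cannot embed”.

```python
def blocks_embedding(headers):
    """Guess from HTTP response headers whether a page refuses to be framed.

    Assumption: any restriction is treated as a refusal, even one that
    might happen to allow our own origin.
    """
    normalized = {name.lower(): value.lower() for name, value in headers.items()}

    # Legacy header: DENY and SAMEORIGIN both rule out third-party frames.
    if normalized.get("x-frame-options", "").strip() in ("deny", "sameorigin"):
        return True

    # Modern header: look for a frame-ancestors directive in the CSP.
    csp = normalized.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0] == "frame-ancestors":
            # Only a bare wildcard leaves embedding open to everyone.
            return parts[1:] != ["*"]
    return False


if __name__ == "__main__":
    print(blocks_embedding({"X-Frame-Options": "DENY"}))
    print(blocks_embedding({"Content-Type": "text/html"}))
```

A caller would fetch the headers first (for example with a HEAD request) and open the link in a new tab whenever the function returns True.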

The left panel with the list of pages can be hidden or resized with the mouse, so it doesn’t take up space on the screen.

You can simply copy and paste an image or screenshot, and it is not just inserted but also uploaded to the database.

It supports working from a mobile phone. On the phone, the interface switches to a single-column mode for convenient reading and commenting on the go.

Multiple databases are supported: you can connect different databases and different LLMs and switch between them.

Crafting Nabokov’s Dictionary: A Multilingual Lexical Journey | March 15 2026, 18:30

I’m reading Nabokov and decided to take a break to create a convenient app, “Nabokov’s Dictionary”, which I am also considering selling on Amazon as a book. Essentially, it looks like this (see screenshot): definitions of complex words in English, Russian, German, and French, in the same order in which they appear in the original book.

Would you buy such a book?

To make the definitions accurate, I also wrote an aligner: a program that matches sentences and paragraphs in English with their counterparts in the Russian translation (Nabokov’s own). When a word’s definition is created, it draws not only on the LLM’s knowledge but also on the author’s Russian translation. How the algorithm works deserves a separate discussion (I invented it myself because nothing I found online worked the way I needed). It first takes the longest sentences and matches each with its pair by cosine similarity of embedding vectors produced by the multilingual e5 model. These sentences become anchors. Then, assuming that errors are almost ruled out for long sentences, the longest sentence between two anchors is found, and everything repeats recursively. There are many situations where a sentence in Russian has no equivalent in English and vice versa, where a sentence is split into two, or where two are merged into one. The algorithm handles these as best it can. The resulting alignment is quite good, to the point that errors can hardly be found (though some are likely still there). Either way, the alignment is only needed as context for translating words, so rare errors are not a big deal.
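A simplified sketch of the recursive anchoring idea described above, not the actual aligner. To stay self-contained it swaps the multilingual e5 embeddings for a toy character-bigram vector (`toy_embed`), which only works when both sides share an alphabet, and it ignores split and merged sentences. The `min_sim` threshold and all function names are assumptions.

```python
import math
from collections import Counter


def toy_embed(sentence):
    """Stand-in for a real multilingual encoder: character-bigram counts."""
    text = sentence.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def align(src, tgt, embed=toy_embed, min_sim=0.3):
    """Return sorted (src_index, tgt_index) pairs found by recursive anchoring."""
    src_vecs = [embed(s) for s in src]
    tgt_vecs = [embed(t) for t in tgt]
    pairs = []

    def recurse(s0, s1, t0, t1):
        if s0 >= s1 or t0 >= t1:
            return
        # The longest source sentence in this segment is the anchor candidate.
        i = max(range(s0, s1), key=lambda k: len(src[k]))
        # Its partner is the most similar target sentence in the segment.
        j = max(range(t0, t1), key=lambda k: cosine(src_vecs[i], tgt_vecs[k]))
        if cosine(src_vecs[i], tgt_vecs[j]) < min_sim:
            return  # no trustworthy match: leave this segment unaligned
        pairs.append((i, j))
        recurse(s0, i, t0, j)          # everything before the anchor
        recurse(i + 1, s1, j + 1, t1)  # everything after the anchor

    recurse(0, len(src), 0, len(tgt))
    return sorted(pairs)


if __name__ == "__main__":
    english = ["It rained.",
               "The old lighthouse keeper climbed the stairs slowly.",
               "He lit the lamp."]
    # Fake "translation": upper-cased copies plus one sentence with no pair.
    other = ["IT RAINED.",
             "THE OLD LIGHTHOUSE KEEPER CLIMBED THE STAIRS SLOWLY.",
             "NOBODY SAW HIM.",
             "HE LIT THE LAMP."]
    print(align(english, other))
```

Each anchor splits the problem into two smaller ones, so a sentence can only be paired with a candidate that lies between the same two anchors on the other side. That constraint is what keeps short, ambiguous sentences from being matched far away from their true position.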


Seeking Alpha Testers for a Revolutionary Text and PDF Management Tool | March 03 2026, 03:02

Looking for alpha-testers. As part of R&D and for my own tasks, I wrote a productivity tool (I actually wrote about this in my last post, but Facebook said that because I put a link in the post, only 12% saw it). Now I want to check if it will be useful to anyone else. If the idea resonates with you — let me know, and I will share access.

Website smartfolio dot me. What’s the main idea?

It’s an online notebook for working with text and PDFs, organized as a graph. It looks like Google Docs, but there’s an important difference: you can attach “child” documents to specific parts of the main text to expand on details or clarify concepts. These “comments” themselves are full documents and can have their own nested branches.

If there’s a fragment in the text that is unclear, you can ask the system to explain it (this will require your Google Gemini API key).

The system uses the full context of the document to generate a response.

Explanations are permanently attached to a specific place in the text.

This is super convenient when reading complex scientific articles. For instance, you can highlight the authors’ surnames in a PDF and instantly get background on them; the information will be attached right to that fragment on the page.

Typical workflow

Upload a complex text and read it right in the app on either a phone or a computer. As you go, add manual or AI-generated notes to important or unclear sections for future reference.

I do not store your documents, PDFs, images, or API keys on my servers. All data is stored in Turso DB (SaaS, free up to 5 GB).

Screenshots on the website’s main page best describe the project.

How to try?

To register in the app, you need an invite code. Just write to me in the comments or in a private message, and I will send it.

Website smartfolio-dot-me

Revolutionizing Research: Introducing a Web-Based Notebook Integrated with AI and PDF Support | February 19 2026, 16:19

I’ve further developed a new tool of my own for working with information and organizing it. The main idea is a web-based notebook for research and for studying and working on subjects, with AI integration and PDF support.

The main problem with typical PDF readers and notes is that the context is lost as soon as you switch to a new tab. In my tool, each text fragment or PDF becomes a node in a “live” hypertext tree, which I can access from multiple computers at any time.

Work process:

– Contextual AI. I can ask the AI to clarify complex passages right within the document. The explanation stays right where the question was asked. Moreover, it is a separate document, linked to the specific spot in the source. When clicked, you see both the original and the explanation on the screen at the same time.

– Panels instead of windows. If the explanation itself requires clarification, a new panel opens to the right. This allows for an endless chain of queries, never losing the place in the original text. That is, you see several panels at once, and unnecessary ones can be closed.

– PDF support. I can upload a PDF, select an area on the page (e.g., a complex diagram or a list of authors), and the LLM instantly extracts the data, supplements it, or explains it. The explanation is attached to the spot where it was requested, just like with non-PDFs.

– Nested annotations. My comments are not just static text. They can contain their own PDFs, links, and further sub-tasks for AI, maintaining a depth of nesting that reflects how we actually think.

This is not just a file storage system, but an “engine” for building knowledge.

The tool suits me personally very well, but perhaps it only solves my specific tasks. What do you think, would something like this be useful to others? Would it be useful to you? Should I develop the project into a fully-fledged product and give it to other users for testing?

My Ambitious 2026 Plan: From Galapagos Travel to Academic Achievements and Creative Pursuits | January 20 2026, 04:44

My plan for 2026:

– Travel to the Galápagos Islands, Ecuador for a week (summer)

– Finish and release a book on Information Retrieval (also summer, progressing slowly, first couple of chapters are already written. Already spent about 50-100 hours on this, the easy part)

– Release at least one scientific paper, probably on Data Mining (spring). Ideally, submit it somewhere to a journal (challenging). Already spent about 30 hours on this topic, a lot left to do.

– Make a step towards a PhD. Find professors, visit universities, understand the cost and assess my capabilities and resources.

– Continue studying fundamental mathematics and not die (linear algebra, calculus, probability theory, statistics, classical ML). In 2025, I spent about 200-400 hours on this topic.

– Continue studying Deep Learning and reach the “can teach” level. In 2025, I spent about 100-200 hours on this topic.

– Continue studying Data Mining/NLP.

– Update my book on RecSys, releasing version 2.0 with updates and corrections (autumn 2026)

– Make noticeable progress in painting and playing the piano. Specifically, learn Schubert’s serenade (Ständchen, D 889) completely and create at least one canvas that I wouldn’t be ashamed to give as a gift.

Exploring ASML’s Advanced Chip-Making Equipment with Veritasium | January 02 2026, 00:47

Veritasium released a very cool report yesterday from ASML about the equipment used to print chips for your little phones, cameras, and laptops.

For those who aren’t familiar with the process: first, a single crystal is grown from ultra-pure silicon and cut into thin wafers. Then multiple thin layers of dielectrics, conductors, and semiconductors are applied to the wafer surface one after another, each time shaping the necessary areas using photolithography, etching, and ion doping, eventually creating billions of transistors and the metal paths connecting them. Finally, the wafer is tested, cut into individual dies, and packaged, turning them into finished microchips.

This process had a limitation – the width of the paths and the distance to the next one are limited by the wavelength of the light used, and reducing it is difficult because there’s nothing to focus such a beam with – lenses simply absorb/reflect everything. In EUV lithography (extreme ultraviolet), the wavelength is 13.5 nm. This is virtually soft X-ray radiation.

The video explains details about the ASML machine costing 400 million dollars. Instead of refracting lenses, highly complex systems of reflecting mirrors are used. These mirrors are the smoothest surfaces ever created by humanity. If the mirror of this machine were enlarged to the size of the Earth, the largest bump on it would be no thicker than a playing card. To enable the mirrors to reflect X-rays, up to 76 alternating layers of tungsten and carbon, each less than a nanometer thick, are applied. All this is done by Zeiss. In addition, this mirror has a controlled curvature: it is constantly adjusted by robots with a precision down to picoradians. The precision of the mirror control is so high that if a laser mounted on it were pointed at the Moon, the system could choose which side of a 10-cent coin lying on the Moon’s surface the beam would hit.
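The coin-on-the-Moon comparison can be sanity-checked with the small-angle approximation. The sketch assumes a one-picoradian step, the average Earth-Moon distance of about 384,400 km, and a US dime (17.91 mm across) as the “10-cent coin”. It covers pointing accuracy only and ignores how much a real beam would spread on the way.

```python
MOON_DISTANCE_M = 384_400_000   # average Earth-Moon distance, about 384,400 km
DIME_DIAMETER_M = 0.01791       # US dime, 17.91 mm (assumed "10-cent coin")
STEP_RAD = 1e-12                # one picoradian (assumed smallest step)


def spot_shift(angle_rad, distance_m):
    """Small-angle approximation: sideways shift = angle * distance."""
    return angle_rad * distance_m


shift = spot_shift(STEP_RAD, MOON_DISTANCE_M)
print(f"One picoradian moves the aim point by {shift * 1000:.2f} mm on the Moon")
print(f"A dime is about {DIME_DIAMETER_M / shift:.0f} such steps wide")
```

One picoradian corresponds to a shift of roughly 0.4 mm at lunar distance, well under the width of a coin, so the claim is at least consistent in order of magnitude.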

But. We don’t have a “light bulb” that emits light in the EUV range.

To generate this light, a laser “shoots” at a droplet of molten tin the size of a white blood cell, traveling at 250 km/h. The first pulse flattens the droplet into a disc, the second and third turn this “disc” into plasma – and all this occurs within just 20 microseconds. When hit by the laser, the droplet heats up to 220,000 Kelvin — approximately 40 times hotter than the surface of the Sun. This plasma emits that very necessary light. And it does so 50,000 times a second. They say it’s been brought up to 100,000. Imagine, at a hundred thousand laser shots per second, it never misses a single one. All this happens in a deep vacuum. To clean the mirrors from tin particles, the chamber is constantly blown with hydrogen at a speed of 360 km/h — faster than a Category 5 hurricane. This process is described by the same formula (Taylor-von Neumann) that describes a nuclear explosion or supernova explosion.
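The figures in this paragraph can be cross-checked with a few lines of arithmetic. The plasma temperature, pulse rate, and droplet speed are the numbers quoted above; the Sun’s surface temperature of about 5,772 K is the only value added here.

```python
SUN_SURFACE_K = 5_772        # effective temperature of the Sun's surface
PLASMA_K = 220_000           # quoted above
PULSES_PER_SECOND = 50_000   # quoted above
DROPLET_SPEED_KMH = 250      # quoted above

# How many times hotter than the Sun's surface is the plasma?
ratio = PLASMA_K / SUN_SURFACE_K

# Time between consecutive droplets at the quoted rate, in microseconds.
period_us = 1e6 / PULSES_PER_SECOND

# Distance one droplet travels during that interval, in millimeters.
speed_ms = DROPLET_SPEED_KMH / 3.6
travel_mm = speed_ms * period_us * 1e-6 * 1000

print(f"Plasma is about {ratio:.0f} times hotter than the Sun's surface")
print(f"Time between droplets: {period_us:.0f} microseconds")
print(f"Droplet travel in that time: {travel_mm:.2f} mm")
```

The ratio comes out near 38, which matches “approximately 40 times”, and 50,000 droplets per second leaves exactly 20 microseconds per droplet, the same window quoted for the three laser pulses.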

The machine aligns the layers of the chip with an error margin of no more than five atoms, while the matrix swings back and forth at accelerations of 20 g.

A single High-NA machine is transported in 250 containers on 25 trucks and seven Boeing 747 aircraft.

Link to the video is in the comments. Or search for it on the Veritasium channel on YouTube.