Navigating the Lexical Complexity of Nabokov’s “Lolita” | April 02 2026, 15:56

I’ve finished the first version of a dictionary-style book on Nabokov’s “Lolita”. The chart shows how vocabulary complexity is distributed across the pages of the book. The lower chart averages over 25 sentences and plots the number of complex words on the vertical axis, with colors indicating complexity/rarity (purple for the most complex, red for less complex, yellow for even less so). I have already removed two levels, though overall, for a non-native reader, all five levels are challenging. In the book, level 3 is marked with a dashed line, level 4 with a simple frame, and level 5 with a double frame. Currently, there are 5794 words, of which 541 are fifth level, 1070 are fourth, 1883 are third, 1393 are second, and 54 are first (the simplest ones). Since the first version came out at 1148 pages, the dictionary needs to be trimmed significantly by removing whatever can be dispensed with. This mainly means the first and second levels, plus some entries from the third and fourth. Word rarity is calculated in three ways: with an LLM, and with two word-frequency lists for the English language corpus (300K words).
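As a rough illustration of the 25-sentence averaging behind the lower chart, here is a minimal Python sketch. It assumes a sliding window (non-overlapping bins of 25 sentences would work just as well), and the function name and input shape are made up for this example rather than taken from my actual pipeline.

```python
from collections import Counter

def rolling_level_counts(sentence_levels, window=25):
    """Average per-level counts of complex words over a sliding window.

    sentence_levels: one list per sentence, holding the levels (1-5) of the
    complex words found in that sentence (hypothetical input format).
    Returns one dict {level: average count per sentence} per window position.
    """
    if not sentence_levels:
        return []
    per_sentence = [Counter(levels) for levels in sentence_levels]
    # If the text is shorter than the window, fall back to a single window.
    positions = max(len(per_sentence) - window + 1, 1)
    result = []
    for start in range(positions):
        chunk = per_sentence[start:start + window]
        total = Counter()
        for counts in chunk:
            total.update(counts)
        result.append({level: total[level] / len(chunk) for level in range(1, 6)})
    return result
```

Each returned dict corresponds to one point on the horizontal axis, and stacking the values for levels 3, 4, and 5 would give the colored bands.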

Not every word is difficult for every reader. For instance, in the sentence “With the ebb of lust, an ashen sense of awfulness, abetted by the realistic drabness of a gray neuralgic day, crept over me and hummed within my temples.” someone well acquainted with English might not know the words ebb, abet, and drabness, while everything else is familiar. But if I lower the bar and aim at a less advanced reader, the dictionary might not be very useful for readers like that.

Or consider the sentence:

Homo pollex of science, with all its many sub-species and forms; the modest soldier, spic and span, quietly waiting, quietly conscious of khaki’s viatric appeal; the schoolboy wishing to go two blocks; the killer wishing to go two thousand miles; the mysterious, nervous, elderly gent, with brand-new suitcase and clipped mustache; a trio of optimistic Mexicans; the college student displaying the grime of vacational outdoor work as proudly as the name of the famous college arching across the front of his sweatshirt; the desperate lady whose battery has just died on her; the clean-cut, glossy-haired, shifty-eyed, white-faced young beasts in loud shirts and coats, vigorously, almost priapically thrusting out tense thumbs to tempt lone women or sadsack salesmen with fancy cravings.

My browser even highlights four words here.

I have definitions of the words in English, German, French, and Russian. One issue I’ve run into is that different words from the text count as complex in different languages, yet I keep them all in a single unified list. So I’ll have to mark, for example, French words in the English text separately, so that they are left out of the French version, since the reader there already knows what quel mot means.
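One way to handle this per-language marking, sketched in Python under my own assumptions: each entry carries an optional tag for the foreign language the word comes from, and an edition skips entries tagged with its own language. The Entry class, the foreign_lang field, and the level values below are hypothetical, not the format I actually use.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Entry:
    word: str
    level: int                          # complexity level, 1-5
    foreign_lang: Optional[str] = None  # e.g. "fr" for a French phrase in the English text

def entries_for_edition(entries: List[Entry], edition_lang: str) -> List[Entry]:
    """Keep every entry except foreign words native to the edition's language."""
    return [e for e in entries if e.foreign_lang != edition_lang]

# Illustrative entries; the levels are invented for the example.
entries = [Entry("ebb", 3), Entry("quel mot", 4, foreign_lang="fr")]
french_edition = entries_for_edition(entries, "fr")  # drops "quel mot"
german_edition = entries_for_edition(entries, "de")  # keeps both
```

Ordinary English words have no tag, so they stay in every edition, including the English one.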

Overall, this weekend I’ll be manually removing about half, and then I can make the cover and list it on Amazon.
