A Stroll Through Science and Architecture at Janelia Research Campus | June 13 2024, 18:19

Yesterday, I took a walk with my dog at Janelia Research Campus. It is a research institute in Ashburn, run by the Howard Hughes Medical Institute (HHMI). Scientists in the biotech field live and work here, Nobel laureates among them. Right here, in 2020, they created a detailed map of neural connections in the brain of a fruit fly, an important step towards understanding how neural networks function. But today, it’s about the images. The campus was designed by Rafael Viñoly, a Uruguayan architect (the super-thin residential skyscraper in New York is also his work).

The building has 1700 panels of structural glass (bearing the weight of the building) from Saint-Gobain Glass, Belgium. It would be interesting to get inside; after all, the biotech theme is somewhat close to my heart. Overall, it’s all open: come in, walk wherever you want. But that is still not customary here, and one should respect the openness.

Today, just some photos from the walk (mixed with a few from the net).

All these have been standing for almost 20 years now.

Seeking Insights on Slow Google Indexing and Zero Traffic for Two WordPress Sites | June 12 2024, 18:38

Could someone give me some free advice, just so I can better understand how this works? I have two sites, beinginamerica.com and raufaliev.com. Beinginamerica has 6000 posts (pages), and raufaliev.com has 4600. Both sites run on standard WordPress, the SaaS kind, not self-hosted. Essentially, you can’t configure much there, so nothing custom or unusual could have been set up, even by accident. You can’t even install Google Analytics. Everything has been up since mid-April.

Google indexes beinginamerica incredibly slowly. Currently, raufaliev.com has 320 pages in the index, while beinginamerica.com has 1700. Additionally, another 2.2K pages are marked as “having redirects, hence unindexed.” For example, “/2013/10/15/15-октября-2013-года-1058/”. There’s no redirect there. Meanwhile, 100% of URLs contain Russian letters, and it somehow works for the 1700.

Well okay, let it be 1700. But why then are there zero visits? I mean, statistically, it shouldn’t be zero, since it’s all unique content, not available elsewhere on the internet, and logically should be something Google finds showable from time to time, and someone should occasionally visit. But nobody does.

I don’t even need visitors. What would I do with this traffic—I have no ads there, and comments are deliberately disabled. I’m more interested in understanding how all this works, as I’m somewhat of an expert in this field.

Why does raufaliev.com have only 302 pages indexed and 47 not indexed? Why are all the rest ignored? Again, both sites are on the same platform. They both return the same headers. Unlike beinginamerica, raufaliev has no Cyrillic characters.

Who knows?

Apple Intelligence and batteries | June 11 2024, 14:38

In all this buzz about AI integrated into operating systems, what really concerns me is not privacy. Rather, it’s the fact that overengineered software begins to devour hardware faster than the hardware can evolve, and eventually I start contemplating a switch back to Linux, where things are much more transparent.

Just look at this. My Mac’s advertised battery life is 21 hours. In other words, you turn on your laptop at 8 AM, start streaming something from YouTube, and the battery should only run out by 5 AM the next day.

But in reality, that’s not what happens. It does last significantly longer than any other laptop I’ve had, but sometimes the battery drains in just a few hours, and at first glance it’s unclear why.

One reason: the OS might find, for example, an unindexed unpacked archive, and the corespotlightd process kicks off to index it. This process can’t be paused; you can only turn indexing off for good, but then search won’t work. It’s possible to exclude Documents from indexing (which I’ve already done). But then another process wakes up on some signal or schedule, and it too starts eating CPU and battery.

Still, it’s fair to say that this doesn’t really cause any major issues. Things run and heat up the air, perhaps uselessly, but the M3 Max itself never lags.

For instance, among the processes is the Apple Neural Engine Daemon (aned). It periodically wakes up and consumes resources. With new functionality, such a process will wake up more often and use more resources. Or something like com.apple.NRD.UpdateBrainService decides it needs to update some neural networks. And the more software you install on the computer, the more such cases you’ll encounter. IntelliJ IDEA alone drains my battery and loads my processor more than anything else. I’ve made it a rule: when on battery, shut down IDEA.

In theory, having AI on the device should heat the chip and drain the battery more actively, and most likely “just in case,” since not all users need all these AI features. I suspect that Apple will employ a trick: measure battery life without a configured iCloud account and Apple Intelligence, and we’ll still see those 21 hours of battery life. But as soon as the computer switches to working mode, it will need charging more often, and the office will be slightly warmer.

Andrey Anischenko | June 10 2024, 21:01

Such a great interview! Andrey Anischenko is an amazing guy. I’m very proud to know him, and I hope it counts as friendship, since around 2005 or so.

P.S. And, oh, when you decide to conquer North America, head north, ideally along the East Coast and preferably through Washington!

Everyone else – tune in, watch, Andrey really shares some interesting insights about the journey and the edtech market.

https://youtu.be/XWZ8f9RxUmw?si=bmnkUbSQzYISxnDT

How to Choose a Power Bank | June 09 2024, 01:43

If you’re planning to buy a power bank, here’s a lifehack to get a better one for the same money:

Firstly, pay attention to the type of battery – Li-ion or polymer Li-ion (LiPo). The latter have a higher energy density (yielding more charge for the same weight), and they are safer.

Secondly, look at the ratio between weight and stated capacity. For example, mine weighs 436 grams and is labeled 40000 mAh at 3.7V. Convert this into watt-hours by multiplying 40000 mAh by 3.7, which gives us 148 Wh, resulting in 148/0.436 = 339 Wh/kg.

The thing is, such a density in batteries does not exist. For LiPo, the range is 150-250 Wh/kg. And 250 Wh/kg is for the most advanced, expensive types.

For a battery weighing 0.436 kg with a voltage of 3.7 V, an energy density of 150-250 Wh/kg gives a capacity range of approximately 18000 mAh to 30000 mAh. It’s more likely between 20000 and 25000 mAh. Which is quite good, but definitely not the 40000 mAh as listed.

In other words, take the device’s weight in grams, multiply it by a number between 40 and 65, and you get a very probable real capacity in mAh. I would use 45 for certainty. But it’s very likely that you should use 40 if you’re also trying to get it for the lowest price.
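The arithmetic above can be sketched in a few lines. This is only an illustration: the function names are mine, and the 3.7 V nominal voltage and the 150-250 Wh/kg range are the figures from the text, not measured values.

```python
NOMINAL_CELL_VOLTAGE = 3.7  # volts, as printed on the power bank


def labeled_energy_density(capacity_mah: float, weight_g: float) -> float:
    """Energy density in Wh/kg implied by the label."""
    energy_wh = capacity_mah / 1000 * NOMINAL_CELL_VOLTAGE
    return energy_wh / (weight_g / 1000)


def plausible_capacity_mah(weight_g: float, density_wh_per_kg: float) -> float:
    """Capacity in mAh that a pack of this weight could hold at the given density."""
    energy_wh = weight_g / 1000 * density_wh_per_kg
    return energy_wh / NOMINAL_CELL_VOLTAGE * 1000


if __name__ == "__main__":
    # Label says 40000 mAh at 436 g: roughly 339 Wh/kg, above the 150-250 range
    print(round(labeled_energy_density(40000, 436)))
    # Realistic bounds for the same weight, roughly 18000 to 30000 mAh
    print(round(plausible_capacity_mah(436, 150)), round(plausible_capacity_mah(436, 250)))
```

The whole device weight is used here, case and electronics included, so the real cells are lighter and the real capacity is, if anything, lower than these bounds.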

Next, look at the charging time. On mine, it’s written that with a 30W power source, it should take 6 hours to charge from 0 to 100%. Typically, it’s about 9 volts (common for 30W), though it’s not specified. So, the average charging current would be 3.33A (30/9). The battery capacity can be calculated by multiplying the current (3.33) by the time in hours (6). That gives us 3.33*6=19.98A*h=19980 mAh. This is another hint that the battery is nowhere near 40000 mAh, but rather around 20000 mAh.

Is 20000 mAh a lot? Both the instructions and the label on the battery itself state that for the USB-C port, the charging current is 3.1A at 5V, 2.22A at 9V, 1.66A at 12V, and the same for the Lightning port at 5V. If all this holds true, a full charge from 0% to 100% of my iPhone 15 Pro Max with a 4400 mAh battery should take a maximum of 2 hours. You could charge 4.5 phones like mine with this power bank, or it should charge a laptop (70Wh) almost to 100% before running out. So at first glance, even 20000 mAh is not bad.
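For reference, here is a rough sketch of where the 4.5 phones, the roughly 70 Wh, and the "under 2 hours" come from. It assumes an ideal case with no conversion losses and treats the port current as the charging current, so real numbers will be worse. The function names are made up for this example.

```python
CELL_VOLTAGE = 3.7  # volts, nominal cell voltage assumed for the power bank


def energy_wh(capacity_mah: float, voltage: float = CELL_VOLTAGE) -> float:
    """Stored energy in watt-hours."""
    return capacity_mah / 1000 * voltage


def full_charges(bank_mah: float, phone_mah: float) -> float:
    """How many full phone charges fit into the bank, ignoring losses."""
    return bank_mah / phone_mah


def charge_hours(phone_mah: float, current_a: float) -> float:
    """Rough charging time at a constant current."""
    return phone_mah / 1000 / current_a


if __name__ == "__main__":
    print(energy_wh(20000))                     # energy of a 20000 mAh bank in Wh
    print(round(full_charges(20000, 4400), 1))  # phones per bank
    print(round(charge_hours(4400, 3.1), 2))    # hours at 3.1 A
```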

Digital Sleuthing: Extracting Artist Names from a Book Using Technology | May 31 2024, 01:50

How convenient it has become to work with books nowadays. On Saturday, Alla Prima II by artist Richard Schmid will arrive for me. But even before the purchase, I couldn’t resist and found a 500MB PDF version of the book online, and have already read 50 pages. And then I thought, what if I wanted to extract all the mentioned artists in the book, could I do it?

It turned out to be quite simple.

1) Split the PDF into individual pages using pdfseparate. This resulted in 332 PDFs totaling 472 MB. It takes a few minutes.

2) Convert the individual PDFs to JPG using pdftoppm -jpeg. This resulted in 332 JPGs. It takes a few minutes.

3) Recognize the text using tesseract. This process takes about 10 minutes.

4) Pass each page’s text to the local llama3, and request it to extract the names of artists from the text of each of the 332 pages (i.e., 332 requests). On my Mac, this took 12 minutes. In the end, I got 953 lines.
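Steps 1-3 can be scripted roughly like this. It is a sketch, not the exact commands I ran: the file layout (page-N.pdf in an output folder), the 300 dpi resolution, and the function names are assumptions for the example. Step 4 is left out because it depends on how the local model is called.

```python
import subprocess
from pathlib import Path


def split_command(pdf: Path, out_dir: Path) -> list[str]:
    # pdfseparate replaces %d with the page number
    return ["pdfseparate", str(pdf), str(out_dir / "page-%d.pdf")]


def image_command(page_pdf: Path) -> list[str]:
    # -singlefile writes <prefix>.jpg without a page-number suffix;
    # -r 300 is an assumed resolution, not a value from the post
    prefix = page_pdf.with_suffix("")
    return ["pdftoppm", "-jpeg", "-r", "300", "-singlefile", str(page_pdf), str(prefix)]


def ocr_command(image: Path) -> list[str]:
    # tesseract appends .txt to the output base name
    return ["tesseract", str(image), str(image.with_suffix(""))]


def run_pipeline(pdf: Path, out_dir: Path) -> None:
    """Split the book, render each page to JPG, and OCR it to a .txt file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(split_command(pdf, out_dir), check=True)
    for page_pdf in sorted(out_dir.glob("page-*.pdf")):
        subprocess.run(image_command(page_pdf), check=True)
        subprocess.run(ocr_command(page_pdf.with_suffix(".jpg")), check=True)
```

Calling run_pipeline(Path("book.pdf"), Path("pages")) would leave one .txt file per page in the pages folder, ready to be fed to the model.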

Llama3 is a bit slow, but overall it does reasonably well. It also generates a lot of “noise”, such as “Based on the provided text, here are the extracted names of painters” or “I’m happy to help!”. The output after processing 332 pages is small, only 953 lines. We sort it and remove duplicates (556 lines remain). Then we keep only lines of two to four words with cat names.txt | awk 'NF>=2 && NF<=4', which leaves 139 lines. Among them, there is still some noise: for example, “Cobalt Blue”, “What an interesting text!” and “Sherlock Holmes” were included as artist names. To clean them up, we use openai, which is smarter. We ask it to keep only artists and remove everything else. We got the list.

Alfred Sisley, Alphonse Mucha, Anders Zorn, Andrew Loomis, Anton Sterba, Antonio Mancini, Arthur Rackham, Berthe Morisot, Bill Mosby, Cecilia Beaux, Charles Hunter, Claude Monet, Dan Gerhartz, Dean Mitchell, Diego Velazquez, Donald Llanuza, Edmund Tarbell, Edouard Manet, Edouard Vuillard, Edward Atkinson Hornel, Eliot Goldfinger, Elizabeth Sparhawk-Jones, Frank Duveneck, Frank Vincent DuMond, Franz Hals, Frederic Remington, Gene Byrnes, George Bridgman, Georges Seurat, Gilbert Stuart, Giovanni Boldini, Grace Arnold, Hans Holbein, Harry Anderson, Heinrich Kley, Henri de Toulouse-Lautrec, Howard Pyle, Ilya Repin, Isaac Levitan, J. W. Waterhouse, J. C. Leyendecker, J.H. Vanderpoel, James M. Dunlop, Jean Dagnan-Bouveret, Jeremy Lipking, Jessie H. Vanderpoel, Joaquin Sorolla, John Gannam, John Singer Sargent, John Singleton Copley, John Twachtman, Katie Swatland, Marcus Thomas, Mary Cassatt, Michael Wilcox, N. C. Wyeth, Nancy Guzik, Nicolai Fechin, Norman Rockwell, Paolo Michetti, Paul Mullally, Peter Paul Rubens, Philip Andreevich Maliavin, Ralph Mayer, Richard Schmid, Robert Henri, Rose Frantzen, Scott Burdick, Shannon Two, Stephen Rogers Peck, Susan Lyon, Thomas Eakins, Thomas Wilmer Dewing, Valentin Serov, Vincent Van Gogh, Wayman Adams, William H. Mosby, William Harnett, William Merritt Chase

5) Now, we send this list back to openai and ask if there are any non-artists among these names. It turned out that all are fine, all of them are artists.

6) For reliability, we also ask openai if there are names in this “clean” list that were not in the original (“dirty”) list, to check if openai invented any artist names when asked to extract from the “dirty” list. Great, it did not invent any.

Voilà! In just over half an hour, I have a list of artists mentioned in the book.
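The mechanical parts of this cleanup are easy to reproduce without any model. Below is a small sketch of the sort, deduplicate, and word-count filter, plus a plain set comparison as a deterministic alternative to asking openai whether it invented names. The function names are mine, and the sample lines are just illustrations taken from the noise mentioned above.

```python
def clean_candidates(lines, min_words=2, max_words=4):
    """Deduplicate, sort, and keep lines with min_words..max_words words."""
    unique = sorted({line.strip() for line in lines if line.strip()})
    return [line for line in unique if min_words <= len(line.split()) <= max_words]


def invented_names(clean, dirty):
    """Names in the clean list that never appeared in the dirty list."""
    seen = {name.casefold() for name in dirty}
    return [name for name in clean if name.casefold() not in seen]


if __name__ == "__main__":
    raw = [
        "Claude Monet",
        "Claude Monet",
        "I'm happy to help!",
        "Cobalt Blue",
        "Monet",
        "Based on the provided text, here are the extracted names of painters",
    ]
    dirty = clean_candidates(raw)
    print(dirty)  # the word-count filter alone still lets some noise through
    print(invented_names(["Claude Monet", "Anders Zorn"], dirty))
```

The filter drops single words and long sentences, but short noise like “Cobalt Blue” survives, which is why a smarter pass is still needed afterwards.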

7) We ask openai to categorize the list and provide a brief description

RESULT

Impressionists and Post-Impressionists

=================================

* Alfred Sisley – French impressionist, known for his landscapes.

* Claude Monet – Founder of Impressionism, famous for his water lilies and series of landscapes.

* Berthe Morisot – French impressionist, known for her portraits and scenes from everyday life.

* Edouard Manet – French artist, a precursor to Impressionism, known for his portraits and scenes from Parisian life.

* Georges Seurat – French post-impressionist, founder of pointillism.

* Vincent Van Gogh – Dutch post-impressionist, known for his vivid and emotional paintings.

Realism and Academic Painting

=============================

* Edouard Vuillard – French artist, member of the Nabis group, known for his interiors and figurative compositions.

* Anders Zorn – Swedish artist, known for his portraits and genre scenes.

* Cecilia Beaux – American portrait artist who worked in the style of academic realism.

* Diego Velazquez – Spanish Baroque artist, known for his portraits and historical scenes.

* Edmund Tarbell – American artist, known for his impressionistic portraits and genre scenes.

* Franz Hals – Dutch portrait artist, one of the greatest artists of Holland’s Golden Age.

* Frederic Remington – American artist and sculptor, specialized in scenes of the Wild West.

* Gilbert Stuart – American artist, known for his portraits, including the depiction of George Washington.

* Hans Holbein – German Renaissance artist, known for his portraits and religious paintings.

* Ilya Repin – Russian realistic artist, known for his historical and genre paintings.

* Isaac Levitan – Russian landscape artist, master of lyrical landscape.

* John Singer Sargent – American portrait artist, known for his exquisite and elegant portraits.

* Thomas Eakins – American realist, known for his portraits and genre scenes.

* Valentin Serov – Russian portrait artist, one of the leading representatives of the Russian realistic school.

Art Nouveau and Illustration

=================================

* Alphonse Mucha – Czech artist, a key figure in Art Nouveau, known for his decorative posters.

* Arthur Rackham – British illustrator, known for his magical and fantastical images.

* Howard Pyle – American illustrator, founder of the Brandywine School of illustration.

* J. C. Leyendecker – American illustrator, known for his advertising and magazine covers.

* N. C. Wyeth – American illustrator and artist, known for his works for classic adventure novels.

* Norman Rockwell – American artist and illustrator, known for his idealized images of American life.

* Frank Vincent DuMond – American artist and educator, known for his landscapes and genre scenes.

Portraitists and Genre Artists

=============================

* Cecilia Beaux – American portrait artist who worked in the style of academic realism.

* Diego Velazquez – Spanish Baroque artist, known for his portraits and historical scenes.

* Franz Hals – Dutch portrait artist, one of the greatest artists of Holland’s Golden Age.

* John Singer Sargent – American portrait artist, known for his exquisite and elegant portraits.

* Thomas Eakins – American realist, known for his portraits and genre scenes.

* Valentin Serov – Russian portrait artist, one of the leading representatives of the Russian realistic school.

* Jean Dagnan-Bouveret – French academic artist, known for his genre scenes and portraits.

* Joaquin Sorolla – Spanish artist, master of light, known for his portraits and beach scenes.

* John Singleton Copley – American colonial artist, known for his portraits.

Masters of Landscape

==============

* Alfred Sisley – French impressionist, known for his landscapes.

* Claude Monet – Founder of Impressionism, famous for his water lilies and series of landscapes.

* Isaac Levitan – Russian landscape artist, master of lyrical landscape.

* John Twachtman – American impressionist, known for his lyrical landscapes.

* Dan Gerhartz – Contemporary American artist, known for his atmospheric landscapes and figurative works.

* Dean Mitchell – American artist, known for his watercolor landscapes and urban scenes.

Contemporary Artists

====================

* Dan Gerhartz – Contemporary American artist, known for his atmospheric landscapes and figurative works.

* Dean Mitchell – American artist, known for his watercolor landscapes and urban scenes.

* Jeremy Lipking – Contemporary American artist, known for his portraits and figurative works.

* Nancy Guzik – Contemporary American artist, known for her lyrical and romantic figurative works.

* Richard Schmid – American artist and educator, master of figurative painting.

Other Famous Artists

========================

* J. W. Waterhouse – British artist, known for his paintings on mythological and literary themes.

* Paolo Michetti – Italian artist, worked in the genres of realism and symbolism.

* Peter Paul Rubens – Flemish Baroque artist, known for his allegorical and mythological paintings.

* Gilbert Stuart – American artist, known for his portraits, including the depiction of George Washington.

* Giovanni Boldini – Italian portrait artist, known for his elegant and dynamic portraits.

* Henri de Toulouse-Lautrec – French post-impressionist, known for his posters and scenes of Parisian life.

* Howard Pyle – American illustrator, founder of the Brandywine School of illustration.

* Franz Hals – Dutch portrait artist, one of the greatest artists of Holland’s Golden Age.

* Frederic Remington – American artist and sculptor, specialized in scenes of the Wild West.

* Hans Holbein – German Renaissance artist, known for his portraits and religious paintings.

It’s amazing what you can now do with a computer and information!

LLMs Struggle with Simple Word Filtering Task | May 28 2024, 13:56

I recently encountered a task that no LLM can solve. It should be super simple for an LLM, but somehow they can’t manage it.

There’s a list of about 1000 words. I need to keep only the most functional words from it, like which, should, would, etc.

Request: I have a list of words: …. Select only 50 words from this list that are primarily functional and carry minimal meaning in the context of keyword searches (for example, words that generate significant noise in the case of partial matches). Example – which, shall, very. Do not add any words not present on the list above. The resulting list should contain only words, one word per line.

ChatGPT-4o: started outputting some words alphabetically, ending at the word asking. Thus, it did not even go past asking.

Google Gemini: began inventing words not in the list, despite clear instructions not to do so.

Google Gemini Pro: produced something, but again, invented words that weren’t on the list. Almost half invented.

Anthropic Claude also listed words alphabetically, and stopped at words starting with the letter d.

Mixtral 8x7B Instruct also made up half.

In fact, no LLM has managed the task. And it’s about words, not mathematics.
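Checking a model’s answer against the two hard constraints in the request (exactly 50 words, nothing outside the list) is trivial to automate. A minimal sketch, with a hypothetical function name and a made-up toy list:

```python
def check_selection(source_words, selected, expected_count=50):
    """Report words the model invented and whether the count matches."""
    source = {word.strip().lower() for word in source_words}
    picked = [word.strip().lower() for word in selected if word.strip()]
    invented = [word for word in picked if word not in source]
    return {"invented": invented, "count_ok": len(picked) == expected_count}


if __name__ == "__main__":
    source = ["which", "should", "would", "asking", "table"]
    answer = ["which", "would", "whereas"]  # "whereas" is not in the source list
    print(check_selection(source, answer, expected_count=3))
```

This only catches invented words and a wrong count. Whether the chosen words are really the most functional ones still has to be judged by eye.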

https://pastebin.com/5B8w96au

Programming Quirks: Moon Phases and Easter Eggs in Code | May 26 2024, 11:19

Two fun stories about the daily life of programmers.

The first one:

Researchers (@maciejwolczyk, @CupiaBart) trained a neural network to play NetHack, an old role-playing game (1987) from the days when there were no graphical user interfaces and everything happened in the console. The player goes through levels, collects items and rewards, participates in battles, and scores points, all drawn with the simplest text characters.

Anyway, they trained it. The model consistently scored 5000 points. Then suddenly something went wrong: the model started scoring only 3000 points, a significantly worse result. Debugging is always fun, so the thread author tried:

— to find a problem in the agent model loading code

— to roll back the code by a few days

— to roll back the code by several weeks (well, surely everything works there?)

— to rebuild the environment

— to change the version of CUDA (drivers for running neural networks on a video card)

— to run the code on a personal laptop, not a server

Nothing helped — the model consistently showed 3000 points.

In desperation, the author wrote to the creator of the model @JensTuyls, and received an unexpected response:

— Maybe it’s a full moon today 🌕

What?? 😑

Upon checking the lunar calendar, it turned out that indeed it was a full moon that day. The author launched the game and saw the message: “You’re lucky! It’s a full moon today.”

In NetHack, there is a mechanic that changes the gameplay during a full moon, based on the system time. The character becomes luckier, werewolves appear in their beastly guise, and dogs start howling. The model had never seen full-moon games during training.

This did not make the game more difficult. The model simply did not understand how the rules had changed and tried to play as usual, hence the drop to 3000 points. To check, the author changed the system time, and the model scored 5000 points again.
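To show how a game can tie its rules to the calendar, here is a Python sketch modeled on the phase_of_the_moon routine from NetHack’s hacklib.c, as I remember it. Treat the constants as an assumption and check the game source before relying on them. The one deliberate difference: the date is passed in as a parameter instead of being read from the system clock.

```python
import datetime


def phase_of_the_moon(day: datetime.date) -> int:
    """Return 0..7, where 0 is a new moon and 4 is a full moon."""
    day_in_year = day.timetuple().tm_yday - 1  # C's tm_yday is zero-based
    golden = (day.year - 1900) % 19 + 1        # C's tm_year counts from 1900
    epact = (11 * golden + 18) % 30
    if (epact == 25 and golden > 11) or epact == 24:
        epact += 1
    return ((((day_in_year + epact) * 6) + 11) % 177 // 22) & 7


if __name__ == "__main__":
    print(phase_of_the_moon(datetime.date(2024, 5, 23)))  # 4, a full moon
    print(phase_of_the_moon(datetime.date(2024, 6, 6)))   # 0, a new moon
```

Passing the date explicitly is the whole point for experiments like this one: an evaluation run can pin the date and get the same rules every time, instead of silently depending on when the job happens to start.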

Moral: When faced with an unexpected error, don’t forget to check the lunar calendar.

* * *

The second story is about the ‘man’ command in the console.

This command outputs the documentation for whatever is passed as its argument. For example, “man ls” provides documentation on ls, which lists the files and subdirectories of the current directory, and “man man” provides documentation about man itself.

On StackExchange, someone was wondering why their tests were failing. One of the answers:

Marnanel Thurman:

Uh, that’s my fault, I suggested it. Sorry. Almost the entire story is outlined in the commit. The maintainer of ‘man’ is a good friend of mine, and one day, six years ago, I jokingly told him that if you called ‘man’ after midnight, it should print “gimme gimme gimme”, because of the ABBA song “Gimme! Gimme! Gimme! (A Man After Midnight)”.

Well, he actually added it. It was fun for someone to discover this, and we mostly forgot about it until today.

I can’t speak for Colin, of course, but I never expected that it would ever cause any issues: what test would break on parsing the output of man with no page specified? I suppose I shouldn’t have been surprised that such a test was eventually found, but it took six years.