Exploring the Evolution of Computational Libraries and the Persistence of Fortran in Modern Algorithms | February 16 2025, 21:02

Today I was digging into ML algorithms and was surprised to learn that NumPy depended on Fortran code (BLAS/LAPACK) until fairly recently; checking now, they have switched to OpenBLAS, whose optimized kernels are written in C and assembly rather than Fortran. Meanwhile SciPy, a very popular library for scientific computing (used by Scikit-Learn, which I'm currently studying, as well as by PyTorch, TensorFlow, Keras, etc.), still relies on Fortran 77 code. It uses ARPACK, for example:

https://github.com/scipy/scipy/tree/main/scipy/sparse/linalg/_eigen/arpack/ARPACK/SRC

BLAS and LAPACK, which still live on in OpenBLAS and many other places, date back to the 1970s; BLAS, for instance, is used in Apple Accelerate. Little has changed since 1979, because it's all pure mathematics: why change it? LAPACK came a bit later (the project started in the late 1980s, with the first release in 1992), and ARPACK, mentioned above, followed in the mid-1990s. Python libraries also make heavy use of Fourier analysis, and here we have the FFTPACK library, written in Fortran 77. MINPACK, used for parameter optimization in ML, is actively employed in SciPy and TensorFlow. From the 1990s onward, a lot of this code was rewritten in C for modern frameworks. It was particularly interesting to look at Fortran, which is about 15 years older than C.

While I was figuring all this out, I came across the Simulated Annealing algorithm, which is useful in problems where gradient methods perform poorly because of many local minima.

Imagine needing to find the largest mushroom in a forest. In this forest, mushrooms of various sizes grow at every step, and you can move in any direction, comparing them. But how do you choose a strategy to avoid sticking to just a “large” mushroom if there is an even bigger one growing somewhere further?

If you stop at the first big mushroom, you might miss the real giant. But if you keep wandering the forest, comparing every mushroom, you might never finish your search. Simulated Annealing helps strike a balance: at first you explore the forest freely, trying different directions even when you run into smaller mushrooms. Over time your steps become more cautious, and you reject worse options more and more often. Eventually this leads you to the largest mushroom in the forest.
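The mushroom hunt above can be sketched in a few lines of Python. This is a toy illustration, not SciPy's implementation: the function names, the "forest" landscape, and the cooling schedule are all made up for the example.

```python
import math
import random

def simulated_annealing(f, x0, step=1.0, t0=10.0, cooling=0.99, iters=5000, seed=0):
    """Maximize f by simulated annealing: take random steps, and accept a
    worse spot with probability exp(delta / T), which shrinks as T cools."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    t = t0
    for _ in range(iters):
        candidate = x + rng.uniform(-step, step)  # wander in some direction
        fc = f(candidate)
        delta = fc - fx
        # Always accept a bigger mushroom; sometimes accept a smaller one.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            x, fx = candidate, fc
        if fx > best_f:
            best_x, best_f = x, fx
        t *= cooling  # cool down: over time we get pickier
    return best_x, best_f

# A "forest" with many local maxima; the global maximum (2.5) is at x = 0.
forest = lambda x: math.exp(-x * x / 20) * (1.5 + math.cos(3 * x))

x, fx = simulated_annealing(forest, x0=8.0)
```

Starting far from the optimum (x0 = 8), the high initial temperature lets the search hop over the many local bumps of the cosine before the cooling locks it in place.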

So it turns out the core of this algorithm dates back to 1953 (Simulated Annealing itself was formalized as an optimization method in 1983), and it survives almost unchanged in SciPy and, more broadly, in machine learning, statistics, pattern recognition, and logistics, although of course the modern menu of options for such tasks is much wider. The algorithm was originally devised to model the motion of atoms in molten metal. When heated, metal becomes liquid, and as it cools slowly its atoms gradually settle into an ideal arrangement. If it cools too quickly, the material ends up non-uniform.

What did the scientists do? They introduced random changes into their model of the atoms, sometimes accepting changes for the worse so as not to get stuck in an "unsuccessful" structure. This gave rise to the Metropolis method, the key ingredient of Simulated Annealing. The algorithm was created for physics, but then mathematicians (heh) got hold of it and started using it for optimization.
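The Metropolis rule itself fits in one function. A sketch in the physics convention (lower energy is better); the uniform random draw `u` is passed in explicitly here just to make the rule easy to test:

```python
import math

def metropolis_accept(delta_e, temperature, u):
    """Metropolis criterion: always accept a change that lowers the energy;
    accept one that raises it with probability exp(-delta_e / T).
    `u` is a uniform random draw from [0, 1)."""
    if delta_e <= 0:
        return True
    return u < math.exp(-delta_e / temperature)

# A better state is always accepted...
metropolis_accept(-0.5, temperature=1.0, u=0.999)  # True
# ...a worse one only sometimes, and ever more rarely as the system cools:
metropolis_accept(1.0, temperature=2.0, u=0.3)     # exp(-0.5) ≈ 0.61 → True
metropolis_accept(1.0, temperature=0.1, u=0.3)     # exp(-10) ≈ 0.00005 → False
```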
