When I started delving into school-level biology as a forty-year-old, I got hooked on something that perhaps wasn’t covered during my school years, and I am not sure whether it is covered now. But it’s very intriguing.
It’s about how the body knows which cells to create and where. This is quite a non-trivial question, and there are so many layers of meaning and explanation that I don’t even know where to begin.
First, the basics, from the school curriculum. Look, our DNA contains only 19,969 protein-coding genes (out of 63,494 annotated genes in total; the rest do not code for proteins, and non-coding DNA in general used to be dismissed as “junk”). By the way, the Y-chromosome was fully sequenced just this year, but overall the human genome has been decoded. It is still largely unclear what most of it does, but at least we have the source code. Essentially, it’s a file of roughly 3 GB.
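The “file size” of the genome depends on how you store it, so here is a back-of-the-envelope sketch. The genome length of about 3.1 billion letters is my rounded assumption, and `size_gb` is just an illustrative helper, not anything from a real library.

```python
# Assumption: one copy of the human genome is about 3.1 billion bases (rounded).
BASES = 3_100_000_000


def size_gb(bases: int, bits_per_base: int) -> float:
    """Storage size in gigabytes (10**9 bytes) for a given encoding."""
    return bases * bits_per_base / 8 / 1e9


if __name__ == "__main__":
    # Plain text: one 8-bit character (A, C, G or T) per base.
    print(f"text file:   {size_gb(BASES, 8):.3f} GB")
    # Packed: four possible letters fit into 2 bits per base.
    print(f"2-bit pack:  {size_gb(BASES, 2):.3f} GB")
```

So the multi-gigabyte figure corresponds to a plain text file with one character per letter. Packed at 2 bits per letter, the same information takes about a quarter of that.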
So, each of these roughly 20,000 genes encodes a protein that serves some purpose, for example as building material for the 230-400 types of cells in the body. A protein is a three-dimensional structure made up of amino acids, and there are 20 standard amino acids. It’s easiest to imagine a protein as a string of beads, where some of the 20 bead types repel each other and others attract each other, so the string folds up into a particular shape. If the folding goes wrong, the whole protein can end up as a useless tangle. The sequence of amino acids determines which protein will result. Actually, the “program” consists of “triplets” (three-letter codes, also called codons), which are translated into amino acids, and these in turn form the protein that is ultimately churned out into the body and put in its proper place. Interestingly, the genetic code is the same for all living beings on the planet (well, almost: very rare exceptions exist, and even those differ only slightly).
Also, it’s worth remembering that this program forks trillions of times from the first day of life, and each of us has about 30 trillion instances. Copying accumulates errors, so a significant part of these 30 trillion (probably all) are slightly modified versions of each other. It’s also worth noting that any two people differ by only about 0.1% of their DNA. That is, 99.9% makes us such that we generally survive somehow, and the remaining 0.1% adds uniqueness, from facial features to the way of thinking.
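To make the “triplets are translated into amino acids” step concrete, here is a minimal sketch of a translator. The table is the standard genetic code written in the usual T, C, A, G order, with one-letter amino acid abbreviations and `*` for a stop signal. The names `CODON_TABLE` and `translate` and the sample sequence are mine, purely for illustration. A real cell does considerably more than this (it transcribes DNA to RNA first, looks for a start signal, and so on).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code: one letter per triplet, in TCAG order; '*' means stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every one of the 64 possible triplets to its amino acid (or stop).
CODON_TABLE = {
    "".join(triplet): amino
    for triplet, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Read the sequence three letters at a time until a stop triplet."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3].upper()]
        if amino == "*":  # stop signal: the protein is finished
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    # Made-up sample: ATG GCC AAA TGG TAA
    print(translate("ATGGCCAAATGGTAA"))  # MAKW
```

Note that 64 possible triplets map onto far fewer amino acids, so several triplets mean the same thing. That redundancy is why some single-letter copying errors change nothing in the resulting protein.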
So, why this introduction? The question is how such a complex organism can be programmed by such a simple program (remember, just 3 GB of code, of which maybe 1-2% directly codes for proteins; the rest is, for the most part, who knows what).
And here’s what I found out. Evolution made this program such that it changes its own behavior in a specific, well-adjusted direction, allowing very little randomness in the process. Just imagine you have a program that calculates and outputs the number pi. Then three bytes are removed from this program, and it starts to output the number e; three more bytes are removed, and it starts to produce the Planck constant; another five bytes are removed, and there you have the constant g. Another good analogy is a vinyl record. You play it, and it plays one piece of music. Then cleverly made scratches cause the needle to jump back and forth, and the music changes, but it’s still music, not noise.
If you imagine the structure of the organism as a set of interactions within the cell, each producing a protein and attaching it somewhere, then on every iteration N+1 this program undergoes small corrections that nature arrived at through trial and error. So after, say, about 100 generations, the cell begins to produce proteins for, say, the liver, and after about 300 generations for, say, the skin. What applies these corrections are the factors of gene expression (transcription factors are the best-known kind). Effectively, these are various things floating in the cell, which either came from outside or were produced in previous generations. If they came from outside, they could even be from the environment, or they could be chemical signals from other parts of the organism. For example, when a child is born, its cells obviously “know” about it. This mechanism really surprised me back in the day. How can you write a program that works differently over the organism’s lifetime depending on the results of previous iterations? But as we see from our experience, it somehow works, and even quite reliably.
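The idea that each iteration’s output decides what the next iteration does can be sketched as a tiny state machine. Everything here is hypothetical: `gene_a`, `gene_b`, `gene_c`, `maternal_signal` and the rules connecting them are invented for illustration and do not correspond to real genes.

```python
# Hypothetical rules: a toy gene switches on in the next iteration
# depending on which products are floating in the cell right now.
RULES = {
    "gene_a": lambda present: "maternal_signal" in present,
    "gene_b": lambda present: "gene_a" in present,
    "gene_c": lambda present: "gene_b" in present and "gene_a" not in present,
}


def step(present: frozenset) -> frozenset:
    """One iteration: current products determine the next set of products."""
    return frozenset(gene for gene, rule in RULES.items() if rule(present))


def simulate(initial, iterations: int) -> list:
    """Return the list of cell states, starting from the initial one."""
    history = [frozenset(initial)]
    for _ in range(iterations):
        history.append(step(history[-1]))
    return history


if __name__ == "__main__":
    for n, state in enumerate(simulate({"maternal_signal"}, 4)):
        print(n, sorted(state))
```

A single outside signal at iteration 0 sets off a cascade: first `gene_a`, then `gene_b`, then `gene_c`. The rules never change, only the contents of the cell do, and that is enough for the same program to behave differently at every step.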
If you draw an analogy with programming, it’s the same program, but instead of parameters it takes a “mask” that filters out some of the instructions. The mask is selected such that the resulting program continues to operate and do something useful. It’s like that earlier example: take a program for generating the number pi, apply a mask, and it turns into a program for generating the number e. Evolution tried different masks, and those that resulted in useful progress survived to the next iteration, N+1. For multicellular beings, this happens asynchronously and synchronously at the same time. Asynchronously, because cell A doesn’t know about the state of cell B or which iteration it is in. Both cells’ programs output something and progress to the next iteration. Synchronously, because enzymes (consider them part of mask A) may depend on the results of B’s work, so B must go first, and that is where the synchronicity comes from. This mechanism obviously fails more often than it works, but it gets tested at the cellular level (we have 30 trillion cells) and at the level of whole living units (for bacteria the number is beyond counting, then come other unicellular organisms, then multicellular ones; those are some wild numbers), and all this over billions of years. So it is quite believable that random “masks” leading further, to N+1, are not so rare. So gradually, that’s what happened.
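Here is the mask analogy as a minimal sketch. The “program” is a fixed list of made-up arithmetic instructions, and a mask simply decides which of them are allowed to run. `PROGRAM`, `run` and the masks are all hypothetical names. This is an illustration of the analogy, not a model of how a cell actually selects genes.

```python
# The same ordered list of instructions is present in every "cell".
# The instructions themselves are arbitrary and purely illustrative.
PROGRAM = [
    ("add 3", lambda x: x + 3),
    ("double", lambda x: x * 2),
    ("subtract 1", lambda x: x - 1),
    ("square", lambda x: x * x),
]


def run(program, mask, start=1):
    """Execute only the instructions whose mask entry is True."""
    if len(mask) != len(program):
        raise ValueError("mask must have one entry per instruction")
    value = start
    for (_name, instruction), enabled in zip(program, mask):
        if enabled:
            value = instruction(value)
    return value


FULL = (True, True, True, True)
MASK_A = (True, True, False, False)
MASK_B = (False, True, False, True)

if __name__ == "__main__":
    print(run(PROGRAM, FULL))    # 49
    print(run(PROGRAM, MASK_A))  # 8
    print(run(PROGRAM, MASK_B))  # 4
```

The code is never edited. Different masks over the same instruction list give different but still well-formed results, which is the point of the analogy.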
Normal human cells can divide only a limited number of times, roughly 40 to 60 (about 50 on average). After that they stop dividing and eventually die; cancer cells are the exception. This is called the Hayflick limit and is related to telomeres. Maybe you’ve heard about experiments to extend life through telomere restoration. We have 30 trillion cells, with different lifespans for different cell types. Each cell carries its own copy of the DNA, and due to inevitable mutations that copy is slightly different from its neighbor’s. In the end, the program for fingers, toes, hands and ears takes the form of a single event, which triggers symmetrically, at the same time, in some iteration. Subsequent iterations just finish what was started. If something malfunctions at the gene level, problems arise immediately on both sides. And if the malfunction affects something critical, then in the following iterations the cell starts producing unexpected enzymes that send everything awry, mostly ending in miscarriage. In general, life is bustling 🙂
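A quick bit of arithmetic shows why a limit of around fifty divisions is not as tight as it sounds. This is a heavily idealized sketch: it assumes every cell divides at every step and none die, which is not how a body grows. The figure of about 30 trillion cells is the one used earlier in the text, and the function names are mine.

```python
import math

DIVISION_LIMIT = 52                  # the figure quoted in the text
CELLS_IN_BODY = 30_000_000_000_000   # about 30 trillion, as quoted earlier


def cells_after(doublings: int) -> int:
    """Cells descended from one cell if every cell divides each time."""
    return 2 ** doublings


def doublings_needed(target_cells: int) -> int:
    """Smallest number of doublings that reaches the target cell count."""
    return math.ceil(math.log2(target_cells))


if __name__ == "__main__":
    print(cells_after(DIVISION_LIMIT))      # 4503599627370496
    print(doublings_needed(CELLS_IN_BODY))  # 45
```

Under these assumptions, 45 rounds of doubling are already enough to get from a single cell to 30 trillion, and 52 rounds would give about a hundred times more.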
The second interesting thing is epigenetics. It’s exactly about these factors of gene expression. It is commonly believed that what a child gets at birth is genes from dad and mom. In reality, the child also receives a unique set of these external gene expression factors from the mother’s body. Moreover, the products of the child’s DNA at work in the womb, including the child’s own cells, end up in the mother’s body (there is a placental barrier, but it gets crossed), and then take part in the processes of the next pregnancy, or even affect the mother’s body. How exactly they take part is unclear, and maybe the effect is next to nothing, but the fact that the second child’s body carries greetings from the first child’s body is scientifically documented. Google fetal microchimerism.
In general, all this is wildly interesting. And it seems like a bottomless well of knowledge. If I were a school kid now, I would go into the life sciences rather than IT. The most significant discoveries of the next 10-20 years, I believe, will be there. And yes, pretty soon they’ll connect the brain to AI. Everything is already there; it just needs a little “fine-tuning” (about 20 years should be enough).

