Nature is not a repository but a workshop: once more on the similarities and differences between DNA and program code

Technological advances at the beginning of the 21st century, in particular the sequencing of the human genome and a general understanding of the principles of genome editing, naturally invite comparisons between synthetic biology and programming. Indeed, ontogeny and biochemistry are in many ways comparable to programmed processes: they obey an internal logic, proceed step by step, depend on context, and respond to external interference (they can be edited). It is tempting to compare the four-letter DNA code with binary machine code.
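The temptation is easy to see: four letters fit exactly into two bits. Below is a minimal Python sketch, assuming an arbitrary mapping A=00, C=01, G=10, T=11. The mapping and the function names `pack` and `unpack` are illustrative, not any standard.

```python
# Arbitrary illustrative mapping: any assignment of the four bases to 2-bit values would do.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(sequence: str) -> int:
    """Pack a DNA string into an integer, two bits per nucleotide."""
    value = 0
    for base in sequence.upper():
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def unpack(value: int, length: int) -> str:
    """Restore a DNA string of the given length from its packed form."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


packed = pack("GATTACA")
print(format(packed, "014b"))  # 10001111000100
print(unpack(packed, 7))       # GATTACA
```

The encoding is lossless, but the bits say nothing about how a cell reads the sequence. This is where the analogy starts to strain.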



Nevertheless, in this article we will assume that such analogies are more daring than accurate, and consider why DNA can serve as a substrate for full-fledged genetic programming, yet in itself is quite far from being a programming language, or even a language as such.



DNA serves as the template for protein synthesis and ultimately exists to carry genetic material from generation to generation. The genetic code can therefore be considered workable if it allows its carrier to leave numerous fertile offspring that are at least as viable as the parental generation. The task is formulated quite broadly, so evolution, for all its success, is a "costly" undertaking that burdens its offspring with a huge base of inherited, commented-out, and hopelessly misplaced code.



Synthetic biology, in turn, sets itself much more clearly defined goals than evolution does. For example, the most serious application area of CRISPR technology is anticancer research, while cancer cells themselves are the fruit of indiscriminate natural selection: selection favors them because they leave offspring quickly and efficiently and mimic healthy cells of the affected tissue.



The DNA code is more like a natural language than a programming language: it is redundant, accumulates errors quickly, and is full of complex dependencies determined by the context of an organism's development, and the harm or usefulness of these dependencies is not always obvious.
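One concrete, well-documented form of this redundancy is the degeneracy of the standard codon table: 64 triplets encode only 20 amino acids and a stop signal. The short Python sketch below counts it; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many different codons spell each amino acid?
degeneracy = Counter(CODON_TABLE.values())
stop_codons = degeneracy.pop("*")

print(len(CODON_TABLE))                  # 64
print(len(degeneracy))                   # 20
print(stop_codons)                       # 3
print(degeneracy["L"], degeneracy["M"])  # 6 1
```

Leucine can be written six different ways, methionine only one. A change of spelling that is silent for one amino acid changes the protein for another.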



A widely known example is sickle cell anemia, a hereditary disease in which human red blood cells become misshapen and look more like a crescent moon than a donut.







It is assumed that an irregularly shaped erythrocyte makes life more difficult for malaria plasmodia, which gives the carrier of the disease an extra chance to live to reproductive age and only later die of a heart attack. Depending on living conditions and the individual's age, we have "both a bug and a feature in one codon".
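The phrase can be taken almost literally. The sickle cell variant is commonly described as a single-nucleotide substitution in the beta-globin gene that turns a GAG codon (glutamic acid) into GTG (valine). Below is a minimal Python sketch of such a point mutation. The nine-base fragment and the helper names are illustrative assumptions, not a verified excerpt of the gene.

```python
# Only the codons needed for this illustration (standard genetic code).
CODONS = {"CCT": "Pro", "GAG": "Glu", "GTG": "Val"}


def translate(dna: str) -> list[str]:
    """Translate a DNA string codon by codon using the partial table above."""
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def substitute(dna: str, position: int, base: str) -> str:
    """Return a copy of dna with one nucleotide replaced (0-based position)."""
    return dna[:position] + base + dna[position + 1:]


# Illustrative fragment, not a verified excerpt of the gene.
normal = "CCTGAGGAG"
sickle = substitute(normal, 4, "T")  # GAG -> GTG

print(translate(normal))  # ['Pro', 'Glu', 'Glu']
print(translate(sickle))  # ['Pro', 'Val', 'Glu']
```

One changed letter out of nine changes one amino acid. Whether that counts as a defect or an advantage is decided outside the sequence itself.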



When "testing" such genetic modifications in vivo, natural selection was not constrained by deadlines or quality requirements; rather, it operated under conditions many of which can be compared with DDD (domain-driven design). Continuing the analogy with the circulatory system, we can call the blue blood of cephalopods a domain-specific solution. In place of iron, octopus blood relies on a similar metal, copper. According to recent research, this evolutionary find optimizes blood oxygen saturation in cold water and at low oxygen concentrations.



If we imagine testing real biotechnological developments in vivo the way software is tested, the extrapolation runs into obvious inconsistencies and difficulties, some of which are mentioned in an article by Bruce Schneier and Larisa Rudenko:

Imagine a biotechnologist trying to increase the expression of a gene that allows blood cells to reproduce normally. Although the operation is quite simple by today's standards, it will almost certainly not succeed on the first try. In the case of software, the only damage such code would do is crash the program in which it is running. In biology, such erroneous code could significantly increase the likelihood of various leukemias and destroy vital cells of the immune system.
It is similarly difficult to imagine a "cross-platform" genetic code that would work, for example, on both Earth and Mars. DNA, a significant part of which is noncoding, clearly carries substantial information redundancy, yet as a rule it does not lend itself to biochemical readjustment for life on other planets, or even for conditions on Earth in which extremophiles are comfortable. Extremophiles, in turn, have managed to survive on Earth in conditions close to those of Mars.



Thus, significant adaptation of the genetic code to fundamentally unfavorable conditions takes place only at the periphery of biochemistry, while typical terrestrial ecosystems are in turn destructive for most extremophiles.



Interestingly, Stanisław Lem already touched on the most important aspect of biological information in his "Summa Technologiae": how heavily it is conditioned by the context of the organism's development.
Finally, it is known that the four nucleotides that make up the DNA molecule are not the only possible ones. Synthetic nucleotides that increase the capacity of the genetic code have already been created, as has a synthetic bacterium capable of producing an amino acid that is absent in other living organisms.
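The gain in capacity is easy to estimate. The back-of-the-envelope Python sketch below assumes that triplet codons are kept and the alphabet grows from four letters to six (four natural bases plus one synthetic pair). The six-letter case is an illustrative assumption, not a description of any specific organism.

```python
import math


def codon_count(alphabet_size: int, codon_length: int = 3) -> int:
    """Number of distinct codons for a given alphabet size and codon length."""
    return alphabet_size ** codon_length


def bits_per_base(alphabet_size: int) -> float:
    """Maximum information carried by one position, in bits."""
    return math.log2(alphabet_size)


for size in (4, 6):
    print(size, codon_count(size), round(bits_per_base(size), 3))
# 4 64 2.0
# 6 216 2.585
```

Under these assumptions, two extra letters more than triple the number of available codons, which leaves room for amino acids beyond the standard twenty.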



Accordingly, DNA is partly comparable to machine code, as has already been written about on Habr, but it differs from source code primarily in its redundancy, unpredictability, and object-orientedness. The emergence of Cello, a technology that translates source code into DNA nucleotide sequences, therefore looks entirely logical. Those interested can explore the Cello GitHub repository (it uses the Verilog language).



Thus, analogies between DNA and machine code are rather loose, while analogies with source code are so far unconvincing. DNA is much more like a natural language in which a living organism communicates with its environment. But the considerable orderliness and extensibility of the DNA alphabet lend themselves to the creation of a full-fledged programming language based on it and, possibly, to the creation of compilers. Perhaps such a language will relate to DNA as Java or Python relate to English, or it will borrow its syntax from DNA while partially or completely changing the semantics of codons. In addition, given the above, a full-fledged biological programming language should have a self-healing function and, possibly, much more potential for reducing entropy than is inherent in biological life. The genetic code implemented in the Earth's biosphere is extremely interesting as a carrier of information and, most likely, with some refinement and a higher level of abstraction, it will be able to compete with a low-level programming language.



All that remains is to live long enough to see it.


