If you see an article claiming that language X is faster than language Y, you can close that article.





With my liberal-arts brain, I always thought this way: if a programmer knows how to make something faster, then it has to be made faster. A performant solution = the right solution. One programming language can be slower than another, and if that turns out to be the case, that language belongs in the trash.



And surely, if the developer is a performance specialist, he will push for all of these ideas, even the wrong ones.



Naturally, all of this is nonsense, but it's not my place to explain why. So we invited Andrey Akinshin to our podcast: a developer and mathematician, Candidate of Sciences in physics and mathematics, maintainer of BenchmarkDotNet and perfolizer, author of the book Pro .NET Benchmarking, and simply a very, very good engineer.





Below are selected quotes.



It is impossible to foresee everything in benchmarks



A colleague of mine recently had this happen. He was programming in the morning and everything was fine, everything ran fast. At some point everything started to lag: Rider is slow, IDEA is slow, the browser is slow. He couldn't figure out what was going on. And then he realized. He was working on a black laptop that sat by the window. It was fairly cool in the morning, but as the sun came up during the day, the laptop got very hot and went into thermal throttling.



He knows that such a thing exists, knows that the physical environment can affect performance, and so he quickly worked out what was happening. He had a model in his head of how the world works, and within that model he figured out, more or less quickly, what was going on.



That is, the most important skill you can acquire in benchmarking is not knowing absolutely every detail of every runtime and every piece of hardware. The main thing is understanding how to act in order to find the problem, ideally as quickly as possible and with minimal effort.



Let me give an analogy with languages. When you learn your first functional programming language, you have to adjust your view of the world a little: understand the principles of functional programming, how you need to think in general. Then you pick up the next functional language X, and those principles are already in your head. You look at a couple of hello-world examples and start writing too.



At the same time, you may not know some nuances of the language. You may not know how certain syntactic constructs work, but that doesn't bother you much. You feel comfortable and you write. When you hit some puzzling behavior, you read the manual, figure it out, the new fact slots easily into your picture of the world, and you move on. You will never learn all the nuances of all the functional languages in the world, but the general approach stays in your head.



I believe you need to reach a similar level in each area, and then go broad.



At some point in benchmarking I focused specifically on measurement accuracy, on the quirks of particular runtimes, of the hardware, and so on. Then each new case stopped being a revelation, and all performance problems started falling into classes I already knew. So I went broad, in the direction of performance analysis: what to do with the numbers we measured. And this is the area where I have not yet reached the limits of my knowledge. Some things have already become clear to me, but there is still a lot of work ahead: understanding how to apply all of this in practice, which formulas to use and which not to, which approaches are good and which are not.



Benchmarking for the sake of benchmarking is not the best thing to do



There should always be some business performance requirements; you should always understand what you are aiming for. If you don't have business requirements, there is no point in doing performance work either. Accordingly, when there are business requirements, you start to see, at least by eye, which approaches you can use and which you cannot. And if you can't tell by eye, you go and benchmark to check which approaches fit within your requirements.
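
As a rough sketch of what such a check can look like in .NET (my illustration, not code from the podcast; the class name, method names and payload are made up), a minimal BenchmarkDotNet comparison of two candidate implementations might be:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical example: compare two candidate implementations of the same
// operation and see which of them fit the performance requirements.
public class CandidateComparison
{
    private readonly int[] data = new int[10_000];

    [Benchmark(Baseline = true)]
    public long SimpleApproach()
    {
        long sum = 0;
        foreach (var x in data) sum += x;
        return sum;
    }

    [Benchmark]
    public long AlternativeApproach()
    {
        long sum = 0;
        for (int i = 0; i < data.Length; i++) sum += data[i];
        return sum;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<CandidateComparison>();
}

Run it in Release mode; BenchmarkDotNet handles warm-up, iteration counts and statistics, so all that is left is to read the resulting table against the requirements.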



And when you have a set of algorithms, ways of writing the code, designs and so on, and all of them fit within the requirements, then you choose whatever is more consistent with the rest of the project, whatever reflects your views on aesthetics and on how code should be written...



Roughly speaking, if I have at most 10 elements in a collection, and there are two options, write a simple O(n³) algorithm or a very complex O(n log n) one, I will write the simple cubic one that everyone can understand and that will be easy to maintain and modify. Because I understand that it will never blow through my performance limits.



If you wrote a slow solution for a small dataset, then used it on a large dataset, and it didn't have catastrophic consequences (it usually doesn't), well, you go and fix it. But you will now have a model in your head of how to avoid such mistakes in the future.



For example, you can put an assert at the very beginning of the method checking that the number of elements in the collection does not exceed such-and-such a number. Then the next programmer who accidentally tries to misuse your method will immediately get an exception and won't use it that way. Such things come with experience.
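
A minimal sketch of such a guard (my illustration, not code from the podcast; the method and the limit of 16 elements are hypothetical):

using System;
using System.Collections.Generic;

public static class TagFormatter
{
    // Hypothetical helper that is only meant for tiny collections.
    // The guard makes that assumption explicit and fails fast if someone
    // later feeds it a large list (a Debug.Assert would also work here).
    public static string JoinTags(IReadOnlyList<string> tags)
    {
        if (tags.Count > 16)
            throw new ArgumentException(
                "JoinTags is intended for small collections (<= 16 items).", nameof(tags));

        return string.Join(", ", tags);
    }
}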



There is another problem: volatile business requirements. They will definitely change; that is an axiom of our reality, and there is no getting away from it. With experience, you learn to predict by eye where the requirements may change, where it is worth building in a good margin of performance, where the load may grow.



Until you have that intuition, you can proceed by trial and error and see what happens.



You always have a tradeoff between performance and beauty



If you write as efficiently as possible, your code will most likely be just terrible, disgusting. And even if you close your eyes to the aesthetics, it will be difficult to maintain and subtle bugs will keep appearing in it, because the architecture is bad, the code is bad, everything is bad.



I think you need to focus on the current business requirements and, within them, write the cleanest, most understandable, most beautiful, most maintainable code. And the moment performance starts to pinch (or you feel it soon will), that is when you change something.



And even if you concentrate solely on performance, there is no such thing as perfectly optimized, maximally performant code. Taken to its logical end, it means forgetting about C#, forgetting about all the nice languages, and writing straight in machine code, because even assembler is constrained by its syntax. Write the raw bytes directly and you'll squeeze out a bit more performance.



In some cases the fastest code turns out to also be the most beautiful, the most obvious, the most correct. But tradeoffs like this inevitably come up in dozens and hundreds of small places. Say, there is such a thing as array bounds checking. You can agree that the runtime will take care of checking array accesses everywhere, so that if you reach for the element at index minus one, you get an exception instead of reading from some random chunk of memory.



And for this confidence that you will definitely never read from the wrong piece of memory, you pay with a small piece of performance. That is, we spend performance as a resource in order to make the program more stable, understandable and maintainable.
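
A tiny illustration of that trade in C# (mine, not from the podcast): the runtime's bounds check turns a stray index into an exception instead of a read from arbitrary memory.

using System;

public static class BoundsCheckDemo
{
    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };
        int index = -1; // imagine this was computed incorrectly somewhere upstream

        try
        {
            // Every array access is bounds-checked by the runtime;
            // the check costs a little performance but fails loudly here.
            Console.WriteLine(numbers[index]);
        }
        catch (IndexOutOfRangeException)
        {
            Console.WriteLine("Caught an out-of-range access instead of reading random memory.");
        }
    }
}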



A language has no such property as performance



If you see an article claiming that X is faster than Y, you can close the article. A language is a mathematical abstraction, a set of rules by which programs are written and compiled. It has no performance; it is something that exists in your head and takes shape in a text editor.



Performance is a property of specific runtimes, environments, specific programs, specific APIs. When you take all these factors into account, then you can talk about performance. But there is a combinatorial explosion: you cannot say that code in this language is always faster than code in that one, because new versions of hardware and runtimes keep coming out. You will never get through all possible combinations of external factors in your lifetime. And the APIs you use are fundamentally different.



For example, suppose that in some hypothetical language, early in its development, the sorting method was implemented as bubble sort. I don't know, the guys wanted to ship the release as soon as possible and wrote the simplest sort they could. You took that method, used it, and on big data it turned out to be slower than in another language where the sort is a quicksort. Does that mean you can talk about the performance of the languages? No. You can say that this particular API from this language, on this operating system, on this hardware, in this environment, works slower than another API from another language in another environment. That you can say. But it ends up being a very long paragraph of text to state correctly.
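
A small sketch of the same point (my illustration, not from the podcast): the gap comes from the implementation behind the API, not from the language. The bubble sort below stands in for a hypothetical library's naive method.

using System;
using System.Diagnostics;
using System.Linq;

public static class SortApiDemo
{
    // A deliberately naive O(n^2) bubble sort, standing in for a library
    // that shipped the simplest implementation it could.
    private static void BubbleSort(int[] a)
    {
        for (int i = 0; i < a.Length - 1; i++)
            for (int j = 0; j < a.Length - 1 - i; j++)
                if (a[j] > a[j + 1])
                    (a[j], a[j + 1]) = (a[j + 1], a[j]);
    }

    public static void Main()
    {
        var rng = new Random(42);
        var data = Enumerable.Range(0, 20_000).Select(_ => rng.Next()).ToArray();

        var naive = (int[])data.Clone();
        var builtIn = (int[])data.Clone();

        var sw = Stopwatch.StartNew();
        BubbleSort(naive);
        Console.WriteLine($"Bubble sort: {sw.Elapsed}");

        sw.Restart();
        Array.Sort(builtIn); // the runtime's tuned sort
        Console.WriteLine($"Array.Sort:  {sw.Elapsed}");
    }
}

For anything less lopsided than this, you would of course reach for BenchmarkDotNet rather than a bare Stopwatch.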



Loosely speaking, we can say that C++ is faster than JavaScript in most cases. But it would be more correct to say that programmers with solid C++ experience, writing in C++, will likely produce a program that is faster than whatever a JavaScript developer writes to run in a browser.



But even here there are plenty of caveats. What if the guy writing JavaScript disagrees and goes off to redo it in something like WebAssembly? Or finds on GitHub some super interpreter-compiler for JavaScript that handles a very stripped-down subset of JS, three and a half syntax constructs, but produces super-fast native code?



And with that, if you want, you can write code that will outrun C++. You could even write your own JavaScript compiler, designed to compile one single program, that beats C++ on speed. That is, in principle, a valid option.



Social pressure of a popular open source project



With the growth and popularity of a project comes a certain level of responsibility. But you don't actually have any obligations. That fact is not always easy to grasp, especially when all sorts of people come to GitHub and say: “It doesn't work for me here! Fix it urgently! I really need this to work. Go and fix it!” Or some guy opens an issue while I'm on vacation. Three or four days go by, and I haven't even seen that he started anything there. I'm off resting somewhere, and the guy kicks off: “Why the hell aren't you answering me? What kind of community does this project have?! You are all disgusting people, bad things should be done to you! I wasted my time, I wrote to you about what's wrong, and you're doing nothing about it at all, you've been ignoring me for four days! How is that even possible?!”



And the more popular the project, the more social pressure there is from people who believe that open source is a place where other people do your work for you for free. But in fact it is not.



Once you develop immunity to people who want something from you, life becomes much easier. Now I get to BenchmarkDotNet when I have the time and the mood to code. I know there are a lot of bugs there. Most of them are non-critical and concern edge cases: in such-and-such an environment with the latest preview of .NET 5, something somewhere doesn't work. Well, okay, let it not work. When I'm in the mood, I'll go and fix it.



If other people need something, they can fix it themselves and send a pull request; I'll review it when I have the time and the mood.






Watch the entire podcast here.


