A video tutorial on floating point arithmetic in IEEE-754 format. Part I

Floating point arithmetic is not well understood by all programmers. At the IT firms where I used to work, I was surprised to find that even experienced programmers get lost when they have to choose ε for comparing two floating point numbers in code like this:



if (abs(a - b) < EPS) ...





They naively used the same value, 1e-8, in all of their projects, which is a potential source of severe errors. Moreover, they tried to compare two doubles like this:



if (a < b) ...





without understanding why I scolded them when I saw such nonsense. I won't even mention the horror inspired by a constant like 0x400921fb54442d18 (which is just the number π), which can be seen in some older programs or on the debugger screen.
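The pitfall of one fixed EPS for every project can be sketched in a few lines. This is only an illustration under stated assumptions, not the approach taught in the course: the names fixed_eps_equal and nearly_equal and the default tolerances are hypothetical. The idea is that a fixed threshold is too coarse for tiny values and too strict for large ones, while a tolerance scaled by the operands behaves more evenly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// The fixed-threshold test from the text, with the 1e-8 value it mentions.
bool fixed_eps_equal(double a, double b, double eps = 1e-8) {
    return std::fabs(a - b) < eps;
}

// Hypothetical alternative: the tolerance scales with the magnitude of the
// operands; abs_tol is an optional floor for values close to zero.
bool nearly_equal(double a, double b,
                  double rel_tol = 1e-9, double abs_tol = 0.0) {
    assert(rel_tol >= 0.0 && abs_tol >= 0.0);
    if (a == b) return true;                           // includes equal infinities
    if (std::isinf(a) || std::isinf(b)) return false;  // infinite vs. anything else
    const double diff = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(rel_tol * scale, abs_tol); // false when NaN is involved
}
```

With these definitions, 1e-12 and 2e-12 pass the fixed test although they differ by a factor of two, while two adjacent doubles near 1e10 fail it because their spacing is already larger than 1e-8. Which tolerance is appropriate still depends on the task.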











Once, while debugging, a colleague discovered that a double expression did not change when he added a number to it, and he began to blame a bug in the compiler or the debugger, until I explained in which cases the sum of two numbers remains equal to one of the terms...
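The effect described here is easy to reproduce. The sketch below assumes IEEE-754 binary64 doubles with the default round-to-nearest mode; is_absorbed is a hypothetical helper name. When the smaller term is less than half the spacing between neighbouring doubles around the larger one, the sum rounds back to the larger term.

```cpp
#include <cassert>
#include <cmath>

// Returns true when adding `tiny` to `big` leaves `big` unchanged,
// i.e. the smaller term is completely lost to rounding.
bool is_absorbed(double big, double tiny) {
    assert(std::isfinite(big) && std::isfinite(tiny));
    // volatile forces the sum to be stored as a real double, so extended
    // precision registers (e.g. x87) cannot hide the rounding.
    volatile double sum = big + tiny;
    return sum == big;
}
```

For example, is_absorbed(1.0, 1e-17) and is_absorbed(1e17, 1.0) are both true, while is_absorbed(1.0, 1e-15) is false, because 1e-15 is several times larger than the spacing of doubles near 1.0.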



Tricks with reordering the terms of an expression also scare many people, and on StackOverflow I have often seen bewildered questions about why a small change in an expression leads to a different result and why code optimization can lead to a completely wrong answer. Once, a user confidently claimed to have found a bug in the processor: he converted a 64-bit integer to a double precision number and back, and got a different number. He apparently did not know that up to 11 bits can be lost in such a conversion, because a double holds only 53 significant bits. And such ignorance is found even among those who are sure they know exactly how everything works.
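The integer round trip from that anecdote can be sketched as follows. It assumes a binary64 double with a 53-bit significand; round_trip is a hypothetical name. Integers up to 2^53 in magnitude survive the conversion, while larger ones are rounded to the nearest representable double.

```cpp
#include <cassert>
#include <cstdint>

// Converts a 64-bit integer to double and back again.
std::int64_t round_trip(std::int64_t v) {
    // Values within 1024 of INT64_MAX may round up to 2^63, which does not
    // fit into int64_t; converting it back would be undefined behaviour.
    assert(v <= INT64_MAX - 1024);
    // volatile forces a real 64-bit double, even on targets that keep
    // intermediate results in wider registers.
    volatile double d = static_cast<double>(v);
    return static_cast<std::int64_t>(d);
}
```

Here round_trip(1LL << 53) returns its argument unchanged, but round_trip((1LL << 53) + 1) returns 1LL << 53: the lowest bit is already gone. No processor bug is involved.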



Even on Habr, there are often articles in which the authors "expose" the incomprehensible behavior of floating point numbers and pass off their discovery as something new and unknown, non-obvious and mysterious. It is strange to see such articles written by seemingly professional programmers. I will not name them; please look for them yourself.



It would seem that various training materials should solve the problem, but no. As a rule, the reader is frightened off by formulas and unfamiliar terms that the author throws at him immediately (normalized number, hidden bit, biased exponent), and then a dry presentation of the theory begins, with no explanation of why this or that solution appeared in the IEEE-754 standard. Video lectures on the topic do not shine with variety either: everyone seems to analyze the same primitive example, where everything works out quickly and beautifully... and the student will never guess that almost 100% of compilers handle floating point arithmetic with errors. This is understandable: such teachers do not themselves know where things come from and how they work, so they only tell what they understand. It is quite obvious that they discovered the world of floating point arithmetic only yesterday, yet they are already in a hurry to give a confused account of it. I don't judge, but I think such behavior in public is unacceptable.



So I decided to try to fix the situation and created what is, in a sense, a trial training course. This is a video course that smoothly immerses the viewer in the world of floating point arithmetic. In the first four lessons we look at the decimal number system and at how circumstances force us to create one number system or another, eventually arriving at a floating point system. Where does normalization come from and why is it needed? Where does the associativity problem come from? How and why is accuracy lost, and what can be done about it? What is absolutely forbidden, and why? Why do denormalized numbers appear in such a system, and what are they in the first place? Then the next 4 lessons show how all this knowledge fits nicely into binary arithmetic and where the idea of the hidden bit comes from. When the first "terrible formula" appears on the screen, the viewer already has the necessary picture of floating point arithmetic in his head and easily understands the logic of the formula... provided, of course, that he did the exercises correctly. These 8 lessons make up the first part of the course, which is for beginners. The second part will be for advanced programmers and is currently in development.



Why video and not text? The explanation is simple: I am trying different formats, and I find that beginners have difficulty reading. Those who know how to read carefully will open a textbook, read the text with formulas, and understand it. For those who find reading difficult, who are frightened by complicated things and who find it easier to study the material over a cup of tea, a smooth immersion in video format with voiceover is suitable. Too much filler? Possibly, but the course is designed even for people who want to program but were not friends with mathematics at school. So what looks like filler to you is simply what you already know well from school, while many of my students do not. Be lenient with them; we all started somewhere. The video can also be viewed as preparation for reading serious textbooks. You will agree that it is nice to open a textbook and understand what is written in it much faster because the right picture is already in your head.



About myself: I am a former professional teacher who worked at a university for 11 years, teaching mathematics and programming; in recent years I have been developing mathematical libraries for high-performance computing. I understand well what my target audience wants and what is in demand on this topic in the programming world, and therefore I believe I have the right to create such courses. You can see for yourself that lectures of similar quality (in terms of content) cannot currently be found in Russian. Check it out! The first four lessons, from which you will already learn a lot of interesting things, are completely free. If you like them, you can go through the rest, which are even more interesting, for a fee. Other people's work should be respected: I am not selling knowledge, but I need support to continue my educational work, so my time costs money. In general, you can judge the quality of my work from my other articles on Habr.



Those who join our VK community can get a 50% discount by sending me a private message. Please register with ZealComputing School (it's free) and watch the first 4 lessons. They are also on YouTube (the first one is here, and the rest follow via the links in its description). And you don't need to watch the introductory video I link to; it's just an advertisement.



Summary of paid lessons



Lesson number 5: For the first time, we move on to the binary number system. We build a beautiful and simple 6-bit floating point model that is very close to the IEEE-754 format. This is the most important and most difficult lesson. The previous four lessons showed where certain things in floating point arithmetic come from, and now, using a small and easy-to-follow toy example, you see how those things appear naturally in the IEEE-754 format.
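To give a feel for what a 6-bit model might look like, here is a minimal sketch. The layout is purely an assumption made for illustration (1 sign bit, 3 exponent bits with bias 3, 2 fraction bits, IEEE-style subnormals and infinities); the model actually built in the lesson may be organized differently, and decode_toy6 is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Decodes a hypothetical 6-bit float laid out as s eee ff (assumed layout).
double decode_toy6(unsigned bits) {
    assert(bits < 64u);  // only 6 bits are meaningful
    const unsigned sign = (bits >> 5) & 1u;
    const int exponent = static_cast<int>((bits >> 2) & 7u);
    const unsigned fraction = bits & 3u;
    const int bias = 3;  // assumed bias: 2^(3-1) - 1

    double magnitude;
    if (exponent == 7) {
        // All exponent bits set: infinity or NaN, as in IEEE-754.
        magnitude = (fraction == 0)
                        ? std::numeric_limits<double>::infinity()
                        : std::numeric_limits<double>::quiet_NaN();
    } else if (exponent == 0) {
        // Subnormal: no hidden bit, exponent fixed at 1 - bias.
        magnitude = std::ldexp(fraction / 4.0, 1 - bias);
    } else {
        // Normal: hidden leading 1 in front of the two fraction bits.
        magnitude = std::ldexp(1.0 + fraction / 4.0, exponent - bias);
    }
    return sign ? -magnitude : magnitude;
}
```

Under this assumed layout the pattern 001100 decodes to 1.0, 001101 to 1.25, the smallest positive subnormal 000001 to 0.0625, and the largest finite value 011011 to 14.0. Looping over all 64 patterns shows how unevenly the values are spread along the number line.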



Lesson number 6: Introducing rounding. It is not as obvious as in ordinary mathematics. You will learn what is difficult to see in the simplest examples given by other video teachers: converting a number from the decimal number system to the IEEE-754 format is sometimes so difficult that some compilers cannot do it correctly. I will explain in detail why everything is so simple in theory, but not in practice.
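A small sketch shows that rounding already happens when a decimal literal is read. It does not reproduce the hard conversion cases the lesson refers to; it only reveals the binary value that is actually stored. The helper name to_fixed is hypothetical, and the digits shown assume a C library whose printf prints the exact binary value (glibc and recent MSVC runtimes do).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats x with `digits` digits after the decimal point, which exposes
// the binary value hiding behind a decimal literal.
std::string to_fixed(double x, int digits) {
    char buf[512];
    const int n = std::snprintf(buf, sizeof buf, "%.*f", digits, x);
    assert(n > 0 && n < static_cast<int>(sizeof buf));  // output fit the buffer
    return std::string(buf);
}
```

For instance, to_fixed(0.1, 20) gives 0.10000000000000000555 rather than 0.1 followed by zeros, because 0.1 has no finite binary expansion and the nearest double is stored instead. The same rounding is why 0.1 + 0.2 == 0.3 evaluates to false.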



Lesson number 7: Here you fully master the binary32 and binary64 formats (float and double). I show how to display the bit representation of numbers in C++ (this is possible in other languages too, though not in all of them; for those I refer you to Google or Yandex and show how simple it is to find, for example, a Java solution). After this lesson, provided you did the exercises well, the structure of floating point numbers is completely clear to you and raises no questions that were not answered in the previous lessons.
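One common way to look at the bits of a double in C++ is to copy them into an integer with std::memcpy. The sketch below is not necessarily the method shown in the lesson; bits_of, split, Fields and to_hex are hypothetical names, and the code assumes that double is IEEE-754 binary64.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

static_assert(sizeof(double) == sizeof(std::uint64_t), "binary64 expected");

// Returns the raw 64 bits of a double.
std::uint64_t bits_of(double x) {
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof u);  // well-defined byte copy, no aliasing tricks
    return u;
}

// The three fields of binary64: 1 sign bit, 11 exponent bits, 52 fraction bits.
struct Fields {
    unsigned sign;
    unsigned biased_exponent;
    std::uint64_t fraction;
};

Fields split(double x) {
    const std::uint64_t u = bits_of(x);
    Fields f;
    f.sign = static_cast<unsigned>(u >> 63);
    f.biased_exponent = static_cast<unsigned>((u >> 52) & 0x7FFu);
    f.fraction = u & ((std::uint64_t{1} << 52) - 1);
    return f;
}

// Formats the bit pattern as a 16-digit hexadecimal constant.
std::string to_hex(double x) {
    char buf[32];
    const int n = std::snprintf(buf, sizeof buf, "0x%016" PRIx64, bits_of(x));
    assert(n == 18);  // "0x" plus 16 hex digits
    return std::string(buf);
}
```

Calling to_hex(3.141592653589793) yields 0x400921fb54442d18, the constant from the beginning of the article: sign 0, biased exponent 1024 (that is, 2^1), and fraction 0x921fb54442d18.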



Lesson number 8: A practical guide to using floating point arithmetic. It covers some of the features already described as well as new points: loss of commutativity and associativity, and unexpected manifestations of the so-called "equanimity". And the most important tip! This advice will help you avoid almost 100% of mistakes in typical non-critical tasks. Next comes a discussion of the double rounding error and of the catastrophic loss of significant digits: when and how it occurs. In general, this lesson describes all the simple practice that does not require advanced mathematics.
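Two of the effects listed here, loss of associativity and catastrophic loss of significant digits, fit into a few lines. This is an illustrative sketch, not material from the lesson; the function names are hypothetical, and the results assume binary64 arithmetic with round-to-nearest and no value-changing optimizations such as -ffast-math.

```cpp
#include <cassert>
#include <cmath>

// volatile temporaries force each intermediate result to be rounded to
// double, so wider registers cannot mask the effect.
double sum_left(double a, double b, double c) {
    volatile double t = a + b;   // (a + b) + c
    return t + c;
}

double sum_right(double a, double b, double c) {
    volatile double t = b + c;   // a + (b + c)
    return a + t;
}

// Mathematically this returns x, but 1.0 + x is rounded first,
// so for small x most significant digits of x are lost.
double add_then_subtract_one(double x) {
    volatile double t = 1.0 + x;
    return t - 1.0;
}

double relative_error(double computed, double exact) {
    assert(exact != 0.0);
    return std::fabs(computed - exact) / std::fabs(exact);
}
```

With a = 1e17, b = -1e17 and c = 1.0, sum_left gives 1.0 while sum_right gives 0.0, because the 1.0 is absorbed when added to -1e17 first. Likewise, add_then_subtract_one(1e-15) differs from 1e-15 by more than 5%, even though each individual operation is correctly rounded.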



What else is included in the course? Nothing more is needed! You can ask me questions about the lessons, but I am sure none will arise. Each lesson contains comprehensive exercises with answers, so my participation is generally not required, hence the low price. A full course with an instructor, communication, a mentored workshop and live lectures would cost ten times more.



Happy learning!


