This year was ... unusual. We are very glad that it is finally ending, and we hope the next one will be happier for everyone. Following our little tradition, we are wrapping up the year on the blog with our top-rated posts. In addition, we list the champions by comments, additions to favorites, and views among the posts that did not make the first shortlist.
Rating champions
Zip files: history, explanation and implementation (AloneCoder)
I had long wondered how data is compressed, including in Zip files. One day I decided to satisfy my curiosity: learn how compression works and write my own Zip program. The implementation turned into an exciting programming exercise. It is a lot of fun to build a finely tuned machine that takes data, shifts its bits into a more efficient representation, and then puts it back together. I hope you find it interesting to read about, too.
Boxed Cats, or Compact Data Structures (Serine)
What if the search tree has grown to fill all of RAM and is about to uproot the neighboring racks in the server room? What should you do with a resource-hungry inverted index? Should you give up on Android development when the user gets a "Phone memory is full" message while the application is barely halfway through loading an important container?
More generally, is it possible to compress a data structure so that it takes up noticeably less space without losing its inherent advantages, so that access to a hash table remains fast and a balanced tree retains its properties? Yes, it is! This is the subject of succinct data structures, a field of computer science that studies compact representations of data structures. It has been developing since the late 1980s and is flourishing right now, in the era of big data and high load.
What I, as a developer, learned from accidents in space (AloneCoder)
Andrey Sitnik, author of PostCSS and Autoprefixer, compiled a selection of stories from the Soviet Union's space exploration. You will find out what lessons Andrey drew from them in order to grow as a developer and as a participant in the open source movement. A failed docking, a dramatic re-entry, and a unique transfer along a handrail between spaceships - what does all this have to do with modern web development? Read the post to find out!
Habr Converter: making layout easy (AloneCoder)
Surely many of you have used, at least once, the habraconverter that is officially recommended by the Habr administration - https://shirixae.github.io/habraconverter-v2/. It was created several years ago by the Habr user meta4 and later modified by Shirixae. The principle is simple: open the Google document with the post, press Ctrl-A and Ctrl-C, and paste the text into the converter window. Press the "Convert" button and you get ready-made layout code that can be pasted into the Habr editor and published. Before publishing, you only need to go through it and fix a few small things.
And everything would be fine if you did not have to typeset too often, or if the posts were small and uncomplicated. But if you typeset a lot, and the posts contain pictures, tables, and pieces of code, then time after time you have to do the same routine: insert the necessary blank lines and remove unnecessary ones, replace the <source> tags with <code>, and so on. We decided to spend a day now so that later the work would take an hour, and improved the converter.
SHISHUA: the world's fastest pseudo-random number generator (AloneCoder)
Half a year ago I wanted to create the best pseudo-random number generator (PRNG), with an unusual architecture. I thought the beginning would be easy and that the task would slowly get harder as the work went on. And I wondered whether I could learn everything quickly enough to cope with the hardest parts.
How pipelines are implemented in Unix (AloneCoder)
This article describes the implementation of pipelines in the Unix kernel. I was somewhat disappointed that a recent article titled "How Do Pipes Work in Unix?" was not about their internal structure. I got curious and dug into old sources to find the answer.
About one vulnerability in ... (z3apa3a)
On March 21, 2019, a very good bug report from maxarr came in to the Mail.ru bug bounty program on HackerOne. When a zero byte (ASCII 0) was embedded in a POST parameter of one of the webmail API requests that returned an HTTP redirect, the redirect data exposed chunks of uninitialized memory containing fragments of GET parameters and headers of other requests to the same server.
This is a critical vulnerability, because requests include session cookies. A few hours later, a temporary fix was deployed that filtered out the zero byte. As it turned out, this was not enough: there was still the possibility of CRLF injection (ASCII 13, 10), which allows manipulating the headers and data of the HTTP response. That is less critical, but still unpleasant. At the same time, the problem was handed over to security analysts and developers to find and eliminate the root causes of the bug.
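The kind of filtering mentioned above can be illustrated with a minimal sketch. It is not the fix that was actually deployed, which the text does not show; the function name `is_safe_header_value` and the set of rejected characters are assumptions made for illustration. The idea is simply to reject any value destined for a response header, such as a redirect target, if it contains a zero byte, a carriage return, or a line feed.

```python
# Hypothetical helper, not the actual Mail.ru fix: reject values that
# contain a zero byte (ASCII 0) or CR/LF (ASCII 13, 10) before they are
# placed into an HTTP response header such as Location.
FORBIDDEN_CHARS = ("\x00", "\r", "\n")


def is_safe_header_value(value: str) -> bool:
    """Return True if the value contains none of the forbidden characters."""
    return not any(ch in value for ch in FORBIDDEN_CHARS)


if __name__ == "__main__":
    for candidate in ("/inbox?folder=1", "/inbox\x00", "/inbox\r\nSet-Cookie: x=1"):
        print(repr(candidate), is_safe_header_value(candidate))
```

Checking for all three characters at once covers both the zero byte from the original report and the CRLF case that the first temporary fix missed.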
ZFS: architecture, features and differences from other file systems (gmelikov)
I, Georgiy Melikov, am a contributor to the OpenZFS and ZFS on Linux projects. I also work on IaaS development in the Mail.ru Cloud Solutions team. Although our division does not use ZFS in production, the hosts of the SDCast podcast invited me to talk about it. This article grew out of that episode, and here you can listen to the audio version.
So, today I am talking about ZFS: how the file system works, what components it consists of, and which new features have appeared in recent releases or will appear soon.
Why we chose MobX over Redux, and how to use it more efficiently (ngOo)
My name is Nazim Gafarov, I am an interface developer at Mail.ru Cloud Solutions. The year is 2020, and we continue to discuss the "innovations" of ES6 syntax and the advantages of MobX over Redux. There are many reasons to use Redux in your project, but since I don't know any, I'll tell you why we chose MobX.
How UUIDs (AloneCoder)
You've probably already used UUIDs in your projects and assumed they were unique. Let's look at the main aspects of the implementation and see why UUIDs are only practically unique: there is a tiny possibility of the same value occurring twice.
The modern implementation of UUIDs can be traced back to RFC 4122, which describes five different approaches to generating these identifiers. We'll go over each of them and walk through the implementation of version 1 and version 4.
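To make "practically unique" concrete, here is a minimal sketch using Python's standard `uuid` module. It is not the implementation walked through in the article. The helper `collision_probability` is a hypothetical name, and it relies on the standard birthday-bound approximation together with the fact that a version 4 UUID carries 122 random bits (128 bits minus 4 version bits and 2 variant bits).

```python
import math
import uuid

# A version 4 UUID has 128 bits, of which 4 encode the version and
# 2 encode the variant, leaving 122 random bits.
RANDOM_BITS_V4 = 122


def collision_probability(n: int, bits: int = RANDOM_BITS_V4) -> float:
    """Birthday-bound approximation of the chance that at least two of
    n random identifiers collide: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


v1 = uuid.uuid1()  # version 1: timestamp plus node identifier
v4 = uuid.uuid4()  # version 4: random bits

if __name__ == "__main__":
    print(v1, "version", v1.version)
    print(v4, "version", v4.version)
    # Probability of any collision among one billion version 4 UUIDs.
    print(collision_probability(10**9))
```

Under this approximation, even a billion random UUIDs collide with a probability far below anything observable in practice, which is why they are treated as unique.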
Comment champions
The math of climbing the wealth ladder (randall)
From a programmer's point of view, the average monthly salary in Russia, 44 thousand rubles, falls below all expectations of wealth and success. But where do ideas about success, financial prosperity, and the ways of achieving them come from?
How would your life change if you were given 10 thousand rubles? What about 1 million rubles? Or 100 million rubles? The answer is not as simple as it seems, and it depends on your age, marital status, and current savings. More importantly, the way your behavior changes once you have that amount can tell a lot about your current financial situation.
Stewart Butterfield, creator of Flickr and Slack, developed this idea into a distinctive "pyramid of wealth" concept, which leads to the paradoxical conclusion that even big money does not necessarily improve your life in any measurable way.
Cyberbullying: why people become observers (kseniaegorova)
We conducted a large-scale study of the behavior of people who take the position of an observer when they witness aggression directed at others. The results of the 2019 study showed that 60% of RuNet users are such observers. This year we learned why observers stay inactive, what makes them help victims of aggression, and whether their actions differ between the Internet and real life.
The research was carried out in September-October 2020 in collaboration with Research.me, the Mail.ru Group UX laboratory, and UXSSR. Here you can download the full research results. Some of them are very sad for our society. The second quarantine, the economic crisis, the approach of winter - none of this helps people to be kinder and more tolerant of each other. The survey confirms this: half of the respondents believe that the level of aggression in society has increased during the pandemic. Moreover, aggression on the Internet is often unjustified. Curiously, people consider rudeness and insults on the Internet unacceptable, yet they are ready to use them in self-defense. The analogy with physical assault is straightforward.
What programming language to learn so that HR of large companies hunt for you (DmtrKzmn)
Ten years ago, the PHP programming language was at the top of the ratings, and now web projects are increasingly written in JavaScript and Python. It would be a shame to spend a year or two learning a language and then end up unemployed.
At Mail.ru Cloud Solutions, we studied analytics, research, opinions of developers and large employers about which programming languages will be in demand in the coming years. And we tell you what to focus on when choosing.
Favorites champions
How to Make Your Life Easier Using Git (and Deep Dive Kit) (pxeno)
For those who use Git every day but still feel insecure about it, the Mail.ru Cloud Solutions team has translated an article by front-end developer Shane Hudson. Here you will find some tricks and tips that can make working with Git a bit easier, as well as a selection of more advanced articles and tutorials. Git came out almost 15 years ago. During this time, it went from underdog to undefeated champion. New projects today often start with the git init command. Undoubtedly, this is an important tool that many of us use on a daily basis, but it often resembles magic - bright but dangerous. Many articles have been published on Habr about how to get started with Git, how Git works under the hood, and which branching strategies are best. Here, the author focuses on how to make working with Git easier.
Self-development: how I did not sit on two chairs and found a third (EdT)
I lead the anti-spam team at Mail.ru Group, as well as several machine learning teams. The topic of this article is self-development for team leads and managers, although many of the techniques and recipes do not depend on the role at all. For me, this question is very relevant: machine learning is developing extremely rapidly, and just to stay up to date you need to spend a lot of time. So the question of how, and on what, to spend the time set aside for development is quite acute.
The content of the article is, of course, not the ultimate truth, but a description of the results of my ongoing quest. It outlines the approaches that worked for me, based on books and training, on trial and error. I will be glad to discuss it with you in the comments.
Python - (pxeno)
It happens that a company is looking for a data scientist when what it actually needs is a Python developer. So when preparing for an interview, it makes sense to brush up on Python, and not just study the algorithms.
The Mail.ru Cloud Solutions team translated an article by a developer who has repeatedly found himself in this situation and, based on his experience, compiled a list of 53 questions and answers for interview preparation. Most data scientists write a lot of code, so the list is useful for both data scientists and engineers. It will come in handy for job seekers, interviewers, and those who are just learning Python.
View champions
Contemplation of the great fractal similarity (randall)
Fractals are not just a beautiful natural phenomenon. According to our research, viewing fractal structures increases stress resistance, measured on the basis of physiological parameters, by 60%. Within just one minute of contemplating fractals, alpha-wave activity in the frontal cortex of the brain increases, as it does during meditation or mild drowsiness.
Not surprisingly, fractal biodesign has a calming effect on humans. We like to look at clouds, at the flames in a fireplace, at the foliage in a park ... How does it work? Scientists suggest that the natural pattern of our eyes' search movements is itself fractal. When the dimension of the eye-movement trajectory matches that of the fractal object, we fall into a state of physiological resonance, which activates certain parts of the brain.
But not all fractals are created equal. In this article, we will talk about the fractal dimension and its impact on health.
Detecting COVID-19 on X-rays with Keras, TensorFlow and Deep Learning (AloneCoder)
In this tutorial, we use Keras, TensorFlow, and deep learning to automatically detect COVID-19 in a manually collected dataset of X-ray images.
Mnemonics: exploring methods for improving memory (randall)
Some people are simply born with a good memory. So there seems to be no point in competing with genetic "mutants" and exhausting yourself with training, memorizing poems and inventing associative stories. If everything is written in the genome, you cannot rise above your limits.
Indeed, not everyone can build memory palaces like Sherlock and visualize any sequence of information. If you tried the basic techniques listed in the Wikipedia article on mnemonics and did not succeed, there is nothing wrong with that: for an overworked brain, memorization techniques become an overwhelming task.
However, it's not all bad. Scientific research shows [1] that some mnemonics can literally change the physical structure of the brain and improve memory management skills. Many of the world's most successful mnemonists, who compete in professional memory competitions, started learning in adulthood and were able to greatly enhance their mental abilities.
Thanks to everyone who read this far. And Happy New Year!