The task was to analyze and count the flow of clients in an office. There are many approaches to similar tasks, for example detectors based on convolutional neural networks (CNNs) such as YOLO (You Only Look Once), SSD (Single Shot MultiBox Detector), and R-CNN. But since the input data were video fragments of varying resolutions and formats, depending on the recorder model and its settings, I decided to try the background subtraction method. I also wanted to try this algorithm because I hadn't come across it before and was curious what it is capable of.
As the name suggests, this method relies on a model of the background. The idea is to compare each new frame with the previous ones and look for changes. If the background has not changed, or has changed only slightly (swaying foliage, drifting clouds, etc.), the method will not select those areas of the frame. Background subtraction comes in many variants, each of which detects changes in its own way. Some algorithms are very sensitive: light rain or slight movement of tree crowns in the wind will show up on the mask. Others build masks very coarsely, merging many pixels into one object, so that two people walking side by side are detected as one person. It is therefore important to choose the right algorithm for your task and to try different settings (the number of frames used for comparison, the threshold for cutting off areas, etc.).
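To make the principle concrete, here is a minimal sketch of the simplest variant: a running-average background compared against each new frame. It uses only NumPy and is not the algorithm used in the project. The function name `running_average_masks` and the values of `alpha` and `threshold` are assumptions chosen for illustration.

```python
import numpy as np


def running_average_masks(frames, alpha=0.05, threshold=25):
    """Yield a binary foreground mask (0 or 255) for each grayscale frame.

    alpha: how quickly the background absorbs new frames (illustrative value).
    threshold: minimum absolute intensity change counted as motion (illustrative value).
    """
    background = None
    for frame in frames:
        frame = frame.astype(np.float32)
        if background is None:
            # The first frame seeds the background model, so nothing is foreground yet.
            background = frame.copy()
        # Pixels that differ strongly from the background are marked as foreground.
        diff = np.abs(frame - background)
        mask = (diff > threshold).astype(np.uint8) * 255
        # Blend the current frame into the background so slow changes are absorbed.
        background = (1.0 - alpha) * background + alpha * frame
        yield mask


# Synthetic example: a static dark scene, then a bright 10x10 "object" appears.
empty = np.full((60, 80), 20, dtype=np.uint8)
with_object = empty.copy()
with_object[20:30, 30:40] = 200

masks = list(running_average_masks([empty, empty, with_object]))
```

A larger `alpha` makes the background adapt faster, which suppresses slow changes such as drifting light but also makes a person who stops moving fade into the background sooner.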
There are also various settings within the algorithm itself that can improve quality, making the final mask much better suited to identifying objects.
After tuning these settings and writing additional code to highlight the changing areas and then count the client flow, I achieved a good result in counting the number of people, especially considering that this was my first experience with the OpenCV (cv2) Python library and that no neural networks were used.
Unfortunately, this method has its drawbacks: it picks up some artifacts, and it has limited functionality and a narrow scope of use. But as a way to gain experience and get familiar with the capabilities of computer vision, it was an excellent opportunity.
I suggest drawing on my experience of using open source tools and services to solve computer vision problems.