Introduction
These days, as it turns out, segmentation can be solved in no time, although I had expected it to be something complicated and unusual. All you need to do is read a couple of articles, install a few libraries and label the data. The whole thing takes a couple of hours, not counting the creation of a test set.
Data labeling
In terms of manual work, this is one of the most labor-intensive steps in segmenting the frames of a video stream.
It requires a dedicated program for annotating video frames. We open the video file in that program and label the frames, having first created the classes into which the images will be segmented. In my introductory example, the frames are divided into the following classes: Car, Road, Pedestrian Crossing, Lawn, Buildings, People, Sidewalk, Road Markings.
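Most segmentation tools store labels as integer IDs rather than names, so the class list above could be encoded roughly as follows. This is only a minimal sketch: the ID order and the names `CLASSES`, `CLASS_TO_ID`, `ID_TO_CLASS` and `encode_labels` are my assumptions for illustration, not something produced by the annotation program.

```python
# Class names are taken from the article; the integer IDs and their
# order are an assumption made for illustration.
CLASSES = [
    "Car", "Road", "Pedestrian Crossing", "Lawn",
    "Buildings", "People", "Sidewalk", "Road Markings",
]

# Map each class name to an integer ID and back.
CLASS_TO_ID = {name: idx for idx, name in enumerate(CLASSES)}
ID_TO_CLASS = {idx: name for name, idx in CLASS_TO_ID.items()}


def encode_labels(named_mask):
    """Convert a 2D grid of class names into a 2D grid of integer IDs."""
    return [[CLASS_TO_ID[name] for name in row] for row in named_mask]


# A tiny 2x2 "frame" labeled by name, converted to IDs.
tiny_mask = encode_labels([["Car", "Road"], ["People", "Lawn"]])
```

Keeping the mapping in one place makes it easier to stay consistent between the labeling step and training.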
In fact, regions such as the road, buildings, lawns and sidewalks do not need to be recognized at all: the camera here is rigidly fixed, so these areas will always be in the same place.
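One way to exploit the fixed camera, sketched here under my own assumptions, is to label the static regions once and then overlay only the moving classes predicted for each frame. The label IDs, the `BACKGROUND` value and the function `merge_masks` are hypothetical names for illustration; this is not the approach actually used in this article.

```python
# Hypothetical label IDs; only the class names come from the article.
BACKGROUND = 0  # "nothing dynamic detected at this pixel"
CAR = 1
PEOPLE = 2
ROAD = 3
LAWN = 4


def merge_masks(static_mask, dynamic_mask, background=BACKGROUND):
    """Overlay per-frame dynamic labels onto a fixed static label map.

    Both masks are 2D lists of the same shape. Wherever the dynamic mask
    holds a non-background label it wins; elsewhere the static label is kept.
    """
    if len(static_mask) != len(dynamic_mask):
        raise ValueError("masks must have the same number of rows")
    merged = []
    for static_row, dynamic_row in zip(static_mask, dynamic_mask):
        if len(static_row) != len(dynamic_row):
            raise ValueError("mask rows must have the same length")
        merged.append([
            d if d != background else s
            for s, d in zip(static_row, dynamic_row)
        ])
    return merged


# Static layout, labeled once for the fixed camera view.
static_mask = [
    [LAWN, ROAD, ROAD],
    [LAWN, ROAD, ROAD],
]
# Per-frame prediction containing only the moving classes.
dynamic_mask = [
    [BACKGROUND, CAR, BACKGROUND],
    [PEOPLE, BACKGROUND, BACKGROUND],
]
merged = merge_masks(static_mask, dynamic_mask)
```

With this split, the network would only have to tell cars and people apart from everything else, at the cost of redoing the static map if the camera is ever moved.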
Training a neural network for image segmentation
[The body of this section did not survive in the source. Recoverable fragments: the values 4 and 60; DeepLab v3+ (ResNet-18); references to Figs. 2 to 9; the percentages 3% and 54%.]
P.S. Does anyone know what software can record video from a YouTube stream as simply as possible? The cameras write their data to a circular buffer (the last 12 hours) in the form of a YouTube stream, which works as video surveillance where every resident of the building can view the last 12 hours.