Stabilizing video from a moving camera, or how to translate everything into a fixed coordinate system

Computer vision (CV) is reshaping the public safety solutions market. Traditional video surveillance no longer surprises anyone, and it would be strange not to find it in a public place, yet the use of AI in this area is still a novelty.



We are investigating how CV can be applied to various public safety tasks. In this post, we describe one way to translate video from a moving camera into a fixed coordinate system for further analysis.



The entire project is on GitHub.






Let's say we have some kind of video and we want to build a fixed coordinate system for it in order to evaluate the location of objects relative to each other.



Why is this needed? Very often, in public surveillance tasks, the video that needs to be analyzed is filmed with a moving camera. Because of this, several problems arise in determining the position of objects relative to each other:



  • It is not clear what caused an object's coordinates to change: movement of the camera or movement of the object itself;
  • When the camera rotates and the scene changes, different objects can end up with the same coordinates, even if the objects are static.


image

Figure 1 - Identical objects have different coordinates due to camera movement



In order to build a fixed coordinate system, you must:



  1. Determine the origin of coordinates;
  2. Compare two consecutive frames with each other;
  3. , , (, , ..).


image

2 —



:



  1. .
  2. : , . . . SIFT, SURF ORB. , . , , , .




Figure 3 - matching visualization



  1. , .




:



image



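  • a, e - scaling along x and y, respectively;
  • b, d - shear along the axes (together with a and e, they determine rotation);
  • c, f - translation along the axes;
  • g, h - perspective change.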


, , . (x,y) (x',y') :



image

:



\[ t \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = H \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (1) \]
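As a minimal sketch of how relation (1) is used in practice, the snippet below maps a point (x, y) to (x', y') with a 3x3 homography in plain Python. It assumes the usual matrix layout with rows (a, b, c), (d, e, f), (g, h, 1). The function name `apply_homography` and the example matrix are hypothetical and are not taken from the project.

```python
def apply_homography(H, x, y):
    """Map (x, y) through the 3x3 matrix H and return (x', y')."""
    # Multiply H by the homogeneous column vector (x, y, 1).
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    t = H[2][0] * x + H[2][1] * y + H[2][2]
    if t == 0:
        raise ValueError("point maps to infinity (t == 0)")
    # Divide by the scale factor t to return to pixel coordinates.
    return xh / t, yh / t


# Hypothetical matrix: scale by 2 and shift by (1, -1), no perspective.
H_example = [
    [2.0, 0.0, 1.0],
    [0.0, 2.0, -1.0],
    [0.0, 0.0, 1.0],
]

print(apply_homography(H_example, 3.0, 4.0))  # prints (7.0, 7.0)
```

The division by t is what distinguishes a homography from an affine transform: when g and h are zero, t stays equal to 1 and the division has no effect.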



:



k- .

N(f1,..., fN). . matching points , fk fk-1.



:



— ;

(Xk, Yk)=((x1k, y1k),…, (xnk, ynk)) – n matching points;

(X'k, Y'k) =((x'1k, y'1k),…, (x'nk, y'nk)) – n matching points ;

(X''k, Y''k) =((x''1k, y''1k),…, (x''nk, y''nk)) – k — n matching points , fk-1.

Hk – , fk-1 fk.

, .



(Xk, Yk) (X'k, Y'k). f1 fk , .. . Hk.



, (H1,…, Hk-1). Hk (Xk-1, Yk-1) (Xk, Yk), , .



3:



image



3 — ,



, . a :

x1k= x1k-1a, , a : x'1k = x1ka, 3. , , .



?

(H1,…, Hk-1). , 1 k-1 mathcing points fk-1 . (1), , — .



\[ H_{sup} = H_1 (H_2 (H_3 \cdots)) \qquad (2) \]



, , , fk-1 fk, : (Xk-1, Yk-1) (Xk, Yk) ( (2)), (X'k-1, Y'k-1) (X''k, Y''k) Hk. , , (x1k, y1k) (x'1k, y'1k).



\[ t \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = H_{sup} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (3) \]
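The accumulation in (2) and its use in (3) can be sketched in plain Python as follows. The sketch assumes the per-frame matrices are multiplied left to right in the order written in (2). The helper names `matmul3`, `compose_homographies` and `to_fixed_coordinates`, and the two translation matrices, are hypothetical illustrations rather than the project's API.

```python
from functools import reduce


def matmul3(A, B):
    """Return the product of two 3x3 matrices given as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def compose_homographies(homographies):
    """Multiply H_1, H_2, H_3, ... left to right, as written in (2)."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return reduce(matmul3, homographies, identity)


def to_fixed_coordinates(H_sup, x, y):
    """Apply (3): multiply by H_sup, then divide by the scale factor t."""
    xh = H_sup[0][0] * x + H_sup[0][1] * y + H_sup[0][2]
    yh = H_sup[1][0] * x + H_sup[1][1] * y + H_sup[1][2]
    t = H_sup[2][0] * x + H_sup[2][1] * y + H_sup[2][2]
    return xh / t, yh / t


# Two hypothetical frame-to-frame shifts: 2 px along x, then 3 px along y.
H1 = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]

H_sup = compose_homographies([H1, H2])
print(to_fixed_coordinates(H_sup, 1.0, 1.0))  # prints (3.0, 4.0)
```

Keeping a running product means each new frame costs one 3x3 multiplication, instead of re-applying every earlier matrix to every point.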



: , ( , , .. ), - , . .

:



  • "" matching points ((x1k, y1k),… ,(x'nk, y'nk)),
  • H, k- k-1 .
  • ((x'1k, y'1k),… ,(x'nk, y'nk))
  • :

    • , ;
    • . , ;
    • - ( LENGTH_ACCOUNTED_POINTS len(matching points)), , , , .


, . .



"" , . , , , , . T , . , motion video segmentation.





.

GitHub , .



  • evenvizion_component.py
  • evenvizion_visualization.py
  • compare_evenvizion_with_original_video.py


evenvizion_component.py

, evenvizion_component.py. , json , fk-1 fk. , json , . , , .



- , json --path_to_original_coordinate recalculated_coordinates.json , .

json :



{"frame_no": [{"x1": x coordinate, "y1": y coordinate}, ...], ...}
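A file in this format could be read along the following lines. This is only a sketch: it assumes that frame numbers are stored as numeric strings (JSON keys are always strings) and that every point uses the keys "x1" and "y1" shown above. The function name `parse_coordinates` and the sample data are hypothetical.

```python
import json


def parse_coordinates(json_text):
    """Turn the JSON text into {frame_no: [(x, y), ...]}."""
    raw = json.loads(json_text)
    return {
        # Assumption: frame numbers are numeric strings such as "0", "1".
        int(frame_no): [(float(p["x1"]), float(p["y1"])) for p in points]
        for frame_no, points in raw.items()
    }


# Hypothetical sample: one point in frame 0, two points in frame 1.
sample = (
    '{"0": [{"x1": 10, "y1": 20}],'
    ' "1": [{"x1": 1.5, "y1": 2.5}, {"x1": 3, "y1": 4}]}'
)

coords = parse_coordinates(sample)
for frame_no in sorted(coords):
    print(frame_no, coords[frame_no])
```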

evenvizion_component.py , 3 ( matching and heatmap --show_matches --visualize_fixed_coordinate_system ).



evenvizion_visualization.py compare_evenvizion_with_original_video.py .



README.



, .



:



matching points — matching visualization:



image

Figure 5 - matching visualization



.

, , (heatmap visualization):



image

Figure 6 - heatmap visualization



20 , , . , . : r=sqrt(x2+y2), heatmap_constant , : 0 — , 1 — .





Figure 7 - fixed_coordinate_system_visualization



json , , fixed_coordinate_system_visualization ( 7).



evenvizion_visualization.py compare_evenvizion_with_original_video.py , ( ). 8 9 .



image

Figure 8 - visualize_camera_stabilization



image

Figure 9 - original_video_with_EvenVizion



Known issues



N/a . matching points , , 90 , . video motion segmentation, , , static points motion points. — .



. 4 matching points, , 4 , =None. : none_H_processing True, : Hk=Hk-1. False, H — , . .
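The fallback for a missing homography can be sketched as below. It covers only the case stated above, where none_H_processing is True and Hk = Hk-1. For False, the sketch simply passes None through and leaves the decision to the caller, which is an assumption and not the project's behaviour. The names `MIN_MATCHING_POINTS`, `has_enough_points` and `resolve_homography` are hypothetical.

```python
# At least 4 matching points are needed to estimate a homography.
MIN_MATCHING_POINTS = 4


def has_enough_points(matching_points):
    """Return True if a homography can be estimated from these points."""
    return len(matching_points) >= MIN_MATCHING_POINTS


def resolve_homography(current_H, previous_H, none_H_processing=True):
    """Choose the matrix to use for frame k when estimation may have failed."""
    if current_H is not None:
        return current_H
    if none_H_processing:
        # H_k = H_{k-1}: reuse the matrix from the previous frame.
        return previous_H
    # Assumption: this sketch does not reproduce the False branch;
    # it returns None so that the caller decides what to do.
    return None


# Hypothetical previous matrix: the identity, i.e. no camera motion.
previous = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(has_enough_points([(0, 0), (1, 1), (2, 2)]))  # prints False
print(resolve_homography(None, previous) is previous)  # prints True
```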



. . . :



  • . , , (, ).
  • findHomography() opencv. .




Thus, we get a component that lets us estimate the real position of objects relative to each other by translating object coordinates into a coordinate system that is stationary relative to the frame. Because the core of this solution is estimating the transformation between planes from key points, the problem can, as shown above, be solved even in poor shooting conditions (sharp camera movement, difficult weather conditions, shooting at night, etc.).



