Computer vision offers a way to measure the distance to an object without depth sensors or stereo cameras. In this article, I use that method to determine the position and speed of an overhead crane trolley.
Because the trolley is equipped with encoders, I can show how accurately this method, which is based on similar triangles, works. The article shows how to measure distance with a single camera and how this can be applied to practical tasks.
This was the topic of my master's thesis, which I wrote two years ago.
Equipment
The study was carried out in the industrial laboratory of OOO PO Privod-Avtomatika in Magnitogorsk, which has a girder crane installed that simulates the operation of a real overhead crane.
During the experiment, a video of the crane's movement was recorded while readings were taken from the encoders at the same time. The speed graphs were obtained using SoMove software from Schneider Electric.
Video was recorded on a Canon EOS 1200D camera with a resolution of 1920x1080.
To detect the trolley, a graphic marker with a drawn rectangle and a circle inside is used. This solution is not ideal: before the experiment I should have studied marker design more carefully. But with the help of contour analysis (area and aspect ratio constraints), I was still able to detect the desired rectangle. I will also add that if the object itself is easy to detect and its physical size can be measured accurately, a graphic marker is not needed.
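The area and aspect ratio constraints mentioned above can be sketched roughly as follows. This is only an illustration, not the code from the thesis: the function names and all threshold values are hypothetical, and the bounding boxes are assumed to come from OpenCV contours as shown in the comment.

```python
def is_marker_candidate(w, h, min_area=500.0, max_area=50000.0,
                        target_aspect=1.5, aspect_tol=0.2):
    """Return True if a w x h pixel bounding box passes the area and
    aspect-ratio checks. All thresholds here are assumed values."""
    if w <= 0 or h <= 0:
        return False
    area = w * h
    aspect = w / h
    return (min_area <= area <= max_area
            and abs(aspect - target_aspect) <= aspect_tol * target_aspect)


def filter_boxes(boxes, **limits):
    """Keep only the (x, y, w, h) boxes that look like the marker rectangle."""
    return [b for b in boxes if is_marker_candidate(b[2], b[3], **limits)]


# In a real pipeline the boxes would come from OpenCV (4.x), for example:
#   contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
#   boxes = [cv2.boundingRect(c) for c in contours]
# Here a few made-up boxes stand in for detected contours.
boxes = [(10, 10, 5, 5), (100, 80, 150, 100), (0, 0, 400, 20)]
candidates = filter_boxes(boxes)
```

With these made-up boxes, only the second one survives: the first is too small, and the third is far too elongated. In practice the thresholds would have to be tuned to the marker size and the camera distance.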
How it works
The distance calculation is based on the similarity of the triangles that meet at the lens aperture.
Let's measure the area of the marker on paper with a ruler and the area of the marker in the frame with the OpenCV library. Knowing the focal length, we can then calculate the distance to the object.
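As a minimal sketch of the similar-triangles relation, assuming an ideal pinhole camera, a marker roughly perpendicular to the optical axis, and a focal length expressed in pixels: the linear version uses the marker width, and the area version takes a square root because area scales with the square of the linear size. The function names and all numbers below are hypothetical.

```python
import math


def focal_length_px(known_distance, real_width, pixel_width):
    """Calibrate the focal length (in pixels) from one frame taken
    at a known distance to the marker."""
    return pixel_width * known_distance / real_width


def distance_from_width(f_px, real_width, pixel_width):
    """Similar triangles: distance / real_width = f_px / pixel_width."""
    return f_px * real_width / pixel_width


def distance_from_area(f_px, real_area, pixel_area):
    """Same relation for areas; the ratio of areas is the squared scale."""
    return f_px * math.sqrt(real_area / pixel_area)


# Hypothetical calibration: a 0.20 m wide marker appears 150 px wide at 2.0 m.
f_px = focal_length_px(known_distance=2.0, real_width=0.20, pixel_width=150.0)

# Later the same marker appears 75 px wide (and 37.5 px tall for a 0.10 m height).
d_width = distance_from_width(f_px, real_width=0.20, pixel_width=75.0)
d_area = distance_from_area(f_px, real_area=0.20 * 0.10, pixel_area=75.0 * 37.5)
```

Both estimates come out at about 4 m here: the marker looks half as wide, so it is twice as far away as in the calibration frame.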
Experiment
A digital camera is installed in front of the overhead crane trolley, at a distance sufficient for its field of view to cover the crane's entire working area.
Installation diagram. View from above.
Two test videos were recorded: movement along the x-axis (backward, away from the camera) and movement along the y-axis (first to the left, then to the right). The position and time values are written to a NumPy array and then exported to Matlab, where the displacement graphs S(t) are built.
Differentiating the data gives the speed graphs V(t).
x_veloc = np.diff(x_position) / np.diff(time_mas)
The displacement plots contain slight noise caused by inaccurate edge detection and uneven illumination, and differentiation greatly amplifies it.
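The effect is easy to reproduce on synthetic data. The sketch below is not the experimental data: it assumes a frame rate of 25 fps, a constant speed of 0.5 m/s, and 2 mm of Gaussian position noise, and reuses the variable names from the line above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed values: 25 fps, 4 s of motion at 0.5 m/s, 2 mm position noise.
time_mas = np.arange(100) / 25.0
true_speed = 0.5
x_clean = true_speed * time_mas
x_position = x_clean + rng.normal(0.0, 0.002, size=time_mas.size)

# Finite-difference velocity, as in the article.
x_veloc = np.diff(x_position) / np.diff(time_mas)

position_error = np.std(x_position - x_clean)   # spread of position noise, m
velocity_error = np.std(x_veloc - true_speed)   # spread of velocity noise, m/s
```

Each velocity sample is the difference of two noisy positions divided by a small time step, so millimetre-level position noise turns into velocity noise of several centimetres per second.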
Let's smooth out the noise with a first-order filter in Matlab and compare the encoder readings with the camera readings.
The graphs show how accurately distance can be measured with a single camera.
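The filtering in the article was done in Matlab and its parameters are not given here, so the following is only a Python analogue of a discrete first-order low-pass filter. The time constant, the frame rate, and the synthetic signal are all assumptions.

```python
import numpy as np


def first_order_lowpass(signal, dt, time_constant):
    """Discrete first-order low-pass filter:
    y[k] = y[k-1] + alpha * (x[k] - y[k-1]), alpha = dt / (time_constant + dt)."""
    alpha = dt / (time_constant + dt)
    out = np.empty(len(signal), dtype=float)
    out[0] = signal[0]  # start from the first sample to avoid a startup jump
    for k in range(1, len(signal)):
        out[k] = out[k - 1] + alpha * (signal[k] - out[k - 1])
    return out


# Synthetic noisy speed signal (assumed: 25 fps, 0.5 m/s, 0.07 m/s noise).
rng = np.random.default_rng(1)
dt = 1.0 / 25.0
noisy_speed = 0.5 + rng.normal(0.0, 0.07, size=200)
smooth_speed = first_order_lowpass(noisy_speed, dt, time_constant=0.2)
```

A larger time constant gives a smoother curve but also delays the response to real changes in speed, which matters when the result is compared against the encoder readings.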
To reduce the noise level and get smoother graphs, a second version of the marker with a visor and local lighting was developed.
In theory, this should reduce noise and thereby increase measurement accuracy. Unfortunately, we haven't been able to try out the new version of the marker yet.
For those interested in learning more about object tracking based on contour analysis, there is a good article: "Estimating the accuracy of tracking methods for determining 2d coordinates and velocities of mechanical systems from digital photography data".
In this article, I described the simplest method for measuring the distance to an object and showed what measurement accuracy it can achieve. Thank you all for your attention.