Computer vision guards privacy


Prevention is everything. Competent protection against data leaks helps avoid consequences that can run into multi-million dollar losses. Every organization today processes and stores confidential information, and in large organizations the volumes involved are enormous. A computer is truly "secure" only in the conceptual ideal where every user follows all of the information security rules.



Any office employee leaves his computer from time to time, and the running machine is then left unattended, often with files and folders open for access: many employees simply forget to lock their PC, or skip it on purpose with the reasoning, "I only stepped five meters away, why lock it?!" Unfortunately, such moments can be exploited by other employees interested in those materials.


How can you ensure data security in such situations? One option is biometric authentication technology that recognizes users by their faces.



Face recognition is not a new concept, and there are many tools available for the task today. If you are not particularly versed in face recognition methods and tools, then OpenCV (the Open Source Computer Vision Library) together with the Python programming language is an excellent choice that will get you to the goal as quickly as possible.


The software is settled; one more piece of hardware is needed - a webcam. So how do you set up face recognition? First, the face has to be detected in the frame. One method for detecting faces is the Viola-Jones method, described back in 2001. Since we are more interested in practice, we will skip the theoretical details and only mention that the method rests on such basic principles as transforming the image into an integral representation, the sliding-window approach, and Haar features. A description of the method and instructions for installing OpenCV can be found on the official website. Below is Python code that detects faces in a webcam video stream:



import cv2 as cv
import numpy as np

# Path to the XML file with the Haar cascade for frontal faces
face_cascade = 'haarcascade_frontalface_default.xml'
# Camera index: 0 if only one camera is connected
cam = 1

classifier = cv.CascadeClassifier(face_cascade)

video_stream = cv.VideoCapture(cam)
while True:
    retval, frame = video_stream.read()
    if not retval:  # stop if the frame could not be read
        break
    # The classifier works on grayscale images
    gray_frame = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    # Returns a list of rectangles (x, y, w, h), one per detected face
    found_faces = classifier.detectMultiScale(gray_frame, 1.2, 5, minSize=(197, 197))

    for x, y, w, h in found_faces:
        # Draw a blue frame around each detected face (for debugging)
        cv.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)

    cv.namedWindow('Stream', cv.WINDOW_NORMAL)
    cv.imshow('Stream', frame)

    # Exit on Esc
    k = cv.waitKey(30) & 0xFF
    if k == 27:
        break

video_stream.release()
cv.destroyAllWindows()


In the code above we first import the two modules we need: cv2 (computer vision) and numpy (matrix operations). The face_cascade variable stores the path to the XML file with the Haar cascade. This file, along with others (for example, for eye detection), can be found on the GitHub page.



The cam variable holds the number of the webcam in case several are connected; a single connected camera has the number 0 by default. Next, a classifier object is created with the CascadeClassifier method, and the connection to the camera is opened with cv.VideoCapture(cam). Then, in a loop, we read images frame by frame into the frame variable using the read() method. The classifier processes images in grayscale, so we convert each frame to the required form with the cvtColor method. The detectMultiScale method returns the parameters of all detected faces as rectangles: the vertex coordinate (x, y) plus the width and height (w, h). The following lines are optional for the program to work but useful for debugging: the rectangle method draws a frame on the source image at the location of the detected face, and imshow displays a window with the video stream.



So far everything is quite simple, but it gets more interesting. Now we need to recognize the face. How? OpenCV contains several face recognition methods, among them LBPH (Local Binary Patterns Histograms). Let us dwell on it in a little more detail and see how it works.



In essence, the brightness value of an image pixel is taken together with the eight pixels surrounding it, giving a 3x3 table of brightness values. Then 0s and 1s are written into the same table: if the brightness of an outer pixel exceeds the brightness of the central one, a 1 is set, otherwise a 0. The resulting code is read clockwise starting from the upper-left cell, converted to a decimal number, and that number is written to the corresponding position of a matrix the size of the image. And so on for every pixel.
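To make the operator concrete, below is a minimal sketch of the described 3x3 computation; the lbp_code function and the sample patch values are illustrative assumptions, not part of OpenCV:

import numpy as np

def lbp_code(patch):
    # LBP code of a 3x3 brightness patch; the central pixel is patch[1, 1]
    center = patch[1, 1]
    # Neighbours are read clockwise starting from the upper-left cell
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = ''.join('1' if patch[r, c] > center else '0' for r, c in order)
    return int(bits, 2)

patch = np.array([[120, 200,  90],
                  [ 80, 100, 150],
                  [ 60,  95, 110]])
print(lbp_code(patch))  # '11011000' -> 216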


The matrix is then divided into a number of segments (by default an 8x8 grid), a histogram is built for each segment, and finally the histograms are concatenated into a single descriptor characterizing the whole image. During recognition, the same kind of histogram is computed for the image under study and compared against the training data.
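In code, this step could look roughly like the sketch below; the lbph_descriptor helper is a hypothetical name used for illustration (OpenCV performs all of this internally):

import numpy as np

def lbph_descriptor(lbp_image, grid=(8, 8)):
    # Split the LBP-coded image into grid cells, build a histogram per cell,
    # and concatenate the histograms into a single descriptor
    hists = []
    for row in np.array_split(lbp_image, grid[0], axis=0):
        for cell in np.array_split(row, grid[1], axis=1):
            hist, _ = np.histogram(cell, bins=256, range=(0, 256))
            hists.append(hist)
    return np.concatenate(hists)  # grid[0] * grid[1] * 256 values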


This is the method we will use, but first there is another important step: building a database of faces. In general, the structure of the database looks like this:



--Storage\\
	--Person_1\\
		--img_1.jpg
		--img_2.jpg
		…
		--img_n.jpg
	…
	--Person_m\\
		--img_1.jpg
		--img_2.jpg
		…
		--img_n.jpg


OK, we have a database of face images. Now the database needs to be processed to train the algorithm. The processing boils down to building a list of all the images and a list of ids (labels), one per person. The simplest code for this might look like this:



import os

storage = 'storage'   # root folder with one subfolder per person
images = []           # face images
labels = []           # the person's numeric label for each image
names = {}            # label -> person's name (subfolder name)
person_id = 0

for subdir in os.scandir(storage):
    if not subdir.is_dir():
        continue
    names[person_id] = subdir.name
    subdir_path = os.path.join(storage, subdir.name)

    for file in os.scandir(subdir_path):
        # Skip everything that is not an image
        if file.name.split('.')[-1].lower() not in ['jpg', 'jpeg', 'png', 'bmp', 'gif']:
            continue
        file_path = os.path.join(subdir_path, file.name)
        # Read the image in grayscale as a matrix of pixel brightness values
        image = cv.imread(file_path, 0)
        images.append(image)
        labels.append(person_id)
    person_id += 1


Here, too, everything is fairly simple. The storage variable stores the path to the folder containing the per-person folders of face images; then come a list for the images, a list for the labels, and a dictionary of names. It works like this: all images from a given person's folder are added to the list of images, say 15 of them. If this is the first folder in the storage, the label is 0, so 0 is appended to the list of labels 15 times; at the same time, a record is created in the dictionary of names where the key is the label and the value is the person's name (the name of the folder with that person's images). And so on for the entire storage. Note the line with the imread method: there the image is read, represented as a matrix of pixel brightness values, and stored in the image variable.



Now the fun part - training the algorithm:



# Parameters: radius, neighbours, grid cells horizontally and vertically, threshold
recognizer = cv.face.LBPHFaceRecognizer_create(1, 8, 8, 8, 130)
recognizer.train(images, np.array(labels))


In the first line we initialize the algorithm with the LBPHFaceRecognizer_create method (note that the face module ships with the opencv-contrib-python package rather than the base OpenCV distribution). Remember the description of the LBPH algorithm? The parameters of this function are exactly what we talked about: the radius of the circle around the target pixel, the number of pixels sampled along that circle, the number of grid segments horizontally and vertically, and the threshold affecting the recognition decision - the stricter the requirements, the lower the threshold should be. Next we call the train method, passing the lists of images and labels as arguments. The algorithm has now memorized the faces from the database and can recognize them. Little remains: add a few lines to the first piece of code inside the loop (for x, y, w, h in found_faces). After detecting a face we need to recognize it, and if it is not recognized, or someone else's face is recognized, immediately lock the computer!
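Incidentally, so that training does not have to be repeated on every program start, the trained model can be saved to disk and loaded back with the recognizer's write and read methods; a minimal sketch, with lbph_model.yml as an assumed file name:

recognizer.write('lbph_model.yml')  # save the trained model

# ... later, restore it without retraining
recognizer = cv.face.LBPHFaceRecognizer_create(1, 8, 8, 8, 130)
recognizer.read('lbph_model.yml')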



# Fragment of the grayscale frame containing the detected face
roi = gray_frame[y:y+h, x:x+w]
# predict returns the matched label and a distance measure;
# with a threshold set, the label is -1 when nothing matches closely enough
label, dist = recognizer.predict(roi)
person = names.get(label, 'Unknown')
cv.putText(frame, '%s - %.0f' % (person, dist), (x, y), cv.FONT_HERSHEY_DUPLEX, 1, (0, 255, 0), 3)
if person != 'Ivan Petrov':  # the PC owner's name (a folder name in the storage)
    ctypes.windll.user32.LockWorkStation()


In the first line, roi (from "region of interest") is a variable holding the fragment of the image that contains the detected face. The next line performs the recognition itself via the predict method. It returns the label of the recognized face and a value characterizing the degree of discrepancy between the detected face and the recognized one (the smaller it is, the higher the confidence that the right face was recognized). Next, again for debugging, we add text with the recognized person's name to the frame using the putText method. Finally, we check the simplest condition: if someone other than the PC's owner was recognized (this is where the dictionary of names comes in), lock the computer. The line ctypes.windll.user32.LockWorkStation() is responsible for the locking; for it to work, the ctypes module must be imported alongside cv2 and numpy.



As a result, the PC is locked as soon as another person's face is recognized, or a face is not recognized at all. Once the PC is unlocked, the program keeps running. You could also lock the PC on the event of the employee leaving the workplace. It does not take long to see that there are quite a few nuances here. For example, what if another face is recognized in the background? In that situation you can set a minimum size for an object that looks like a face, so that faces in the background are ignored (the detectMultiScale method has a minSize parameter for this). Good solutions can be found for other situations as well.
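For example, instead of the hard-coded (197, 197) used above, minSize could be tied to the frame size; the 0.4 factor here is purely an illustrative assumption:

# Ignore detections smaller than ~40% of the frame height, so faces
# of people standing further away in the background are not considered
min_side = int(gray_frame.shape[0] * 0.4)
found_faces = classifier.detectMultiScale(gray_frame, 1.2, 5,
                                          minSize=(min_side, min_side))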



One of the most important factors for the locking to work correctly is the quality of the images in the photo database. First, it is desirable to have many of them per person, with different angles and facial expressions. Second, lighting makes its own adjustments, so use images taken under different lighting conditions. Third, record the images with the webcam that sits at the employee's workplace; OpenCV also has a method for saving images. As for the code, it will surely grow and gain functionality, so it can be "wrapped" in functions or classes. There is no limit to perfection! The main thing is to remember the order of the program's steps: processing the database of photographs, training, detection, recognition.
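Collecting such a database can reuse the detection code above together with cv.imwrite; a minimal sketch for one frame, assuming the storage layout described earlier and the owner's folder name from the lock condition:

import os

person_dir = os.path.join('storage', 'Ivan Petrov')
os.makedirs(person_dir, exist_ok=True)

# Save every face fragment found in the current frame as a training image
for i, (x, y, w, h) in enumerate(found_faces):
    cv.imwrite(os.path.join(person_dir, 'img_%d.jpg' % i),
               gray_frame[y:y+h, x:x+w])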



At a webinar on 03.09.2020 at 10:00 Moscow time, the speakers will present a practical method for training a neural network for object detection, with the source code and technologies used, and will answer your questions. You can register at the link: newtechaudit.ru/vebinar-computer-vision

