Have you ever thought about the fact that the age of custom cryptography is irrevocably gone? No, I'm not saying that encrypted messengers and cryptocurrencies are out of favor today. I'm talking about good old remote banking services (RBS), in other words, bank-client systems. Some 10 years ago, any self-respecting bank client consisted of software and a token holding a digital signature. Today that is almost a rarity: everyone is switching to SMS confirmations, which are certainly an order of magnitude more convenient than fiddling with crypto-provider settings and installing digital signature certificates.
What's the matter? Has convenience really outweighed security in an area like finance? No: it turns out that the security of both options is equally low. How can that be? It is clear that SMS authentication is far from ideal, since an SMS can be intercepted and replaced, but with a token the cryptography lives on a hardware carrier! The point is that any security system is only as strong as its weakest link.
Alice and Bob
I will describe the problem using the example of Alice and Bob. Imagine that Alice has a smartphone in one hand, with a payment document displayed on its screen. In the other hand she holds a wireless token with a digital signature. Alice needs to sign an electronic payment order; to do this, it is enough to pair the token with the smartphone and click "sign".
Everything seems simple and safe enough: both devices belong to Alice and both are in her hands. But let's say there is Bob, who decides at this moment to intercept Alice's document and send another document to her token, for example a payment order in which a large amount goes from Alice's account to an unknown account.
Here we will not consider the various scenarios for doing this; it is enough to list the potentially unsafe areas where Bob can carry out the intended substitution. The potential attack surface includes the smartphone's operating system, processor, video processor, RAM and wireless communication channel. Having remotely taken control of the smartphone, Bob can attack any of the listed areas according to roughly the following algorithm: detect that the token gateway is open for receiving data, intercept Alice's document on its way from RAM to the wireless transmitter, send his own document to the token gateway, receive the signed document from the smartphone's wireless receiver and send it to a specific host. At the same time, he displays a message on the smartphone screen saying that Alice's document was signed successfully. The attack can also be carried out on the wireless communication channel itself, but in general the algorithm will be the same.
Someone may ask: what about cryptography, is it really impossible to use encryption to protect against Bob? Encryption can be used, of course, but unfortunately the keys cannot be hidden on the smartphone from Bob, who has taken control of that very smartphone. And even if the encryption keys are moved off the device, this will not save the situation, since it only makes sense to encrypt an electronic document just before sending it to the token, to protect the wireless channel. All other operations on the document inside the smartphone are carried out in the clear. In this situation, the user can only rely on an up-to-date antivirus and the bank's anti-fraud system.
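To make the weak link concrete, here is a toy model of the substitution described above. It is only a sketch under my own assumptions: an HMAC stands in for the token's real digital signature, and all names (`TOKEN_KEY`, `token_sign`, `compromised_phone`) are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret that never leaves the token (stand-in for a signing key).
TOKEN_KEY = b"key-that-never-leaves-the-token"


def token_sign(document: bytes) -> bytes:
    """The token has no screen, so it signs whatever bytes arrive."""
    return hmac.new(TOKEN_KEY, document, hashlib.sha256).digest()


def compromised_phone(alice_document: bytes, bob_document: bytes):
    """Bob swaps the document somewhere between the screen and the token."""
    signature = token_sign(bob_document)            # the token receives Bob's document
    screen_message = b"Signed: " + alice_document   # Alice is shown her own document
    return screen_message, signature


alice_doc = b"Pay 100 to Carol"
bob_doc = b"Pay 100000 to Bob"
screen, sig = compromised_phone(alice_doc, bob_doc)
```

The signature is perfectly valid, only for the wrong document, and nothing on the screen tells Alice so.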
About antifraud
I would like to say a few words about this controversial technology. Anyone who has come across it will understand me. Any payment that looks suspicious from the bank's point of view is instantly blocked, and the account is blocked along with it, followed by a multi-day marathon of collecting documents and proving to the bank that you are a good citizen who wanted to transfer money to another good citizen. But that is an aside, inspired by recent events from my own experience.
MITM, or rather MID
So we have a kind of man-in-the-middle (MITM) threat, in this case in the middle between the smartphone screen and the token with the digital signature. But unlike the classic man-in-the-middle attack, it cannot be neutralized by cryptographic methods. I don't know if this type of attack has its own specific term; we called it "Man In Device" (MID), and that is what I will call it from here on.
The growing ability of attackers to remotely take control of someone else's computer or smartphone opens up prospects for all kinds of hacker attacks. The greatest damage can be caused by attacks related to electronic signatures: substituting a document when it is signed and substituting a document when it is viewed. For example, you are shown an electronic passport, electronic power of attorney, electronic ID, electronic ticket, etc. on a smartphone. The authenticity of the document is confirmed by an electronic signature, but how can you quickly and reliably verify that the electronic document really is signed and that the signature complies with the standard? I hope everyone understands that a stamp with the words "signed by a qualified electronic signature" has nothing to do with the electronic signature itself, and in Photoshop it can be pasted onto any electronic document.
It's one thing if your job requires you to constantly check the authenticity of such electronic documents. In that case, you should have a certified tool for receiving, viewing and checking electronic documents, for example a QR scanner connected to a computer with pre-installed certified software.
But if you are an ordinary user, or your organization has no need to perform these operations regularly, and all you have for checking an electronic document is an ordinary smartphone, then you risk becoming the victim of an attack in which the document is substituted on the screen of your device.
What's on the market?
Are there any methods and devices for neutralizing the Man In Device attack today? Yes, such devices exist: they are user terminals of the Trust Screen class.
Their principle of operation is to physically prevent an attacker from getting into the device itself. In fact, this is the same smartphone, but with pre-installed certified software, integrated software for working with a digital signature, and no connection to the outside world. It has only two functions: to accept an electronic document of a certain format, display it, sign it and return it with a signature; or to accept a signed document, display it and show a message about the validity of the digital signature. In general, it is convenient, safe and reliable, but I would like to have something more compact for such cases, ideally something that does not need charging and is always at hand.
With this in mind, a year and a half ago we started brainstorming in search of a new solution to the Man In Device threat.
Back to Alice and Bob
To make the solution we found clearer, let us return once again to the threat model with Alice and Bob. Alice still holds a smartphone in one hand, no matter what brand, model or operating system. Let's call it the untrusted device. In her other hand, Alice holds a device with a digital signature. We will assume that this is a certified trusted device manufactured according to all the canons of information security, and that Bob cannot hack it. But Bob has easily got into Alice's smartphone and is ready to carry out the Man In Device attack on it.
You may notice that so far the situation is no different from the one described above, in which Bob easily pulls off the document substitution. So where is the solution?
The solution we found, which we called the "trusted viewing effect", does not neutralize these attacks, but it lets us guarantee that the correct document is displayed on the smartphone screen. The principle is similar to the idea of quantum cryptography, in which the communication channel is not protected, but the parties are guaranteed to be able to detect an attempt to intercept the key. It is the same in our technology: we do not attempt the impossible task of protecting communication channels, the operating system, the processor and so on, but our solution can reliably detect the substitution or modification of an electronic document on any untrusted device, be it a smartphone or a personal computer.
To understand how our technology works, imagine once again that, to sign a document, a file with the document is sent from the smartphone to the trusted device. Along the way, the attacker has plenty of opportunities to replace the document or the data in it. And here the main question arises: how can you make sure that a tiny trusted device, which has no screen to display the received document, has received the correct document? In our solution, we send the document back from the trusted device and display its image on the smartphone screen again, but inside the trusted device the document image is first supplied with security marks generated in a special way. The technology creates the "trusted viewing effect" by relying on the principles of visual cryptography, an exotic branch of modern cryptography.
About visual cryptography
One of the best-known methods belongs to Moni Naor and Adi Shamir, who developed it in 1994. They demonstrated a visual secret sharing scheme in which an image is divided into n parts so that only someone holding all n parts can reveal the image, while any n-1 parts give no information about the original image.
Connecting separated secrets allows you to see hidden information
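For readers who have not met the idea before, here is a minimal sketch of the simplest case, a 2-out-of-2 scheme in the spirit of Naor and Shamir. The details are my simplification: each secret pixel becomes two subpixels in each share, and the function names are hypothetical.

```python
import secrets

# Each secret pixel is expanded into 2 subpixels; these are the two possible patterns.
PATTERNS = [(1, 0), (0, 1)]


def make_shares(secret_row):
    """Split a row of 0/1 pixels into two shares that look random on their own."""
    share1, share2 = [], []
    for pixel in secret_row:
        first = secrets.choice(PATTERNS)
        # White pixel: both shares get the same pattern.
        # Black pixel: the second share gets the complementary pattern.
        second = first if pixel == 0 else (1 - first[0], 1 - first[1])
        share1.extend(first)
        share2.extend(second)
    return share1, share2


def stack(share1, share2):
    """Overlaying transparencies: a subpixel is black if it is black in either share."""
    return [a | b for a, b in zip(share1, share2)]


def decode(stacked):
    """Two black subpixels mean a black secret pixel, one means white."""
    return [1 if stacked[i] + stacked[i + 1] == 2 else 0
            for i in range(0, len(stacked), 2)]


secret = [1, 0, 1, 1, 0]
s1, s2 = make_shares(secret)
recovered = decode(stack(s1, s2))
```

Each share on its own is just a sequence of randomly chosen patterns; the secret appears only when the two are stacked.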
Our technology works the other way around: three secrets are combined into a single whole in such a way that none of the secrets can be isolated within a strictly limited time. Only splitting the whole back into its three original parts lets you see the information hidden in them.
3 secrets are hidden here
secrets are divided
Let's analyze the crypto scheme of the algorithm
The plaintext consists of three black-and-white images; each pixel is either 0 (transparent) or 1 (black).
When images are superimposed on each other, the pixels merge according to the following principle:
1) 0 + 0 + 0 = 0
2) 1 + 0 + 0 = 1 or 0 + 1 + 0 = 1 or 0 + 0 + 1 = 1
3) 1 + 1 + 0 = 1 or 1 + 0 + 1 = 1 or 0 + 1 + 1 = 1
4) 1 + 1 + 1 = 1
Thus, merging the images loses some information (lines 2 to 4: a black pixel in the result can come from any of these combinations), so none of the images can be restored separately without a key. This is the encryption stage.
Now I'll show you how to generate the decryption key, that is, the key that lets you decompose the picture back into its original components.
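The merging rule above is simply a pixel-wise logical OR. A minimal sketch, with images as nested lists of 0/1 and names of my own choosing:

```python
def merge_layers(*layers):
    """Overlay equally sized 0/1 images: a pixel is black if any layer is black."""
    return [[int(any(pixels)) for pixels in zip(*rows)] for rows in zip(*layers)]


# Three tiny hypothetical layers, 2 rows by 3 columns each.
text = [[1, 0, 0], [0, 1, 0]]
grid = [[1, 1, 0], [0, 0, 0]]
code = [[0, 1, 0], [0, 1, 0]]

merged = merge_layers(text, grid, code)

# A completely different set of layers produces the same merged image,
# which is exactly the information loss the encryption relies on.
blank = [[0, 0, 0], [0, 0, 0]]
same_result = merge_layers(merged, blank, blank)
```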
For example, we have before us three fragments of the original images, superimposed on each other:
The encrypted fragment of the image contains black and transparent pixels and carries less information than the three original images.
To create a decryption key, you go through the black pixels of the encrypted image and, for each one, record the pixel values at the same coordinates in each of the original images.
The resulting key will contain the coordinates of the black pixels in the encrypted image and their original content.
In the given example, the following key will be obtained:
a1: 110, c2: 011, d2: 111, b4: 010, a5: 110, c5: 111
where a1 are the coordinates of the black pixel in the encrypted image, and 110 are the colors of the pixels with the same coordinates in the original images.
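Here is a sketch of how such a key could be built and used, taking the key from the example as test data. The representation is my assumption: each layer is a set of black-pixel coordinates, and the function names are hypothetical.

```python
# The key from the example: coordinate -> pixel values in layers 1, 2 and 3.
EXAMPLE_KEY = {"a1": "110", "c2": "011", "d2": "111",
               "b4": "010", "a5": "110", "c5": "111"}


def merge(layers):
    """Encryption: the merged image is black wherever any layer is black."""
    return set().union(*layers)


def make_key(layers):
    """For every black pixel of the merged image, record its value in each layer."""
    return {coord: "".join("1" if coord in layer else "0" for layer in layers)
            for coord in sorted(merge(layers))}


def split(merged_black, key):
    """Decryption: use the key to restore the three layers from the merged image."""
    layers = [set(), set(), set()]
    for coord in merged_black:
        for index, bit in enumerate(key[coord]):
            if bit == "1":
                layers[index].add(coord)
    return layers


encrypted = set(EXAMPLE_KEY)             # black pixels of the encrypted fragment
layers = split(encrypted, EXAMPLE_KEY)   # the three original fragments
```

With this key, the first layer comes out as a1, d2, a5, c5, the second as a1, c2, d2, b4, a5, c5, and the third as c2, d2, c5.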
This crypto algorithm is essentially a one-way function with a key.
About cryptographic strength
Let's calculate the cryptographic strength of this algorithm. Each black pixel is the result of merging pixels from three layers. As shown above, there are 7 options for such a merge; I will give them again: 100, 010, 001, 110, 101, 011, 111.
Thus, in a small fragment of the image given in the example, consisting of 6 black pixels, there are 7 to the 6th power or 117,649 options for decomposition into original images ...
It is easy to estimate that if, for example, 1000 such pixels appear in a full-size image, then the number of options will be 7 to the 1000th power, which is a number with 846 decimal digits.
It is clear that with such cryptographic strength, brute force is out of the question. But a would-be attacker has other methods of cryptanalysis.
Given that image recognition systems based on neural networks have made great strides recently, a potential attacker could well use them to recognize the authenticity code in the combined image. However, in this case the tool will be powerless, for three reasons.
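The arithmetic is easy to check. A small sketch (the function name is mine):

```python
def decompositions(black_pixels: int) -> int:
    """Each black pixel can come from 7 different combinations of the three layers."""
    return 7 ** black_pixels


small = decompositions(6)                   # the 6-pixel fragment from the example
large_digits = len(str(decompositions(1000)))  # size of the number for 1000 pixels
```

For the fragment this gives 117,649, and for 1000 black pixels the count is a number 846 decimal digits long.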
First, the authenticity code is displayed in a deformed form, as a captcha. Moreover, the position of the captcha and its angle of inclination are also random.
Secondly, the captcha is mixed with the image of the security grid, which has a random pattern, and with the text of the document.
And thirdly, the time the attacker has for a recognition attempt is extremely short, 1-2 seconds. That is how much time passes between the moment the user presses the button on the trusted device and the moment the image appears on the screen of the untrusted device.
Try to recognize a code word mixed with the security grid and part of the document text
After the security grid is faded
The same fragment after the layer with the security grid is partially faded - the code word is easily recognized
Is everything so smooth?
However, an overview of the technology would be incomplete without mentioning a potential loophole that a hacker could exploit. The fact is that one of the three layers in the combined image is known to the attacker Bob in advance: the image with the text of the document. Naturally, Bob also has an image of his own document, with which he is going to replace Alice's. And then Bob's task becomes much simpler. Knowing one of the three secrets, the layer with the text image, he can subtract a mask of this secret from the combined image. What remains is an image with 2 layers: the security grid and the authenticity code. The resulting image is then merged with the image of the text of the original document. And that's all - the problem of hacking is solved.
However, it is not quite that simple: in the areas where text pixels overlap pixels of the security grid and/or the authenticity code, visible gaps will remain after the mask is subtracted. But if the original text differs from the fake one by only a few characters in one local fragment, then once the residual image of the security grid and authenticity code is laid over the original text, the gaps will disappear almost everywhere and remain only in the small fragment where the substitution was made. So Alice may not even notice the document substitution.
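The subtraction trick and its residual gaps can be sketched with sets of black-pixel coordinates. This follows my reading of the scenario (the token rendered Bob's document, and Bob wants the screen to show Alice's); the pixel values and names are made up for illustration.

```python
def forge_display(combined, token_text, shown_text):
    """Subtract the known text layer, then overlay the text Alice expects to see."""
    residual = combined - token_text    # grid + code, minus whatever overlapped the text
    return residual | shown_text


def visible_gaps(displayed, grid, code):
    """Grid or code pixels that went missing and were not covered up again."""
    return (grid | code) - displayed


# Hypothetical one-row fragment; the two texts differ in a single pixel.
alice_text = {(0, 0), (1, 0), (2, 0), (3, 0)}
bob_text = {(0, 0), (1, 0), (2, 0), (6, 0)}
grid = {(1, 0), (5, 0), (6, 0)}
code = {(4, 0)}

combined = bob_text | grid | code       # what the trusted device sends back
displayed = forge_display(combined, bob_text, alice_text)
gaps = visible_gaps(displayed, grid, code)
```

Here the only gap is at (6, 0), the one place where the two texts differ; the grid pixel at (1, 0) was also removed by the subtraction, but Alice's text covers it again.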
Solution
To rule out hacking by the scenario above, a fairly simple solution is proposed. Inside the trusted device, the image of the document text is first slightly deformed using random values. The easiest way to deform it is to change the width of each of the 4 margins on each page of the document.
Each margin changes by its own random value, either increasing or decreasing. It is enough to change the margins by a few pixels, and the attack becomes visible to the naked eye as residual artifacts: fragments of characters left hanging after the mask is subtracted. If the document contains lines, as in the example below, the line images are not affected when the text is deformed.
Smooth fading of the security grid lets you make sure that the text of the document does not change.
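A rough sketch of why a few pixels are enough. As a simplification of my own, changing the left and top margins is modeled as shifting the whole text layer by a random nonzero offset; the names and the shift range are hypothetical.

```python
import secrets

SHIFTS = [-3, -2, -1, 1, 2, 3]  # hypothetical margin changes, in pixels


def warp_margins(text_pixels):
    """Shift the text layer by random nonzero offsets, as the trusted device would."""
    dx = secrets.choice(SHIFTS)
    dy = secrets.choice(SHIFTS)
    return {(x + dx, y + dy) for x, y in text_pixels}


def leftover(warped_text, attacker_mask):
    """Text pixels that survive when Bob subtracts his unshifted mask."""
    return warped_text - attacker_mask


# A vertical stroke of one character, 5 pixels tall.
stroke = {(10, y) for y in range(5)}
warped = warp_margins(stroke)
artifacts = leftover(warped, stroke)
```

Bob's mask is built from the undeformed text, so it no longer lines up with what the device rendered, and the leftover pixels are exactly the artifacts Alice sees.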
For ease of understanding, an example with black and white images is shown. But with a certain modification of the algorithm, it can be applied to color images as well.
Now a few words about the user's role in this technology. Using visual cryptography means that the user's own eyesight is the main instrument for detecting the presence or absence of a Man In Device attack. In fact, we are talking about visual validation, which consists of two successive operations performed by the user:
1. Comparing the authenticity code on the indicator of the trusted device with the authenticity code in the captcha on the document image
(As the trusted device, we propose a special plastic card with an authenticity code indicator and "confirm" and "cancel" buttons.)
2. Comparing the text that appears as the contrast of the layers with the security grid and the authenticity code gradually decreases with the original text of the document
The last operation relies on a person's ability to notice the slightest change in the overall picture. Just as a hunter can spot moving game through the branches of the forest from the faintest outline, the user can clearly see that the gradually emerging text of the document does not change as the contrast of the other two layers decreases, and that it matches the original in content.
It is visual validation that creates the final trusted viewing effect.
P.S. In the next part I will try to talk about the problems I had to face while developing a prototype, how some of the complex calculations can be moved from the trusted device to a trusted server without compromising security, and whether we managed to find an investor for the project.