Browser fingerprinting: what it is, how it works, whether it is legal, and how to protect yourself. Part 1



From Selectel: this is the first in a series of translations of a very detailed article on browser fingerprints and how the technology works. It collects everything you wanted to know about the topic but were afraid to ask.



What are browser fingerprints?



Browser fingerprinting is a method that sites and services use to track visitors. Each user is assigned a unique identifier (a fingerprint) built from a large amount of information about the settings and capabilities of the user's browser. In addition, fingerprinting lets sites track behavioral patterns so that they can later identify users even more accurately.



A browser fingerprint is about as unique as a real fingerprint. The difference is that real fingerprints are collected by the police to find crime suspects, whereas browser fingerprinting is not aimed at tracking criminals. We're not criminals here, are we?



What data does a browser fingerprint include?



It has been known since the dawn of the Internet that a person can be tracked by IP address. Browser fingerprinting is more sophisticated. A fingerprint includes the IP address, but that is far from its most important component. In fact, the IP address is not needed to identify you at all.



According to a study by the Electronic Frontier Foundation (EFF), a browser fingerprint includes:



  • User-agent (including not only the browser, but also the OS version, device type, language settings, toolbars, etc.).
  • Timezone.
  • Screen resolution and color depth.
  • Supercookies.
  • Cookie settings.
  • System fonts.
  • Browser plugins and their versions.
  • Visit log.


According to the EFF study, browser fingerprints are highly unique: only about one browser in 286,777 has exactly the same fingerprint as another user's browser.
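
To put the one-in-286,777 figure in perspective, it can be converted into bits of identifying information. The snippet below is only a worked calculation; the function name is made up for illustration, and the only input is the number quoted above.

```python
import math

# Figure quoted above: roughly one browser in 286,777 shares a fingerprint.
ONE_IN = 286_777


def bits_of_entropy(one_in: int) -> float:
    """Bits of identifying information carried by a 1-in-N fingerprint."""
    return math.log2(one_in)


if __name__ == "__main__":
    # Prints a value a little above 18 bits.
    print(f"{bits_of_entropy(ONE_IN):.2f} bits")
```

In other words, such a fingerprint carries a little over 18 bits of information, enough to tell apart a few hundred thousand browsers.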



According to another study, the accuracy of user identification using a browser fingerprint is 99.24%. Changing one of the browser settings reduces that accuracy by only 0.3%. Online fingerprint tests show how much information your own browser gives away.



How browser fingerprinting works



Why is it even possible to collect information about the browser? It's simple: your browser communicates with the web server every time you request a page. Normally, sites and services assign each user a unique identifier.



For example, "gh5d443ghjflr123ff556ggf".



This string of random letters and numbers helps the server recognize you and associate your browser and preferences with you. Whatever you do online is recorded under that same code.



So if you go to Twitter, which already holds some information about you, all of that data is automatically associated with the same identifier.



Of course, this code will not stay with you for the rest of your days. If you start browsing from a different device or browser, the identifier will most likely change too.
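
As a rough illustration of how a set of browser attributes could be collapsed into one identifier, here is a minimal sketch. It assumes the attributes are simply serialized and hashed with SHA-256; real fingerprinting scripts may combine data quite differently. The function name and all attribute values are hypothetical.

```python
import hashlib
import json
from typing import Mapping


def fingerprint(attributes: Mapping[str, str]) -> str:
    """Return a stable hex digest for a set of browser attributes."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(dict(attributes), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
desktop = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "timezone": "UTC+3",
    "screen": "1920x1080x24",
    "fonts": "Arial,DejaVu Sans,Times New Roman",
}

# The same machine seen through a different (made-up) browser.
other_browser = dict(desktop, user_agent="Mozilla/5.0 (X11; Linux x86_64) OtherBrowser/2.0")

if __name__ == "__main__":
    print(fingerprint(desktop))
    print(fingerprint(other_browser))  # a different identifier
```

The same attributes always yield the same digest, while changing the browser yields a different one, which matches the behavior described above.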







How do sites collect user data?



It is a two-tier process that works on both the server side and the client side.



On the server side



Site access logs



These logs record data sent by the browser, including at least:



  • The requested protocol.
  • The requested URL.
  • Your IP.
  • Referer.
  • User-agent.
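
To show where these fields sit in practice, here is a minimal sketch that pulls them out of a single access-log line. It assumes the widely used "combined" log layout; the sample line, the addresses, and the function name are invented for illustration, and a real server may log in a different format.

```python
import re
from typing import Optional

# Pattern for the common "combined" access-log layout (an assumption here).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_access_line(line: str) -> Optional[dict]:
    """Return the logged fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up log entry using documentation-only addresses.
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2020:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"https://example.com/start" "Mozilla/5.0 (X11; Linux x86_64)"'
)

if __name__ == "__main__":
    entry = parse_access_line(SAMPLE_LINE)
    for field in ("protocol", "url", "ip", "referer", "user_agent"):
        print(field, "=", entry[field])
```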


Headers



Web servers receive headers from your browser with every request. Headers matter because they let the requested site adapt to your browser.



For example, header information lets the site know whether you're using a PC or a mobile device. In the latter case, you are redirected to the version optimized for mobile devices. Unfortunately, the same data ends up in your fingerprint.
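
A minimal sketch of that decision might look like the following. It assumes the server simply looks for the "Mobi" token that many mobile browsers put in their User-Agent header; real sites often use more elaborate detection. The function names and URLs are hypothetical.

```python
from typing import Mapping


def is_mobile(headers: Mapping[str, str]) -> bool:
    """Guess whether the request comes from a mobile browser."""
    # Many mobile browsers include the token "Mobi" in User-Agent.
    return "Mobi" in headers.get("User-Agent", "")


def pick_site(
    headers: Mapping[str, str],
    desktop_url: str = "https://example.com/",
    mobile_url: str = "https://m.example.com/",
) -> str:
    """Return the URL the visitor should be sent to."""
    return mobile_url if is_mobile(headers) else desktop_url


if __name__ == "__main__":
    phone = {"User-Agent": "Mozilla/5.0 (Android 10; Mobile) ExampleBrowser/1.0"}
    laptop = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"}
    print(pick_site(phone))
    print(pick_site(laptop))
```

The same header that drives the redirect is also one more attribute available to a fingerprint.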



Cookies



This one is straightforward. Web servers routinely exchange cookies with browsers. If cookies are enabled in your settings, they are stored on your device and sent back to the server each time you return to a site you have visited before.



Cookies help you surf more comfortably, but they also reveal more information about you.
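
The exchange can be sketched with Python's standard http.cookies module. This is only an illustration: the cookie name visitor_id, its lifetime, and the helper names are assumptions, not something taken from a particular site.

```python
from http.cookies import SimpleCookie
from typing import Optional


def build_set_cookie(visitor_id: str) -> str:
    """Build the response header a server could send on the first visit."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id               # hypothetical cookie name
    cookie["visitor_id"]["max-age"] = 60 * 60 * 24 * 30  # keep for 30 days
    cookie["visitor_id"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def read_visitor_id(cookie_header: str) -> Optional[str]:
    """Extract the identifier from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel is not None else None


if __name__ == "__main__":
    print(build_set_cookie("abc123"))
    print(read_visitor_id("visitor_id=abc123; theme=dark"))
```

On every return visit the browser sends the stored value back, so the server can link the new request to the earlier ones.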



Canvas Fingerprinting



This method uses an HTML5 canvas element, which WebGL also uses to render 2D and 3D graphics in the browser.



A script typically "forces" the browser to render graphical content: an image, text, or both. The process is invisible to you, since everything happens in the background.



Once rendering is complete, the resulting graphics are turned into a hash, which becomes the unique identifier we talked about above.



This method reveals the following information about your device:



  • Graphics adapter.
  • Graphics adapter driver.
  • Processor (if no dedicated graphics chip is available).
  • Installed fonts.
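
The hashing step can be illustrated with a small sketch. The rendering itself happens in the browser and is not shown here; the two byte buffers below are stand-ins for the pixels that two machines might produce for the same drawing, and both the buffers and the function name are invented for illustration.

```python
import hashlib


def canvas_hash(pixels: bytes) -> str:
    """Hash a raw pixel buffer into a short, comparable identifier."""
    return hashlib.sha256(pixels).hexdigest()


# Stand-in RGBA buffer: 16 opaque white pixels "rendered" on machine A.
machine_a = bytes([255, 255, 255, 255] * 16)

# Machine B renders almost the same image, but one color channel of one
# pixel differs slightly (for example, due to different anti-aliasing).
machine_b_pixels = bytearray(machine_a)
machine_b_pixels[5] = 254
machine_b = bytes(machine_b_pixels)

if __name__ == "__main__":
    print(canvas_hash(machine_a))
    print(canvas_hash(machine_b))  # differs despite a one-byte change
```

Even a one-byte difference in the rendered output gives a completely different hash, which is why small variations in hardware, drivers, and fonts are enough to tell devices apart.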


Client-side logging



Here your browser gives away a lot of information through:



Adobe Flash and JavaScript



According to the AmIUnique FAQ, if you have JavaScript enabled, data about your plugins or hardware specifications is exposed to the sites you visit.



If Flash is installed and enabled, an outside observer gets even more information, including:



  • Your time zone.
  • OS version.
  • Screen resolution.
  • Complete list of fonts installed on the system.


Cookies



Cookies play a very important role in logging, so you usually have to decide whether to let the browser handle cookies or to delete them entirely.



In the first case, the web server receives a huge amount of information about your device and preferences. Even if you reject cookies, sites still receive some data about your browser.



Why is browser fingerprint technology needed?



Primarily so that users receive a version of the site optimized for their device, whether they are browsing from a tablet or a smartphone.



In addition, the technology is used for advertising. It is an ideal data-mining tool.



With the information collected by the server, suppliers of goods or services can build finely targeted, personalized advertising campaigns. The targeting is far more accurate than with IP addresses alone.



For example, advertisers can use browser fingerprints to pick out site users with a relatively low screen resolution (for example, 1300 × 768) who are looking for better monitors in the seller's online store, or users who are just browsing with no intention of buying anything.



This information can then be used to target ads for high-quality, high-resolution monitors at users with small, outdated displays.



In addition, browser fingerprinting is also used for:



  • Fraud and botnet detection. This is a genuinely useful function for banks and financial institutions: it lets them distinguish legitimate user behavior from attacker activity.
  • Identifying VPN and proxy users. The method can be used to track users who hide their IP address.






Ultimately, even if browser fingerprints are used for legitimate purposes, they still have a very negative impact on user privacy, especially for users who are trying to protect themselves with a VPN.



Plus, browser fingerprints can be a hacker's best friend. If attackers know the exact details of your device, they can use targeted exploits to compromise it. There is nothing difficult about that: any cybercriminal can set up a fake site with a fingerprinting script.



Remember that this article is only the first part; two more will follow. They cover the legality of collecting users' personal data, how that data can be used, and ways to protect yourself from overly eager "collectors".





