No Cookies, No Problem - Using ETag to Track Users

As a senior digital analytics consultant at a leading global analytics agency, I watch with great interest the current crusade of modern web browsers against cookies.



It turns out that there is a way to track individual users who are not logged in, without using cookies. I have implemented it myself, and now I'll show you how.





For clarity, I created a demo site. Here it is.



Click each of the three Page buttons → All three pages show the same identifier.

Close the browser window and re-open the site → The identifier has not changed.

Turn off your computer and visit the site again tomorrow → The ID is still the same.

Check your cookies → The demo site does not write or read cookies.

Check the URL → There are no questionable query strings.



So, how exactly can I store the identifier and find out that you return to the site from a certain device, without logging in and without using cookies?

Cookies



If you are a fairly active Internet user, you have probably come across the endless discussions about cookies and how they are used. Browsers are increasingly rejecting cookies, especially now that everything is strictly regulated by privacy rules such as the GDPR or CCPA. While this is certainly progress and an important step towards a more privacy-oriented internet, it also takes a huge toll on the core functionality of most websites, their UX, the economic structure of the internet, and the digital analytics industry. And while a cookie is technically a very safe way for the browser to identify a returning user, there are other web technologies available that are also based on storing information on the local computer.



The role of the cache



This is where the cache comes in. Basically, web caching means storing data from the internet on your device, so the browser can reuse that data later when the same resource is requested again. For example, when a user first loads a web page, the server sends the entire page to the browser. If the page is cached and the user requests it again the next day, the browser still has it, the server does not need to send it again, and the page can be displayed immediately from the cache. This is much faster and saves bandwidth. In general, caching dramatically increases the speed of web content delivery and significantly reduces the amount of work done on the server side.



Caching can be supported by ETags. An ETag is an identifier that the server attaches to a resource it serves (such as a web page or image). The browser stores the ETag together with the cached resource and sends it back the next time it requests that resource. In this way, the server can determine whether the user has cached the most recent version of the resource. When a resource on the server changes, a new ETag is generated for it.



  • Monday

    The user visits the website for the first time → There is no ETag in the request → The page is sent to the browser with ETag 123 → The page is saved (cached) on the local device.
  • Tuesday

    The user visits the same site again → ETag 123 is included in the outgoing request → The server checks whether the resource has changed ("Is the ETag still the same?") → If the ETag has not changed, the server instructs the browser to use the copy that was delivered and cached on Monday → There is no need to re-send the resource; time and traffic are saved. Profit.
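
The Monday/Tuesday exchange can be sketched in a few lines of Python. This is only an illustration, not the code behind any real server: the names make_etag and respond are hypothetical, the ETag is assumed to be derived from a hash of the content, and the comparison is simplified to an exact match on the If-None-Match request header.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content, so it changes when the content does."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET request, honoring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The browser's cached copy is still current: answer 304 with no body.
        return 304, {"ETag": etag}, b""
    # First visit, or the resource has changed: send the full resource.
    return 200, {"ETag": etag}, body


page = b"<h1>Hello</h1>"

# Monday: the first request carries no ETag, so the full page is sent.
monday_status, monday_headers, monday_body = respond(page, {})

# Tuesday: the browser sends the stored ETag back in If-None-Match.
tuesday_status, _, tuesday_body = respond(
    page, {"If-None-Match": monday_headers["ETag"]}
)

# If the page changes on the server, the old ETag no longer matches.
changed_status, _, _ = respond(
    b"<h1>Hello again</h1>", {"If-None-Match": monday_headers["ETag"]}
)
```

On Tuesday the server answers with status 304 and an empty body, which is the "use what you already have" instruction described above.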


Using caching technology to track and identify users



Although ETag was designed specifically for caching, it can also be repurposed and deliberately used to track users.



Here's how I did it in my example:



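  • A simple website with three pages is created.
  • Each page embeds an iFrame. The iFrame is only 1x1 pixel in size, so it is invisible.
  • On the server side, the iFrame is a PHP script. It checks whether the request for the iFrame contains an ETag.
  • Whenever a cached resource (in this case, the iFrame) is requested again, the browser sends the stored ETag along with the request. The script reads that ETag.
  • No ETag in the request → The visitor is new: a new ID is generated and sent back as the ETag.
  • ETag in the request → The visitor is returning: the ETag is the ID, and the same value is sent back.
  • Finally, the ETag ID is displayed on the page:

    The ID is delivered as the content of the iFrame, so it is available without JavaScript or cookies.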
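
The server-side logic of these steps can be sketched as follows. The demo itself is described as a PHP script; the Python below is a stand-in written for illustration, not the author's code. The function name track, the use of a random UUID as the ID, and the Cache-Control value are all assumptions, and the header parsing is simplified to a single strong ETag.

```python
import uuid


def track(request_headers: dict) -> tuple:
    """Return (status, response_headers, body, visitor_id) for the iFrame request."""
    incoming = request_headers.get("If-None-Match")
    if incoming is None:
        # No ETag in the request: treat the visitor as new and mint an ID.
        visitor_id = uuid.uuid4().hex
        status, body = 200, visitor_id.encode()
    else:
        # ETag in the request: the ETag value is the visitor's ID.
        visitor_id = incoming.strip('"')
        # 304 tells the browser to keep using the cached iFrame content.
        status, body = 304, b""
    response_headers = {
        # The same ID goes back as the ETag, so the browser keeps storing it.
        "ETag": f'"{visitor_id}"',
        # Assumed setting: allow caching, but revalidate on every request,
        # so the ETag is sent back to the server each time.
        "Cache-Control": "private, no-cache",
    }
    return status, response_headers, body, visitor_id


# First visit: no If-None-Match header, so a new ID is issued.
first = track({})

# Later visit: the browser returns the stored ETag and is recognized.
second = track({"If-None-Match": first[1]["ETag"]})

# A different browser with an empty cache gets a different ID.
other = track({})
```

Note that the server never stores anything about the visitor here: the identifier lives entirely in the browser cache and is handed back by the browser itself.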






The ETag ID of the iFrame can be inspected in Chrome DevTools.



Protecting against ETag tracking



This technique is hard to spot. It does not use cookies or local browser storage, it works without JavaScript, and it does not rely on the User-Agent.



However, users have several options for protecting against ETag tracking:



  • Disable caching in the browser settings.

    Be careful here - as explained above, caching can be very useful and has many benefits.
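  • Modify the request headers.

    Request headers can be modified with a browser extension, for example ModHeader. What should be changed? The browser sends the stored ETag back to the server in the If-None-Match header, so if that header is removed or overwritten, the server no longer receives the ETag.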
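
What such a header modification amounts to can be illustrated with a small Python sketch. It does not use ModHeader or any browser API; the function name strip_conditional_headers and the sample header values are hypothetical, and the point is only that a request without If-None-Match gives the server no stored ETag to read.

```python
def strip_conditional_headers(headers: dict) -> dict:
    """Return a copy of the request headers without If-None-Match."""
    # Header names are case-insensitive, so compare in lower case.
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "if-none-match"
    }


# A request as the browser would send it on a repeat visit (sample values).
outgoing = {
    "Accept": "text/html",
    "If-None-Match": '"3f2a9c0d1b7e4a56"',
}

# The same request after the header has been removed.
cleaned = strip_conditional_headers(outgoing)
```

The trade-off is the one mentioned for disabling the cache: without If-None-Match the server cannot answer with 304, so the resource is downloaded in full every time.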










Why am I looking into these things? Why did I write this article? I, of course, do not intend to use this on a large scale. But while ETag can be used by bad actors, this example demonstrates an important point: like most other technologies, ETag is not inherently harmful. It all depends on the purpose for which it is used.



I believe it is important for everyone to know that such methods exist and that they can be used. There have been quite a few cases where sites have used ETag illegally, and some of these incidents were even settled in court. It is likely that such methods will increasingly be used by the terrified ad industry, which is watching one of its pillars collapse: the cookie.



One of the many (harmless) mentions of ETags on the internet can be found in Wendy's privacy policy on cookies and tracking technologies:





ETag can generate unique tracking values even if the user blocks HTTP, Flash, and/or HTML5 cookies.



A statement like this seems to be typical of how many sites mention ETag in their privacy policies. To be clear: this in itself is neither bad nor illegal. ETag values must of course be unique; that is the whole point of how they work for caching purposes. However, this section is very vague and ambiguous, especially when it comes to whether these ETag values are used for tracking or not. And I personally think this is a problem. When I asked Wendy's privacy department, they responded with a standard copy-paste email confirming that ETags are not being used for tracking. The privacy policy, however, leaves this door wide open. And that's what worries me.



I believe in the open and transparent transfer of knowledge across the industry: among analytics providers, publishers, advertisers and Internet users. IMHO, the lack of openness is one of the main reasons why we all found ourselves embroiled in this dirty war over cookies. The Internet ecosystem has always suffered from a lack of transparency, technology is developing too fast for legislation to keep up with it, and people do not understand the many subtleties of web technologies like cookies. And when technology is used improperly, the user understandably feels vulnerable. But banning a technology turns out to be a classic case of treating the symptoms, not the cause. The fact that many technology companies abuse technologies such as cookies creates an unfair attitude towards the technology itself on the part of the public. This, in turn, leads to disproportionate action by browser developers and legislators. While these measures are aimed at ensuring privacy, they also harm good and meaningful innovation.



There are always nuances. I strongly believe in the legitimacy and importance of serious digital analytics - as long as it is performed with the proper level of confidentiality. What happens after the store has legitimately identified the visitor? ETag can certainly be used for many different purposes. But one thing is for sure: this topic will never get boring.


