How Web Bundles will harm content blockers, security tools, and the open web

Google is proposing a new standard for the web: Web Bundles. This standard allows all of a website's resources to be bundled into one file, which prevents browsers from reasoning about the bundle's subresources by URL. This system threatens to transform the web from a collection of hyperlinked resources (which can be examined, selectively downloaded, or even replaced) into opaque all-or-nothing chunks of information (like PDF or SWF files). Organizations, users, researchers, and regulators who believe in an open, transparent, user-friendly web must resist this standard.



While we appreciate that Web Bundles and the related proposals aim to address specific challenges, we believe there are better ways to achieve the same goals without jeopardizing the open, transparent, user-centered nature of the web. One potential alternative is to use signed commitments for independently downloadable subresources. Describing the alternatives would require a separate article, and some have already been shared with the authors of the specification.



The web is a unique open system thanks to URL links



The web is valuable because it is user-centered, user-controlled, and user-editable. Even users with little experience can figure out which resources a page includes and decide which of them their browser should load. Users who are not experts can still benefit from this by installing extensions or tools that protect their privacy.



The web's focus on users differs from how most applications and information distribution systems work. Most applications are compiled collections of code and resources that are difficult or even impossible to separate and evaluate individually. This important distinction explains why there are so many privacy protection tools for the web and so few for "binary" applications.



Fundamentally, what makes the web different from other systems and applications, more open and more user-centered, is the URL. Since a URL usually points to a single resource, researchers and activists can study these URLs and draw conclusions about them in advance. Other users can use this information to decide whether they want to download what a URL points to, and how. More importantly, experts can download tracker.com/code.js, decide that it violates privacy, and share that information with others so they know not to download this code in the future.



There are exceptions to this rule, and the web platform does not strictly require it; however, a URL is generally expected to keep pointing to the same thing. These expectations are embedded throughout the web platform: in caching policies, in library instructions for deploying code, and so on.



Web Bundles make URLs meaningless



Google recently proposed three interconnected standards: Web Bundles, Signed HTTP Exchanges (sometimes abbreviated to SXG), and Loading. In this article, "Web Bundles" refers to all three. So far, Web Bundles have been touted as standards for use in ad systems (TURTLEDOVE, SPARROW) and as part of Google's future AMP system, although this seems to me to be just the tip of the iceberg.



At a high level, Web Bundles are a way to bundle assets together. Instead of downloading the pages, images, and JavaScript files separately, your browser downloads the entire "bundle," a file that includes all the information needed to load the page. URLs cease to be common global links to network resources, and become arbitrary indexes within packages.



In other words, Web Bundles turn sites into PDFs (or Flash SWF files). PDF includes all the images, videos, and scripts needed to render the PDF - you don't download them individually. This has its convenience advantages, but it makes it impossible to examine individual parts of the PDF independently of the entire file. Therefore, there are no content blocking tools for PDF. PDF is an all-or-nothing proposition, and Web Bundles will turn websites into similar propositions.



By changing URLs from meaningful global identifiers to arbitrary package-specific indexes, Web Bundles give advertisers and trackers powerful new ways to bypass privacy and security tools. The following section provides selected examples to illustrate this point.



Web Bundles will allow sites to bypass privacy and security tools



URLs in Web Bundles are arbitrary links to resources within a bundle, not links to globally accessible resources. This gives sites several ways to bypass privacy and security tools.



They can, of course, refer to resources outside of the package, but this behavior would make the package system meaningless, so this issue is not discussed in this article.



The main workaround stems from the fact that Web Bundles create a local namespace for resources, independent of what the rest of the world sees. This can lead to all kinds of name confusion, negating the years of work that privacy activists and researchers have put into improving privacy and security. Here are just three ways that Web Bundles-enabled websites can take advantage of this confusion.



Bypassing security tools through URL randomization



Previously, if a website wanted to use a script to track user actions, it would include a <script> tag in the HTML page that points to that script at a stable, unchanging URL. Researchers or user groups could then add this URL to a list such as EasyPrivacy so that privacy-conscious users can visit the site without downloading the tracking script. This is how most blocking tools work today.
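As a rough illustration of the URL-based blocking described above, here is a minimal Python sketch. The list entry, the normalize helper, and the exact-match rule are simplified assumptions made for this example; real filter lists such as EasyPrivacy use a much richer rule syntax.

```python
from urllib.parse import urlsplit

# Hypothetical list entries (host + path), standing in for a real filter list.
BLOCKED = {"example.org/tracker.js"}


def normalize(url: str) -> str:
    """Reduce an absolute URL to host + path, ignoring scheme and query string."""
    parts = urlsplit(url)
    return f"{parts.hostname}{parts.path}"


def should_block(url: str) -> bool:
    """Return True when the requested URL matches an entry on the list."""
    return normalize(url) in BLOCKED


if __name__ == "__main__":
    for url in ("https://example.org/tracker.js?v=2", "https://example.org/app.js"):
        print(url, "blocked" if should_block(url) else "allowed")
```

The decision depends only on the URL, which is why it matters that a URL keeps naming the same resource for every visitor.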



Web Bundles make it easier for sites to bypass such tools by randomizing the URLs of unwanted resources. What has a global name on today's web, say example.org/tracker.js, can be named 1.js in one bundle, 2.js in another, 3.js in a third, and so on. Web Bundles encourage this practice by making it free for sites. Caching is no longer a concern (since you are distributing all resources to each user and caching the entire package), and there is no need to maintain a URL mapping (since the package sent to the user already contains the randomized URLs).
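The effect on a list-based blocker can be sketched as follows. This is only a toy model: a bundle is represented as a plain Python dictionary from name to bytes, which is an assumption for illustration and not the actual Web Bundles format, and build_bundle and blocked_names are hypothetical helpers.

```python
import secrets
from typing import Dict, List, Set

TRACKER_BODY = b"/* tracking script */"


def build_bundle() -> Dict[str, bytes]:
    """Build a toy bundle in which the tracker gets a fresh random name."""
    name = f"{secrets.token_hex(8)}.js"
    page = f'<script src="{name}"></script>'.encode()
    return {"index.html": page, name: TRACKER_BODY}


def blocked_names(bundle: Dict[str, bytes], blocklist: Set[str]) -> List[str]:
    """Return the names in the bundle that a name-based blocklist would catch."""
    return [name for name in bundle if name in blocklist]


if __name__ == "__main__":
    first, second = build_bundle(), build_bundle()
    # A researcher studies the first bundle and lists its script name.
    blocklist = {name for name in first if name.endswith(".js")}
    print("caught in first bundle:", blocked_names(first, blocklist))
    print("caught in second bundle:", blocked_names(second, blocklist))
```

The rule learned from one bundle says nothing about the next one, even though both carry the same script bytes.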



Bypassing privacy tools through URL reuse



To make matters worse, Web Bundles will allow sites to bypass blocking tools by making the same URLs point to different resources in each bundle. On the current web, let's say example.org/ad.jpg points to the same thing for any user. It is difficult for a website to have the same URL return two different images. As a result, blocking tools can block ad.jpg knowing that they are blocking ads for everyone. There is practically no risk that for some it will be an advertisement, and for others it will be a company logo (this is not impossible, but difficult - the point is that Web Bundles turn the methods of circumvention, which are complex and fragile today, into simple and free).



Web Bundles change this system in a dangerous way. Example.org can build its packages so that in one of them example.org/ad.jpg points to an advertisement, in another to the site logo, and in a third to something else. This will not only make it much more difficult, or even impossible, for researchers to compile lists; it will also give sites new ways to poison block lists.
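A small Python sketch of this problem, again modeling bundles as plain dictionaries (an assumption for illustration, not the real format). The user names and the apply_rule helper are hypothetical.

```python
from typing import Dict, Set

AD = b"<advertisement image bytes>"
LOGO = b"<site logo image bytes>"
URL = "example.org/ad.jpg"

# Two bundles from the same site reuse one URL for different content.
bundle_for_alice = {URL: AD}
bundle_for_bob = {URL: LOGO}


def apply_rule(bundle: Dict[str, bytes], blocked_urls: Set[str]) -> Dict[str, bytes]:
    """Drop every resource whose URL is on the block list."""
    return {url: body for url, body in bundle.items() if url not in blocked_urls}


if __name__ == "__main__":
    rule = {URL}
    print("Alice keeps:", apply_rule(bundle_for_alice, rule))  # the ad is removed
    print("Bob keeps:", apply_rule(bundle_for_bob, rule))      # the logo is removed too
```

A list maintainer is left choosing between breaking the site for some users and letting the ad through for others, which is how a block list gets poisoned.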



Bypassing privacy tools by hiding dangerous URLs



Finally, Web Bundles open up even more dangerous workarounds. Today, projects such as uBlock Origin and Google's Safe Browsing maintain lists of URLs that point to malicious and dangerous web resources. Such projects treat the URL as the only, or at least the most important, identifier of a dangerous resource. The universal and global nature of the URL is what makes these lists useful.



Web Bundles once again allow sites to bypass this protection, this time by letting them refer to known harmful resources through known-good URLs. On the web today, it is very difficult to get sites to treat cdn.example.org/cryptominer.js as if it were cdn.example.org/jquery.js (and vice versa). In Web Bundles, this will be a trivial task.
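The mismatch can be sketched in Python as below. The deny list, the dictionary model of a bundle, and the placeholder file contents are all assumptions for illustration. The digest comparison is included only to show that the bytes differ while the name stays the same; it is not something the proposal or existing tools are claimed to do.

```python
import hashlib
from typing import Dict

JQUERY = b"/* the real jquery library */"
MINER = b"/* cryptominer code */"

# Hypothetical URL-keyed deny list, standing in for lists like Safe Browsing.
DENY_LIST = {"cdn.example.org/cryptominer.js"}

# On the open web, each URL names one globally visible resource.
open_web: Dict[str, bytes] = {
    "cdn.example.org/jquery.js": JQUERY,
    "cdn.example.org/cryptominer.js": MINER,
}

# Inside a bundle, the site chooses which bytes sit behind which name.
bundle: Dict[str, bytes] = {"cdn.example.org/jquery.js": MINER}


def allowed_by_url(url: str) -> bool:
    """A URL-only check: allow anything not named on the deny list."""
    return url not in DENY_LIST


def digest(body: bytes) -> str:
    """SHA-256 of the resource body, used here only to compare contents."""
    return hashlib.sha256(body).hexdigest()


if __name__ == "__main__":
    url = "cdn.example.org/jquery.js"
    print("allowed by URL check:", allowed_by_url(url))
    print("same bytes as the open-web copy:", digest(bundle[url]) == digest(open_web[url]))
```

The URL check passes because the name is trusted, while the bytes delivered under that name are the ones the list was meant to stop.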



Responding to Web Bundles supporters



Developers and supporters of the Web Bundles specifications argue that there is nothing new here, and that all of the above methods of bypassing protection already exist. Technically, this is true, but the claim ignores the economics involved and therefore does not describe the situation as a whole. Web Bundles make these expensive, unreliable, and complex techniques cheap or even free.



For example, websites can already make multiple URLs point to the same file to make life harder for ad blockers, but in practice this is difficult to do. URL randomization hurts caching, and it requires storing a mapping from random URLs to the intended resources and passing that information to the CDN, and so on. Sites can do this, but it is expensive and difficult, so it is rarely done.
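To make the cost concrete, here is a hedged Python sketch of the bookkeeping a site would need on today's web. RandomizingSite is a hypothetical class invented for this example; it only models the mapping table, not a real server or CDN.

```python
import secrets
from typing import Dict


class RandomizingSite:
    """Toy model of a site that gives one script a fresh URL on every page view."""

    def __init__(self) -> None:
        # State the site must store and share with its CDN.
        self.url_to_resource: Dict[str, str] = {}

    def render_page(self) -> str:
        """Return the randomized script path embedded in a newly rendered page."""
        path = f"/{secrets.token_hex(8)}.js"
        self.url_to_resource[path] = "tracker.js"
        return path

    def serve(self, path: str) -> str:
        """Resolve a randomized path back to the real resource."""
        return self.url_to_resource[path]


if __name__ == "__main__":
    site = RandomizingSite()
    paths = [site.render_page() for _ in range(1000)]
    print("mapping entries to store:", len(site.url_to_resource))
    print("first path resolves to:", site.serve(paths[0]))
```

Every page view adds state that has to be kept and distributed, and no two visitors share a cacheable URL. With a bundle, the randomized name travels inside the package, so this table is not needed at all.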



In a similar vein, sites today can use cookies or other user tracking mechanisms to make the same URL behave differently for different users, carrying out the URL obfuscation attacks described above. However, this method is unreliable (what to do with new visitors?), complex (you need to maintain and distribute the mapping between cookie values and resources), and expensive (most web servers and hosting services rely on caching, so such techniques are practically unavailable to sites that do not belong to large corporations).



In general, Web Bundles make unwanted behavior much easier by making it cheaper.



Other problems



This article describes the harm that Web Bundles can do to privacy and security tools. But there are other problems with this and related standards. In particular, these are:



  • SXG has no revocation mechanism. If a malicious resource accidentally appears on a site today, the site can simply be updated. If a site signs Web Bundles using SXG, there is no clear way for the signer to tell everyone to "no longer trust this particular package."
  • Interaction with Manifest v3: Manifest v3 restricts extensions to using URL patterns for blocking. Web Bundles make these URLs meaningless. Combining the two will allow sites to completely bypass blocking.
  • Origin confusion. Loading + SXG allows content to be downloaded from one server and then executed with the privacy and security settings of another server. The potential for user confusion is enormous, and while we are confident that Google employees are actively working to address this issue, the risk for users remains very high.


Conclusion



Brave is working to improve web privacy, both in our browser and in the tools we make and distribute, and in the advocacy we make to standards organizations. This article is just one example of our work to ensure that web standards continue to focus on privacy, transparency, and accountability.



We have tried for a long time, without success, to draw the attention of the authors of the Web Bundles standard to the problems listed here. We urge Google and the Web Bundles authors to put this proposal on hold until the privacy and security issues described in this article are resolved. We also encourage members of the web privacy and security communities to join this discussion and not implement this standard until the issues described are resolved.



One way to do this is to describe these issues in the comments on the post that the author of this article made on the Web Bundles project. Other options are to file new issues against the specification, and to tell your browser's developers how important privacy protection tools are to you personally and what risk these new standards pose to those tools.


