Western Digital has developed a new file system for Linux systems

New things rarely happen in the field of file systems. We have FAT16/32, NTFS, ext4, Btrfs and other, more exotic ways to manage disk space. The file system as a whole is a static phenomenon: developers and engineers once figured out how to structure data on a disk, and since then we have all used it without thinking about what happens "under the hood" at the hardware level.



Now the drive manufacturer Western Digital has announced that it is actively developing a new file system, DZS (Digital Zoned Storage). The main goal of the new system is to use HDDs and solid-state drives in industrial equipment while reducing the load on the SSD controller.







The strength of the DZS file system on HDDs is that it simplifies the traditional file access scheme and gives the user a convenient API for data management, coupled with shingled magnetic recording (SMR) technology.







In practice, the development will primarily interest DBMS administrators and other users who work with large arrays of static data.



The main difference between DZS and other file systems is that it is less flexible: in Western Digital's design, a file can be written only within one zone, and only sequentially. Other modern file systems support random writes within a file, which is certainly an advantage in some cases.
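
The constraint described above can be illustrated with a toy model. This is only a sketch in Python: the `Zone` class, its method names and the byte-array storage are hypothetical and are not Western Digital's API or any real driver interface. It assumes each zone keeps a single write pointer and rejects any write that does not start exactly there.

```python
class Zone:
    """Toy model of one sequential-write zone (hypothetical, not a real API)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity          # zone size in bytes
        self.write_pointer = 0            # offset where the next write must start
        self.data = bytearray(capacity)   # backing storage for the sketch

    def write(self, offset: int, payload: bytes) -> int:
        """Append payload at the write pointer and return the new pointer."""
        if offset != self.write_pointer:
            raise ValueError("non-sequential write rejected")
        if offset + len(payload) > self.capacity:
            raise ValueError("write would cross the zone boundary")
        self.data[offset:offset + len(payload)] = payload
        self.write_pointer += len(payload)
        return self.write_pointer

    def read(self, offset: int, length: int) -> bytes:
        """Read any range that has already been written."""
        if offset < 0 or length < 0 or offset + length > self.write_pointer:
            raise ValueError("read outside the written range")
        return bytes(self.data[offset:offset + length])

    def reset(self) -> None:
        """Discard the whole zone at once so it can be rewritten from the start."""
        self.write_pointer = 0
```

In this model a write to offset 0 of a partly filled zone fails, while reads of already written data remain random-access. Space is reclaimed only by resetting the whole zone.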



The statement above about strictly sequential writing of a file within its allocated area of disk space holds both for HDDs (including shingled recording, which only enhances the advantages of this approach) and for SSDs.



The greatest gains in endurance and performance will, of course, come on SSDs, which, unlike HDDs, have no physical penalty for reading different parts of the disk. We will discuss them below. This summer, the DZS system officially became part of the NVMe standard. In other words, this is not some crazy file system concept from Western Digital but a very real development that may become part of the IT market in the near future.



Random writing to an SSD is not entirely random. When a file whose parts were scattered across the drive is deleted, the freed space can be reused only after the blocks that held it have been completely erased and prepared for reuse. In other words, a working SSD is constantly subjected to the defragmentation so disliked by many: it keeps reorganizing its own internal space. In technical terms, this is called "garbage collection".







Garbage collection on an SSD puts a regular load on the drive controller: in addition to the operations the user actually requested, the controller also does "shadow" work to free up blocks. These operations significantly shorten the life of the SSD controller and lead to premature drive failure. On top of that, the constant "extra" read-write operations wear out the memory cells and degrade them. And it is precisely the need to reserve a memory buffer for garbage collection that leads to the annoying reduction in the SSD space available to the user.
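
The cost of this "shadow" work can be put into numbers with a rough sketch. The Python functions below, the 64-page block size and the page counts are illustrative assumptions, not measurements or figures from Western Digital. The model assumes that before a block is erased, every page in it that is still valid has to be copied elsewhere, and it expresses the overhead as write amplification: total flash writes divided by the writes the host asked for.

```python
from typing import List, Tuple


def reclaim(valid_pages: List[int], pages_per_block: int) -> Tuple[int, int]:
    """Return (copied_pages, freed_pages) for erasing the given blocks.

    valid_pages[i] is the number of still-valid pages in block i; those
    pages must be copied out before the block can be erased.
    """
    for count in valid_pages:
        if not 0 <= count <= pages_per_block:
            raise ValueError("valid page count out of range")
    copied = sum(valid_pages)
    freed = len(valid_pages) * pages_per_block - copied
    return copied, freed


def write_amplification(host_pages: int, copied_pages: int) -> float:
    """Total flash page writes divided by the pages the host wrote."""
    if host_pages <= 0:
        raise ValueError("host_pages must be positive")
    return (host_pages + copied_pages) / host_pages


# Fragmented case: three 64-page blocks still hold 48, 32 and 16 valid pages.
copied, freed = reclaim([48, 32, 16], pages_per_block=64)
fragmented = write_amplification(host_pages=freed, copied_pages=copied)

# Sequential case: whole blocks are invalidated together, nothing to copy.
copied_seq, freed_seq = reclaim([0, 0, 0], pages_per_block=64)
sequential = write_amplification(host_pages=freed_seq, copied_pages=copied_seq)
```

Under these assumed numbers the fragmented case writes two flash pages for every page the host stores, while the fully sequential case writes exactly one.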







In its new system, Western Digital proposes to return to sequential writes and, accordingly, to sequential access to files belonging to the same group or application, so that the SSD does not have to collect garbage constantly during delete-write operations.







Besides reducing the load on the controller and extending its life, the system proposed by the manufacturer also delivers a tangible increase in performance. DZS can sustain a consistently maximal write speed to an SSD, unlike other file systems, which, because of random access and the need to collect garbage during operation, often top out at around 200-230 MB/s.







Since Western Digital is an active member of the Linux community (which is to be expected, since the company's main customers are data centers and Unix system administrators), support for the new file system was introduced in Linux first. ...



Digital Zoned Storage is already available on the long-term stable (LTS) kernel versions 4.14, 4.19 and 5.4. However, if you want to take full advantage of the file system, you should use a 5.x kernel.
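
Before relying on any of this, it helps to check whether the kernel sees a given drive as zoned at all. The sketch below is in Python and assumes the kernel exposes the standard block-layer sysfs attribute `queue/zoned`, which reports `none`, `host-aware` or `host-managed`. The function name `zoned_model`, the `"unknown"` fallback and the device name `sdx` are hypothetical and used only for illustration.

```python
from pathlib import Path


def zoned_model(device: str, sysfs_root: str = "/sys/block") -> str:
    """Return the zoned model the kernel reports for a block device.

    Reads <sysfs_root>/<device>/queue/zoned. Returns "unknown" when the
    attribute is missing, for example on a kernel without zoned block
    device support or when the device name does not exist.
    """
    attr = Path(sysfs_root) / device / "queue" / "zoned"
    try:
        return attr.read_text().strip()
    except FileNotFoundError:
        return "unknown"


if __name__ == "__main__":
    # "sdx" is a placeholder; substitute a real device name such as "sda".
    print(zoned_model("sdx"))
```

The `sysfs_root` parameter exists only so the function can be pointed at a test directory instead of the live `/sys/block` tree.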







DZS is likely to be able to compete with the existing file systems that are actively used to store large arrays of relatively static data. Several factors support this:



  • applicability of the file system to both HDDs and SSDs;
  • support for shingled recording on ultra-dense HDDs of 20+ TB;
  • reduced load on the SSD controller;
  • higher write and read speeds, which is critical for databases and arrays;
  • as a result, lower warranty service costs for consumers and companies.




The last point is extremely important, since we have lived with a shortage of flash and RAM for many years. Large enterprise companies remain the main consumers of storage arrays, and databases or other static arrays are still the most popular use case for data warehouses. Increasing file access speed without increasing the cost of manufacturing equipment is therefore a serious boost for the entire sector.


