Even though more and more tracks are packed onto every square millimeter of platter surface, and the read/write heads grow more complex every few years, the reliability of hard drives keeps improving. Backblaze, a data-storage company, has published its report for the second quarter of 2020 on the hard drives it uses. As it turns out, the annual drive failure rate dropped significantly compared to the previous year.
Does this mean that HDDs become more reliable over time? And how might promising magnetic recording technologies like MAMR and HAMR affect hard drive reliability in the coming decades?
The path from "mega" to "tera"
The first HDD went on sale in the 1950s. It was the IBM 350, with a capacity of 3.75 MB. The device was housed in a 152 × 172 × 74 cm cabinet and contained 50 platters 24 inches (610 mm) in diameter. Fast-forward to our time: the best modern 3.5-inch hard drive (approximately 14.7 × 10.2 × 2.6 cm) can store up to 18 TB of data using conventional (non-shingled) recording technology.
Internal Mechanism of the IBM 350 Hard Drive
In the IBM 350, the platters rotated at 1200 rpm. In recent decades, hard disks have evolved towards smaller platter diameters and higher rotation speeds (typical values are 5400–15,000 rpm). Other improvements include placing the read/write heads ever closer to the platter surface.
The IBM 1301 DSU (Disk Storage Unit) came out in 1961. It was an innovative design in which each platter had its own read/write head. Another innovation of this model was that the heads floated above the surface of the rotating platter on a cushion of air created by aerodynamic forces. This made it possible to reduce the gap between the heads and the platter surface.
After 46 years of development, IBM sold its HDD business to Hitachi in 2003. By that time, the capacity of hard drives had increased 48,000-fold, and their size had decreased 29,161-fold. Power consumption dropped from over 2.3 kW to about 10 watts (for desktop models), and the price per megabyte fell from $68,000 to $0.002. At the same time, the number of platters decreased from tens to, at most, two.
Increasing storage density
Mechanical and electronic devices, computers included, have always evolved towards miniaturization. The huge tube and relay computers of the 1940s and 1950s evolved into less bulky transistor systems, and then into today's miniature marvels built on specialized integrated circuits. Hard drives have followed a similar path.
Inside the 1-inch Seagate MicroDrive HDD
The control electronics of hard drives benefited from every advance in VLSI development and came to use increasingly precise and power-efficient servo actuators. Advances in materials science produced lighter and smoother platters (glass or aluminum) with improved magnetic coatings. Recording density grew. The makers of hard drives steadily deepened their understanding of the drives' individual elements (chips, solder joints, actuators, read/write heads), and improvements in their characteristics came not as sudden revolutions but gradually, through many small refinements.
Six opened hard drives, from 8" to 1" (source)
Although there have been at least two attempts at serious miniaturization of hard drives, in the form of the 1.3" HP Kittyhawk in 1992 and the 1" Microdrive in 1999, the market ultimately settled on the 3.5-inch and 2.5-inch form factors. Drives in the Microdrive form factor were touted as an alternative to NAND-based CompactFlash cards, citing their higher storage capacities and virtually unlimited write cycles, which made them suitable for use in embedded systems.
As in other similar cases, physical limits on write speed and random-access latency ultimately confined HDDs to the applications where they are most welcome: wherever the top priority is storing large amounts of data cheaply and reliably. This allowed the HDD market to adapt to desktop and server systems, as well as to the needs of video surveillance and data backup (where they compete with tape drives).
Reasons for hard drive failures
Although the mechanical parts of hard drives are often viewed as their weakest point, hard drive failures can be caused by more than just mechanics. The main causes include:
- Human factor.
- Hardware failures (mechanical, electronic).
- Damage to the firmware.
- Environmental factors (temperature, humidity).
- Power supply problems.
Hard drives are tested for shock resistance both powered off and during operation (platters spinning, read/write heads unparked). If a drive is subjected to stronger shocks than it was designed for, the actuator that moves the heads can be damaged, and the heads can crash into the platter surface. If the disk is spared such abuse, the main cause of its failure will most likely be natural wear. Hard drive manufacturers quote a mean time between failures (MTBF), which gives an idea of how long a hard drive can work under normal conditions.
MTBF is obtained by extrapolating device-wear data collected over a limited period of time, and there are standards governing how this figure is calculated. The MTBF for hard drives typically lies somewhere between 100,000 and 1 million hours, so truly testing a disk for its full rated life would take 10 to 100 years of observation. Moreover, when specifying MTBF, manufacturers assume the disk will operate under the recommended conditions. That is exactly how drives are run at storage companies like Backblaze.
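The connection between an MTBF figure and an annual failure rate can be made concrete. Under the usual assumption of a constant (exponential) failure rate, an MTBF of M hours implies an AFR of 1 − e^(−8766/M). Here is a minimal sketch using the MTBF range quoted above; the function name is my own:

```python
import math

HOURS_PER_YEAR = 24 * 365.25  # 8766 hours

def afr_from_mtbf(mtbf_hours: float) -> float:
    """Annualized failure rate implied by an MTBF figure,
    assuming a constant (exponential) failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

# The typical range quoted for hard drives:
for mtbf in (100_000, 1_000_000):
    print(f"MTBF {mtbf:>9,} h  ->  AFR {afr_from_mtbf(mtbf):.2%}")
```

An MTBF of 1 million hours thus corresponds to an AFR of under 1%, which is roughly the ballpark Backblaze actually observes.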
Obviously, if you subject a hard drive to a severe impact (drop it on a stone floor, say), or if something goes seriously wrong with its power supply (a voltage surge, for example), the life of the HDD will be shortened. Less obvious is that hard drive reliability can also be affected by manufacturing defects, a problem by no means unique to hard drives. Such defects are the reason a metric like "acceptable failure rate" applies to most products.
It's not about the user. It's about the production line
Hard drives demonstrate high MTBF values. Backblaze, understandably, would like almost 130,000 of its HDDs to "live" happily to a ripe old age and quietly retire to a better world (usually a scrap-metal shredder). But even a company like Backblaze reports, as of Q1 2020, an annualized failure rate (AFR) of 1.07%. Fortunately for them, this is the lowest rate since they began publishing such reports in 2013; in the first quarter of 2019, for example, their AFR was 1.56%.
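Backblaze computes AFR from "drive-days": the number of failures divided by the total observed service time expressed in drive-years. A minimal sketch with purely illustrative inputs (the failure count below is invented to land near the quoted 1.07%; it is not taken from any actual report):

```python
def annualized_failure_rate(failures: int, drive_days: float) -> float:
    """Backblaze-style AFR: failures per drive-year of observed service."""
    drive_years = drive_days / 365.0
    return failures / drive_years

# Hypothetical quarter: 130,000 drives observed for 91 days, 347 failures.
drive_days = 130_000 * 91
print(f"AFR = {annualized_failure_rate(347, drive_days):.2%}")
```

Note that the fleet size only enters through the accumulated drive-days, so drives added or retired mid-quarter are accounted for naturally.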
In one of my previous articles, I noted that devices containing integrated circuits can acquire manufacturing defects that do not show up immediately, but only after some time in operation. Over time, factors such as electromigration, thermal stress, and mechanical stress can cause chips to fail. Wire bonds inside chip packages can break, and electromigration can damage solder joints and the chips themselves (especially after exposure to electrostatic discharge).
The mechanical parts of hard drives depend on tight manufacturing tolerances and on the quality of lubrication of moving parts. In the past, the head assembly could stick to the platter surface (a phenomenon known as stiction), sometimes preventing the drive from spinning up at all. Over time, lubricants improved and head parking became more reliable, so today this problem has been more or less solved.
Nevertheless, at every step of the production process there is a chance to spoil something, and this ultimately shows up as degradation of those pretty MTBF numbers. A drive that fails lands somewhere on the failure-rate curve, the classic "bathtub": a high peak at the very beginning reflecting failures caused by serious manufacturing defects, then a long, calm stretch where failures are rare, and finally a rise again as devices reach the end of their service life.
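The bathtub shape described above can be sketched as a toy hazard-rate function; every constant here is purely illustrative and not fitted to any real drive data:

```python
import math

def bathtub_hazard(t_years: float) -> float:
    """Toy bathtub-shaped hazard rate (failures per drive-year).
    All constants are illustrative, not fitted to real data."""
    infant = 0.20 * math.exp(-t_years / 0.5)          # early manufacturing defects
    constant = 0.01                                    # random failures mid-life
    wearout = 0.002 * math.exp(max(0.0, t_years - 8.0))  # wear-out after ~8 years
    return infant + constant + wearout

for t in (0, 1, 4, 8, 12):
    print(f"t = {t:>2} years  ->  hazard {bathtub_hazard(t):.4f}")
```

The sum of a decaying "infant mortality" term, a flat mid-life term, and a growing wear-out term reproduces the high-low-high profile the curve is named for.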
What's next?
It's HAMR Time
Hard drives as we know them are the end result of a finely tuned manufacturing process. Many of the problems that have plagued these devices over the past five years have either been fixed or mitigated. Comparatively noticeable changes in HDD production, such as the switch to helium-filled drives, have not yet had a serious impact on their failure rate. Other changes, such as the move from perpendicular magnetic recording (PMR) to heat-assisted magnetic recording (HAMR), should not significantly affect the lifespan of hard drives either, provided the new technologies do not bring new problems with them.
In general, the technological future of the HDD looks, in every sense, rather boring. Hard drives will remain low-cost, large-capacity storage devices that work reliably for a dozen years or more. The basic principle behind them, magnetizing small regions of the platters, may eventually advance to the point where individual molecules play the role of those "regions". Add something like HAMR to that, and you can expect a significant increase in how long data can be stored on an HDD.
Hard drives have a significant advantage over NAND flash, which stores data as electric charge trapped in tiny cells and whose write process physically degrades those cells. The physical limitations of such memory are much more severe than those of hard drives. This drives up the complexity of the memory design, leading, for example, to drives based on cells that store four bits each (quad-level cell, QLC). To read such a cell, 16 distinct voltage levels must be distinguished. Because of the complexity of QLC memory, the corresponding SSDs in many scenarios turn out to be only slightly faster than 5400 rpm hard drives, especially where data-access latency is concerned.
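The arithmetic behind that "16 voltage levels" figure is simply 2^n for n bits per cell. A quick illustration:

```python
def voltage_levels(bits_per_cell: int) -> int:
    """A cell storing n bits must distinguish 2**n charge levels."""
    return 2 ** bits_per_cell

# Common NAND cell types and the levels each must resolve:
for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
    print(f"{name}: {bits} bit(s) per cell -> {voltage_levels(bits)} levels")
```

Each extra bit doubles the number of levels squeezed into the same charge range, which is why QLC reads are slower and more error-prone than SLC reads.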
Conclusion
My first hard drive was a 20 or 30 megabyte Seagate in an IBM PS/2 (386SX). My father brought this computer home from work; his office was switching to new PCs and probably wanted to clear old equipment out of the storerooms. In the days of MS-DOS, 20–30 MB was quite enough for the OS, a pile of games, WordPerfect 5.1, and much more. Of course, by the end of the 90s such a capacity looked ridiculous; by then, people talking about hard disks were no longer counting in megabytes, but gigabytes.
Although I have owned many desktops and laptops since then, the only drive that has, so to speak, died in my hands was, ironically, an SSD. This, along with publications about hard drives such as the Backblaze reports, gives me confidence that the day when the platters of the last HDD stop spinning is still very far away. Perhaps this forecast will change only when something like 3D XPoint makes sufficiently large and affordable drives possible. Until then, let everything carry on as usual.
Have you encountered hard drive failures?