How to recover data in the HFS+ file system

In this overview, we look at the structure of the HFS+ file system and how it differs from its predecessor, HFS. We also show how to recover information from media formatted with HFS+.






Distinctive features of HFS+



The distinguishing feature of HFS+ is its 32-bit block addressing, which replaced the 16-bit addressing of HFS. The old scheme was a serious constraint, because it capped a volume at 65,536 allocation blocks regardless of its capacity.



For example, on a one-gigabyte drive the block size had to be sixteen kilobytes, so even the smallest one-byte file occupied an entire sixteen-kilobyte block.
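The arithmetic behind this example can be checked with a short sketch. The function name `min_block_size` and its parameters are hypothetical, and the result is only the theoretical minimum block size, not the default a real formatting tool would choose.

```python
def min_block_size(volume_bytes: int, address_bits: int, sector: int = 512) -> int:
    """Smallest allocation block (a multiple of the sector size) that lets
    2**address_bits block numbers cover the whole volume."""
    max_blocks = 2 ** address_bits
    size = -(-volume_bytes // max_blocks)          # ceiling division
    return max(sector, -(-size // sector) * sector)  # round up to whole sectors


GIB = 1024 ** 3
hfs_block = min_block_size(GIB, 16)      # 16-bit addressing, as in HFS
hfsplus_block = min_block_size(GIB, 32)  # 32-bit addressing, as in HFS+

print("HFS block size: ", hfs_block)       # 16384
print("HFS+ block size:", hfsplus_block)   # 512
print("Bytes wasted by a 1-byte file on HFS:", hfs_block - 1)
```

With 16-bit block numbers, a one-byte file on this volume leaves 16,383 bytes of its block unused.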



Like HFS before it, HFS+ uses B-trees to store a significant portion of its service information.





An HFS+ volume is divided into 512-byte sectors. One or more sectors are combined into a cluster (allocation block); the number of sectors per cluster depends on the total capacity of the drive. With 32-bit addressing, a volume can address up to 4,294,967,296 clusters, compared with 65,536 in HFS. The two systems differ in other respects as well:

  • File name length: 31 characters in HFS versus 255 in HFS+.
  • File name encoding: Mac Roman in HFS versus Unicode in HFS+.
  • Catalog node size: 512 bytes in HFS versus 4 kilobytes in HFS+.
  • Maximum file size: 2^31 bytes in HFS versus 2^63 bytes in HFS+.



Filesystem architecture



Space on the volume is divided into sectors, typically 512 bytes each. Sectors are grouped into allocation blocks of one or more sectors; the number of sectors per allocation block depends on the total size of the volume.



HFS+ stores multi-byte values in big-endian byte order, and allocation block numbers are 32 bits wide.
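Byte order matters as soon as raw sectors are inspected by hand. The following minimal sketch uses Python's standard `struct` module to show how the same four bytes decode under each byte order; the sample bytes are made up for illustration.

```python
import struct

# Four raw bytes as they might appear in an on-disk 32-bit field (made-up sample).
raw = bytes([0x00, 0x00, 0x10, 0x00])

big = struct.unpack(">I", raw)[0]     # big-endian, the order HFS+ uses
little = struct.unpack("<I", raw)[0]  # little-endian, the wrong reading here

print(big)     # 4096
print(little)  # 1048576
```

Reading an HFS+ field with the wrong byte order yields a plausible-looking but incorrect number, so a recovery script should state the byte order explicitly.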






HFS+ stores service information on disk in the form of metadata files that organize and manage data placement. The most important of them, both for data recovery and for the health of the file system, are the following:



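  • Volume Header. General information about the volume, such as the allocation block size and the locations (extents) of the other service files.
  • Allocation File. A bitmap that marks every allocation block as used or free.
  • Catalog File. Records the location of folders and files on the volume.
  • Extents Overflow File. Holds the additional extents of files that are too fragmented to be described by their basic extent records.
  • Bad Block File. Keeps a record of bad sectors.
  • Startup File. Helps start operating systems that cannot work with HFS+ on their own.
  • Journal. A reserved area where changes are logged before they are applied.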


HFS+ contains other elements besides those listed, but these are the ones that matter most when information urgently needs to be recovered. First, let us look at two fundamental concepts: B-trees and extents.



A Brief Explanation of the B-Tree



HFS+ stores its service data in a tree structure. The balanced layout of pages allows varying amounts of information to be written into cells of a fixed size. The basic principle is as follows. Suppose a file of one hundred megabytes must be placed in cells of four kilobytes. The first block holds links to the subsequent cells in which the information itself is written. In addition to data, cells can also contain links to a further level of blocks. Cells that hold links are called nodes (index nodes), and the cells that store the data are called leaves.
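The split between nodes that hold links and leaves that hold data can be illustrated with a toy lookup. This is a simplified in-memory sketch, not the on-disk HFS+ node format; the `Node` class, the `search` function, and the sample records are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list                                     # sorted keys
    children: list = field(default_factory=list)   # links to child nodes (index node only)
    values: list = field(default_factory=list)     # stored records (leaf only)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def search(node: Node, key):
    """Follow links from the root down to a leaf, then look for the key there."""
    while not node.is_leaf:
        # Assumption: keys[i] is the smallest key reachable through children[i].
        i = max(bisect_right(node.keys, key) - 1, 0)
        node = node.children[i]
    for k, v in zip(node.keys, node.values):
        if k == key:
            return v
    return None


leaf1 = Node(keys=["a.txt", "b.txt"], values=["record A", "record B"])
leaf2 = Node(keys=["m.txt", "z.txt"], values=["record M", "record Z"])
root = Node(keys=["a.txt", "m.txt"], children=[leaf1, leaf2])

print(search(root, "m.txt"))  # record M
print(search(root, "c.txt"))  # None
```

Only the leaves carry records here; the root exists purely to route the search, which is the role the text assigns to nodes with links.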






Extent and Extent Overflow File



The system uses extent records to store information about where a particular file is located on disk. A file usually has from zero to eight of them. Each extent record contains the number of the first allocation block that stores the data and the count of contiguous blocks occupied. If a file is heavily fragmented and the available extent records are not enough to describe all of its fragments, the system records the remainder in the Extents Overflow File.
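To see how extent records translate a position inside a file into a position on disk, consider the following sketch. The `locate` function, the tuple layout `(start_block, block_count)`, and the sample values are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Extent = Tuple[int, int]  # (start_block, block_count)


def locate(extents: List[Extent], offset: int, block_size: int) -> Optional[Tuple[int, int]]:
    """Map a byte offset within a file to (disk_block, offset_in_block).

    Returns None if the offset lies beyond the listed extents, which is
    where a real system would consult the Extents Overflow File.
    """
    file_block, within = divmod(offset, block_size)
    for start, count in extents:
        if file_block < count:
            return start + file_block, within
        file_block -= count  # skip this extent and continue with the next one
    return None


# A file stored in two fragments: 4 blocks starting at block 100, then 2 at block 500.
extents = [(100, 4), (500, 2)]

print(locate(extents, 0, 4096))              # (100, 0)
print(locate(extents, 5 * 4096 + 10, 4096))  # (501, 10)
print(locate(extents, 6 * 4096, 4096))       # None
```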



Understanding the volume header



The volume header always starts 1,024 bytes from the beginning of the volume, which is sector 2 when 512-byte sectors are counted from zero. It contains general information about the other building blocks of the file system, such as the size of allocation blocks and the addresses of the service files. A backup copy of the volume header is stored at the opposite end of the volume, starting 1,024 bytes before its end (the second-to-last sector).
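A recovery script typically begins by locating and validating the volume header. The sketch below reads only the first fields, following the layout published in Apple's Technical Note TN1150; the function name, the returned dictionary keys, and the synthetic test image are hypothetical, and the code is not a complete parser.

```python
import struct

HEADER_OFFSET = 1024
# signature (2 bytes), version (16-bit), then twelve 32-bit fields:
# attributes, lastMountedVersion, journalInfoBlock, createDate, modifyDate,
# backupDate, checkedDate, fileCount, folderCount, blockSize, totalBlocks, freeBlocks
HEADER_FMT = ">2sH12I"


def parse_volume_header(volume: bytes, offset: int = HEADER_OFFSET) -> dict:
    size = struct.calcsize(HEADER_FMT)
    fields = struct.unpack(HEADER_FMT, volume[offset:offset + size])
    signature = fields[0]
    if signature not in (b"H+", b"HX"):  # "HX" marks the HFSX variant
        raise ValueError("not an HFS+ volume header")
    return {
        "signature": signature.decode("ascii"),
        "version": fields[1],
        "file_count": fields[9],
        "folder_count": fields[10],
        "block_size": fields[11],
        "total_blocks": fields[12],
        "free_blocks": fields[13],
    }


# Synthetic image for demonstration only: 1,024 empty bytes, then a fake header.
fake_header = struct.pack(HEADER_FMT, b"H+", 4, 0, 0, 0, 0, 0, 0, 0, 10, 3, 4096, 1000, 250)
image = bytes(HEADER_OFFSET) + fake_header + bytes(512)

print(parse_volume_header(image))
```

For the backup copy, the same function can be called with `offset=len(volume) - 1024`, assuming the whole volume is available as a byte string.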






Used disk space map



The Allocation File holds information about every allocation block, both free and used. Each block is marked with a single bit: "1" means used, "0" means free. This representation is called a bitmap. The Allocation File itself does not have to occupy adjacent blocks; the locations of its fragments are recorded in the volume header.
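Checking whether a block is in use comes down to testing one bit. The sketch below assumes the bit ordering described in Apple's Technical Note TN1150, where the most significant bit of each byte corresponds to the lowest-numbered block; the function names and the sample bitmap are hypothetical.

```python
def is_block_used(bitmap: bytes, block: int) -> bool:
    """Return True if the allocation block is marked as used ("1")."""
    byte_index, bit_index = divmod(block, 8)
    mask = 0x80 >> bit_index  # bit 7 of each byte describes the lowest-numbered block
    return bool(bitmap[byte_index] & mask)


def free_blocks(bitmap: bytes, total_blocks: int) -> list:
    """List the numbers of all blocks marked as free ("0")."""
    return [b for b in range(total_blocks) if not is_block_used(bitmap, b)]


# Sample bitmap for 16 blocks: blocks 0, 1, 2 and 15 are used.
bitmap = bytes([0b11100000, 0b00000001])

print(is_block_used(bitmap, 2))  # True
print(is_block_used(bitmap, 3))  # False
print(free_blocks(bitmap, 16))   # [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```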






Purpose of the catalog file



The tree structure used to store files is quite extensive, so a separate file records the location of folders and files on the drive. As in HFS, in HFS+ this is the catalog file, but its capacity has been increased significantly. Records in the HFS+ catalog are larger, which considerably expands what can be stored in them. The record size is not fixed and can vary as requirements dictate.



As a rule, each record stores a small amount of information, no more than four kilobytes in total. If more space is needed, the corresponding extents are used.



Purpose of the startup file



The Startup File is mainly intended to help start operating systems that cannot detect and work with HFS+ on their own. Its principle is similar to that of the HFS boot blocks.



Bad sectors list



This element is a system record of all bad sectors on the volume and of the sectors that have been remapped in their place.



The journal



The journal is a reserved area on the disk. When the system needs to make changes, it first writes them to the journal and only then updates the corresponding files. If an unexpected failure occurs, this approach makes it possible to bring the file system back to a consistent state.



The area allocated for the journal has a finite size, so the system regularly overwrites old entries with new data. The overwrite interval differs from device to device and ranges from several tens of seconds to tens of minutes.
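The "journal first, files second" order can be illustrated with a toy model. This is an in-memory sketch of the general write-ahead idea only; it does not reproduce the HFS+ journal format, and the class and method names are hypothetical.

```python
class ToyJournaledVolume:
    """Toy model: changes go to the journal before they reach the files."""

    def __init__(self):
        self.journal = []  # changes recorded but not yet applied
        self.files = {}    # stand-in for the real file system structures

    def log(self, changes: dict) -> None:
        """Step 1: record the intended changes in the journal."""
        self.journal.append(dict(changes))

    def apply_journal(self) -> None:
        """Step 2: apply journaled changes, then discard the entries.

        The same routine can be run after a failure to replay changes
        that were logged but never applied.
        """
        for changes in self.journal:
            self.files.update(changes)
        self.journal.clear()


volume = ToyJournaledVolume()
volume.log({"report.txt": "version 2"})
# Imagine a power failure here: the files are untouched, but the journal survives.
print(volume.files)    # {}
volume.apply_journal() # replay after restart
print(volume.files)    # {'report.txt': 'version 2'}
```

Because journal entries are discarded once applied and the area is reused, older changes cannot be read back from it later, which matches the limited overwrite interval described above.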






Time Machine: a versatile recovery tool



Starting with Mac OS X Leopard, the operating system for Mac devices includes the Time Machine backup tool. Its main purpose is to record every change to the system so that data can be safely restored if something goes wrong.



To do its job, Time Machine must be given a separate storage medium. This can be an external drive, an internal hard drive, or a USB storage device. Alternatively, you can use an Apple Time Capsule, a network drive designed specifically for Time Machine, to which the backups are written. On first use, the tool makes a full copy; after that, it saves only the changes.






A viable HFS+ data recovery option



If users need to recover lost data, doing so on HFS+ is considerably more difficult. Because the file system stores its service information in a tree structure, the B-tree must be updated after every change, including deletions. Once it has been rewritten, the information about the location of the lost item is gone.



In such cases, universal recovery software can help. You can try different programs and compare how well each of them finds and restores files.



Conclusion



HFS+ replaced HFS, but despite its advantages it is itself already being replaced by a newer file system, APFS. Certain shortcomings of HFS+ make recovery a real concern. Even so, and despite the apparent complexity of the system, lost data can be recovered both from a backup and with the help of specialized programs.



See the source for the full article with additional video tutorials. If you still have questions, ask them in the comments.


