Seamless A/B Updates in Android: How They Work




Hello. Our team at SberDevices develops various hardware devices and AOSP-based firmware for them.



Starting with Android 8 (7.1 for some vendors), the system has a new mechanism for rolling out OTA updates: so-called seamless A/B OTA updates. In this post I will describe the general principles of its operation, look at the mechanism from a developer's point of view, and compare it with the old approach to applying updates (we will call it recovery-based). Everything below holds only for pure AOSP, since the specific implementation depends on the vendor.



Recovery-based OTA



Android updates are delivered as a zip archive containing block-based updates. Back in the KitKat days, the archive was just a set of files that the bundled script copied to the device. I will not dwell on this mode in detail and will only briefly describe how it works:



  • the system downloads the zip archive to the device;
  • the system reboots into recovery mode;
  • the update's compatibility with the device and its signature are checked;
  • if everything is OK, the updater-script from the zip archive is executed;
  • during the update, the device may reboot several times (for example, to update the device tree);
  • if everything went well, the device boots into the new firmware.


What are the disadvantages of this scheme?



  • Enough internal storage has to be reserved for the OTA archive. The /cache partition is used for this. Some vendors use /data, but this is rare. As a result, the user is left with less space (yes, applications can still use space in the /cache partition, but with some restrictions).
  • Rebooting and applying the update takes time, which can be critical for some types of devices, such as Smart TVs.
  • Interrupting the update process may result in a boot loop.
  • There is no way to roll back to the old firmware version.


Seamless updates let you avoid these drawbacks. Let's see how they work.



Seamless A/B OTA



Key components and mechanisms required to implement seamless A/B updates:



  • slot-based partitioning of flash memory;
  • interaction with the bootloader and management of slot states;
  • the update_engine system daemon;
  • generation of a zip archive with the update. This aspect is not covered in this article.


Slots



The basic principle of A/B OTA is slotting. Every partition that needs to be updated (this can be any partition, not just system) must exist in two copies, or slots. The Android implementation supports two slots, named A and B. The system boots and runs from the current slot; the other one is used only during an update. A suffix with the slot name is appended to the partition name.
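The naming rule can be illustrated with a small sketch. It assumes the `_a` / `_b` suffixes used by AOSP; the helper names `slotted_name` and `inactive_slot` are hypothetical and exist only for this illustration.

```python
SLOT_SUFFIXES = ("_a", "_b")  # assumption: AOSP-style suffixes for slots A and B


def slotted_name(partition: str, slot: int) -> str:
    """Return the partition name for the given slot index (0 = A, 1 = B)."""
    return partition + SLOT_SUFFIXES[slot]


def inactive_slot(current: int) -> int:
    """With exactly two slots, the update target is simply the other one."""
    return 1 - current


current = 0  # pretend the system is running from slot A
for name in ("boot", "system", "vendor"):
    # Left: what the system runs from; right: what an update would write to.
    print(slotted_name(name, current), "->", slotted_name(name, inactive_slot(current)))
```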



Below is a table comparing the two options for organizing partitions on a device. All slotted partitions are marked with the slotselect mount option so that the system can select the correct slot. Depending on where the partitions are described, this is done either in fstab or in dts.

[image: fstab example]



Changes to the partition table



  • /cache is no longer needed. The update can now be saved to /data or written directly into the inactive slot (more on that below).
  • The recovery partition is also no longer used. However, recovery mode still exists: it is needed, for example, to reset the device to factory settings (Rescue Party can trigger this) or for a so-called manual update (sideload) via adb. The recovery ramdisk now lives inside the boot partition, and the kernel is shared.
  • To switch between boot modes (android/recovery), a new kernel cmdline option is used: skip_initramfs.


At first glance, this scheme seems suboptimal, since twice as much space has to be allocated for the system. But we got rid of /cache, which already saves a lot of space. As a result, the system takes up only slightly more space than in the recovery-based option.



The main advantage of A/B updates is the ability to stream the firmware. This is what makes updates seamless and transparent for the user: to update the device, it is enough to reboot into the new slot. In this mode there is no need to download the zip archive in advance and take up space in /data. Instead, the system writes data blocks from a specially prepared file (the payload, see below) directly into each partition of the inactive slot. From the implementation's point of view, it makes no difference whether the update is downloaded in advance or streamed straight into the slot.



Slots have the following states:



  • active - the active slot; the system will boot from it at the next reboot;
  • bootable - the update was successfully written to the slot and validated (hash sums matched, etc.);
  • successful - the system was able to boot into the new slot;
  • unbootable - the slot is damaged. The system always marks the slot as unbootable before starting the update process.


Both slots can be bootable and successful, but only one can be active.



The bootloader's algorithm for choosing a slot:


  • The bootloader detects that there are one or more bootable slots.
  • The active slot (or the slot with the highest priority) is selected from them.
  • If the system boots successfully, the slot is marked as successful and active.
  • Otherwise, the slot is marked as unbootable and the system reboots.
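The steps above can be modelled in a few lines. This is only a simplified sketch, not actual bootloader code: the `Slot` class, the numeric `priority` field and the `boots_ok` callback (standing in for a real boot attempt) are assumptions made for the illustration, and real bootloaders are vendor-specific.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    priority: int            # higher value wins; the active slot has the highest
    bootable: bool = True
    successful: bool = False


def choose_slot(slots):
    """Pick the bootable slot with the highest priority, or None if there is none."""
    candidates = [s for s in slots if s.bootable]
    return max(candidates, key=lambda s: s.priority) if candidates else None


def boot(slots, boots_ok):
    """Try slots until one boots; boots_ok(slot) simulates the boot attempt."""
    while (slot := choose_slot(slots)) is not None:
        if boots_ok(slot):
            slot.successful = True
            return slot
        slot.bootable = False    # mark as unbootable, then "reboot" and retry
    return None                  # no bootable slots left


slots = [Slot("a", priority=14, successful=True), Slot("b", priority=15)]
booted = boot(slots, boots_ok=lambda s: s.name == "a")   # pretend slot B is broken
print(booted.name, [(s.name, s.bootable) for s in slots])
```

In this run the freshly updated slot B is tried first, fails, is marked unbootable, and the device falls back to slot A, which is the rollback behaviour described in this article.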




[image: slot state changes during an update]
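The state changes during an update, as they follow from the slot states listed earlier, can be sketched like this. The dictionary layout and the function names `apply_update` and `first_boot` are hypothetical; the real transitions are performed through the boot_control HAL and depend on the vendor's implementation.

```python
def apply_update(slots, current, write_ok=True):
    """Model the slot flags while an update is written to the inactive slot."""
    target = "b" if current == "a" else "a"
    # The target slot is marked unbootable before anything is written to it.
    slots[target] = {"bootable": False, "successful": False, "active": False}
    if not write_ok:
        return target            # interrupted or failed: the current slot is untouched
    # Payload written and verified: the target becomes bootable and active.
    slots[target].update(bootable=True, active=True)
    slots[current]["active"] = False
    return target


def first_boot(slots, target, verified):
    """After the reboot: mark the slot successful, or fall back to the old one."""
    previous = "b" if target == "a" else "a"
    if verified:
        slots[target]["successful"] = True
        return target
    slots[target].update(bootable=False, active=False)
    slots[previous]["active"] = True
    return previous


slots = {
    "a": {"bootable": True, "successful": True, "active": True},
    "b": {"bootable": True, "successful": True, "active": False},
}
target = apply_update(slots, current="a")
running = first_boot(slots, target, verified=True)
print(running, slots)
```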



Prerequisites for seamless A/B



boot_control



To support A/B updates, the vendor must implement a special HAL interface, boot_control. It lets you change slot states and query information about them. For use from outside (for example, via adb shell) there is a utility, bootctl. The interface serves as the communication channel between the OS and the bootloader.



update_engine



The main component of the entire A/B scheme. It handles downloading and streaming updates, verifying signatures, and more. It changes slot states via boot_control and lets you control the update process: pause, resume, cancel.

The component came to Android from ChromeOS, where it has been in use for some time. AOSP also supports building update_engine as a statically linked sideload binary. This is the build used in recovery, because that mode does not support dynamic linking.



The work of this component can be divided into the following steps:



  • writing the update into the slot. The source can be either a previously downloaded update package or a direct stream over the Internet via http/https. The signature is checked during the download; the public key is already on the device (/system/etc/update_engine/update-payload-key.pub.pem);
  • verifying the written update and comparing hash sums;
  • running post-install scripts.


Update package structure:



2009-01-01 00:00:00 .....          360          360  META-INF/com/android/metadata
2009-01-01 00:00:00 .....          107          107  care_map.txt
2009-01-01 00:00:00 .....    384690699    384690699  payload.bin
2009-01-01 00:00:00 .....          154          154  payload_properties.txt
2009-01-01 00:00:00 .....         1675          943  META-INF/com/android/otacert


  • care_map.txt - used by update_verifier (see below);
  • payload_properties.txt - contains the hashes and sizes of the data inside the payload;
  • payload.bin - the update package itself; contains the blocks of all partitions, metadata, and the signature.
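To show how these files relate to each other, here is a minimal sketch that reads payload_properties.txt from an archive and checks payload.bin against it. It assumes that FILE_HASH is the base64-encoded SHA-256 of the whole payload.bin and that FILE_SIZE is its length in bytes. The helpers `read_properties` and `payload_matches` are hypothetical, and the archive is a tiny stand-in built in memory rather than a real OTA package.

```python
import base64
import hashlib
import io
import zipfile


def read_properties(ota_zip):
    """Parse payload_properties.txt (one KEY=VALUE pair per line)."""
    text = ota_zip.read("payload_properties.txt").decode("ascii")
    # Split on the first "=" only: base64 values may end with "=" padding.
    return dict(line.split("=", 1) for line in text.splitlines() if line)


def payload_matches(ota_zip, props):
    """Check payload.bin against FILE_SIZE and FILE_HASH (assumed base64 SHA-256)."""
    data = ota_zip.read("payload.bin")   # fine for a sketch; stream real payloads
    digest = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
    return len(data) == int(props["FILE_SIZE"]) and digest == props["FILE_HASH"]


# Build a stand-in archive in memory so the sketch runs without a real OTA.
fake_payload = b"not a real payload"
fake_hash = base64.b64encode(hashlib.sha256(fake_payload).digest()).decode("ascii")
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("payload.bin", fake_payload)
    z.writestr("payload_properties.txt",
               f"FILE_HASH={fake_hash}\nFILE_SIZE={len(fake_payload)}\n")

with zipfile.ZipFile(buf) as z:
    props = read_properties(z)
    matches = payload_matches(z, props)
print(props, matches)
```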


update_engine_client



A client for managing the update_engine daemon. The vendor can call it directly to apply an update.



update_verifier



A utility that checks the integrity of the system on the first boot into a new slot (one flagged active but not yet successful). Integrity checking is implemented using the dm-verity kernel module. If the check passes, the utility marks the current slot as successful. Otherwise, the system reboots into the old slot. Only the blocks listed in care_map.txt are verified.



UpdateEngineApi



There is a Java API for implementing vendor update services. There is also an example of such a service implementation.



Let's look at an example of building an A/B update in AOSP. To do this, edit the Makefile of the target platform:



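# Enable A/B
AB_OTA_UPDATER := true
# Partitions to update:
AB_OTA_PARTITIONS := boot system vendor
# Required packages (+= so that previously added packages are kept)
PRODUCT_PACKAGES += update_engine update_engine_client update_verifier
# Disable the recovery partition
TARGET_NO_RECOVERY := true
# Make sure the cache partition is NOT defined:
#BOARD_CACHEIMAGE_PARTITION_SIZE := ...
#BOARD_CACHEIMAGE_FILE_SYSTEM_TYPE := ...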


After running make otapackage, we get a zip archive with the update. In this form it is already suitable for sideload mode: we can reboot into recovery and run adb sideload ota.zip. This method is convenient for debugging.



Applying an update from within a running production system is usually vendor-specific. The easiest way is to upload payload.bin to an HTTP server and call update_engine_client directly.



Call example:



update_engine_client \
--payload=http://path/to/payload.bin \
--update \
--headers="\
FILE_HASH=ozGgyQEddkI5Zax+Wbjo6I/PCR8PEZka9gGd0nWa+oY=
FILE_SIZE=282344983
METADATA_HASH=GLIKfE6KRwylWMHsNadG/Q8iy5f786WTatvMdBlpOPg=
METADATA_SIZE=26723"


The contents of the payload_properties.txt file are passed in the headers parameter. You can follow the update progress in logcat. If you pass the --follow switch, the progress is also printed to stdout.
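Assembling this call by hand is error-prone, so a vendor script might build it from the properties file. The sketch below uses only the switches shown in this article (--payload, --update, --headers, --follow). The helper `build_command` is hypothetical, and it assumes the headers are expected as one KEY=VALUE pair per line.

```python
import shlex


def build_command(payload_url, properties_text, follow=False):
    """Build an update_engine_client argv from payload_properties.txt contents."""
    lines = [line.strip() for line in properties_text.splitlines()]
    headers = "\n".join(line for line in lines if line)   # assumed: one pair per line
    argv = [
        "update_engine_client",
        f"--payload={payload_url}",
        "--update",
        f"--headers={headers}",
    ]
    if follow:
        argv.append("--follow")
    return argv


properties = """FILE_HASH=ozGgyQEddkI5Zax+Wbjo6I/PCR8PEZka9gGd0nWa+oY=
FILE_SIZE=282344983
METADATA_HASH=GLIKfE6KRwylWMHsNadG/Q8iy5f786WTatvMdBlpOPg=
METADATA_SIZE=26723
"""
argv = build_command("http://path/to/payload.bin", properties, follow=True)
print(shlex.join(argv))   # shell-quoted form, e.g. to paste into adb shell
```

Building an argument list rather than a single string keeps the multi-line headers value intact; shlex.join is used only to print a copy that is safe to paste into a shell.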



Conclusion



The advantages of the new update mechanism are obvious:



  • the system update takes place in the background without interrupting the user's work. Yes, a reboot (into the new slot) is still needed, but it is much faster than rebooting into recovery to apply the update;
  • the probability of a boot loop is minimized (although no one is immune to implementation errors). The update process can be interrupted without affecting the active slot in any way;
  • it becomes possible to roll back to the previous firmware version. Even if the update fails for some reason, the system simply reverts to the old version;
  • thanks to streaming, the device updates faster;
  • depending on the implementation, the user can be excluded from the update process entirely.


As for the downsides, I would single out two points:



  • A/B OTA becomes dependent on the current partition layout, because the update is applied while the system is running. That is, it becomes impossible to roll out an update that changes the partitioning;
  • the relative complexity of the implementation.


And yet, in my opinion, the pros outweigh the cons. By the way, we use A/B OTA updates in our recently announced device.


