🎪 👨🏼‍🏭 👨‍👧‍👧 Checking QEMU with PVS-Studio 🌵 👶🏾 🏺

QEMU is a fairly well-known emulation application. Static analysis can help developers of complex projects like QEMU catch errors at an early stage and generally improve its quality and reliability. In this article, the source code of the QEMU application will be checked for potential vulnerabilities and errors using the PVS-Studio static analysis tool.

QEMU is free software designed to emulate hardware across platforms. It allows you to run applications and operating systems on hardware platforms other than the target, for example, an application written for MIPS to run for the x86 architecture. QEMU also supports emulation of a variety of peripherals such as video cards, usb, etc. The project is quite complex and worthy of attention, it is such projects that are of interest for static analysis, so it was decided to check its code using PVS-Studio.

About analysis

The source code of the project can be obtained from a mirror on github . The project is quite large and can be compiled for various platforms. For easier code checking, we will use the PVS-Studio compilation monitoring system . This system is designed for very easy integration of static analysis into almost any build platform. The system is based on tracking compiler calls during build and allows you to collect all the information for subsequent analysis of files. In other words, we just start the build, PVS-Studio collects the necessary information, and then we start the analysis - everything is simple. Details can be found at the link above.

After checking, the analyzer found many potential problems. For general-purpose diagnostics (General Analysis), it was obtained: 1940 High, 1996 Medium, 9596 Low. After reviewing all the warnings, it was decided to focus on diagnostics for the first confidence level (High). Quite a lot of such warnings were found (1940), but most of the warnings are either of the same type or associated with repeated use of a suspicious macro. For example, consider the macro g_new .

#define g_new(struct_type, n_structs)
                        _G_NEW (struct_type, n_structs, malloc)

#define _G_NEW(struct_type, n_structs, func)       \
  (struct_type *) (G_GNUC_EXTENSION ({             \
    gsize __n = (gsize) (n_structs);               \
    gsize __s = sizeof (struct_type);              \
    gpointer __p;                                  \
    if (__s == 1)                                  \
      __p = g_##func (__n);                        \
    else if (__builtin_constant_p (__n) &&         \
             (__s == 0 || __n <= G_MAXSIZE / __s)) \
      __p = g_##func (__n * __s);                  \
    else                                           \
      __p = g_##func##_n (__n, __s);               \
    __p;                                           \
  }))

For each use of this macro, the analyzer issues warning V773 (Visibility scope of the '__p' pointer was exited without releasing the memory. A memory leak is possible). The g_new macro is defined in the glib library, it uses the _G_NEW macro , and this macro in turn uses another macro, G_GNUC_EXTENSION , which tells the GCC compiler to skip the warnings about non-standard code. It is this non-standard code that causes the analyzer's warning, pay attention to the penultimate line. In general, the macro is working. There were 848 warnings of this type, that is, almost half of the alerts occur in just one single place in the code.

All these unnecessary warnings are easy to removeusing the analyzer settings. However, this particular case, which we encountered while writing this article, is the reason for our team to slightly modify the analyzer logic for such situations.

Thus, a large number of warnings does not always mean bad code quality. However, there are some really suspicious places. Well, let's get down to the warnings.

Warning N1

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 2395, 2397. megasas.c 2395

#define MEGASAS_MAX_SGE 128             /* Firmware limit */
....
static void megasas_scsi_realize(PCIDevice *dev, Error **errp)
{
  ....
  if (s->fw_sge >= MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE) {
    ....
  } else if (s->fw_sge >= 128 - MFI_PASS_FRAME_SIZE) {
    ....
  }
  ....
}

Any use of "magic" numbers in the code is always suspicious. There are two conditions here, and at first glance they seem to be different, but if you look at the value of the MEGASAS_MAX_SGE macro , it turns out that the conditions duplicate each other. Most likely, there is a typo here and instead of 128 there should be another number. Of course, this is the problem of all "magic" numbers, it is enough just to be sealed when using them. The use of macros and constants greatly helps the developer in this case.

Warning N2

V523 The 'then' statement is equivalent to the 'else' statement. cp0_helper.c 383

target_ulong helper_mftc0_cause(CPUMIPSState *env)
{
  ....
  CPUMIPSState *other = mips_cpu_map_tc(env, &other_tc);

  if (other_tc == other->current_tc) {
    tccause = other->CP0_Cause;
  } else {
    tccause = other->CP0_Cause;
  }
  ....
}

In the code under consideration, the then and else bodies of the conditional operator are identical. Here, most likely copy-paste. Just copy the body then the branch , and fix forgotten. It can be assumed that env should have been used instead of the other object . The fix for this suspicious place might look like this:

if (other_tc == other->current_tc) {
  tccause = other->CP0_Cause;
} else {
  tccause = env->CP0_Cause;
}

Only the developers of this code can unequivocally say how it should actually be. Another similar place:

V523 The 'then' statement is equivalent to the 'else' statement. translate.c 641

Warning N3

V547 Expression 'ret <0' is always false. qcow2-cluster.c 1557

static int handle_dependencies(....)
{
  ....
  if (end <= old_start || start >= old_end) {
    ....
  } else {

    if (bytes == 0 && *m) {
      ....
      return 0;           // <= 3
    }

    if (bytes == 0) {
      ....
      return -EAGAIN;     // <= 4
    }
  ....
  }
  return 0;               // <= 5
}

int qcow2_alloc_cluster_offset(BlockDriverState *bs, ....)
{
  ....
  ret = handle_dependencies(bs, start, &cur_bytes, m);
  if (ret == -EAGAIN) {   // <= 2
    ....
  } else if (ret < 0) {   // <= 1
    ....
  }
}

Here the analyzer found that the condition (comment 1) would never be met. The value of the ret variable is initialized by the result of executing the handle_dependencies function , this function only returns 0 or -EAGAIN (comments 3, 4, 5). A little higher, in the first condition, we checked the value of ret against -EAGAIN (comment 2), so the result of the expression ret <0 will always be false. Perhaps the handle_dependencies function used to return different values, but later, as a result of, for example, refactoring, the behavior changed. Here you just need to complete the refactoring. Similar triggers:

V547 Expression is always false. qcow2.c 1070
V547 Expression 's-> state! = MIGRATION_STATUS_COLO' is always false. colo.c 595
V547 Expression 's-> metadata_entries.present & 0x20' is always false. vhdx.c 769

Warning N4

V557 Array overrun is possible. The 'dwc2_glbreg_read' function processes value '[0..63]'. Inspect the third argument. Check lines: 667, 1040.hcd-dwc2.c 667

#define HSOTG_REG(x) (x)                                             // <= 5
....
struct DWC2State {
  ....
#define DWC2_GLBREG_SIZE    0x70
  uint32_t glbreg[DWC2_GLBREG_SIZE / sizeof(uint32_t)];              // <= 1
  ....
}
....
static uint64_t dwc2_glbreg_read(void *ptr, hwaddr addr, int index,
                                 unsigned size)
{
  ....
  val = s->glbreg[index];                                            // <= 2
  ....
}
static uint64_t dwc2_hsotg_read(void *ptr, hwaddr addr, unsigned size)
{
  ....
  switch (addr) {
    case HSOTG_REG(0x000) ... HSOTG_REG(0x0fc):                      // <= 4
        val = dwc2_glbreg_read(ptr, addr,
                              (addr - HSOTG_REG(0x000)) >> 2, size); // <= 3
    ....
  }
  ....
}

This code has a potential problem with array overrun. In the structure DWC2State defined array glbreg , consisting of 28 elements (comment 1). In the dwc2_glbreg_read function, we refer to our array by index (comment 2). Now, note that the expression ( addr - HSOTG_REG (0x000)) >> 2 (comment 3) is passed as an index to the dwc2_glbreg_read function , which can take a value in the range [0..63]. In order to be convinced of this, pay attention to comments 4 and 5. Perhaps, here it is necessary to adjust the range of values from comment 4. More similar triggers:

V557 Array overrun is possible. The 'dwc2_hreg0_read' function processes value '[0..63]'. Inspect the third argument. Check lines: 814, 1050.hcd-dwc2.c 814
V557 Array overrun is possible. The 'dwc2_hreg1_read' function processes value '[0..191]'. Inspect the third argument. Check lines: 927, 1053.hcd-dwc2.c 927
V557 Array overrun is possible. The 'dwc2_pcgreg_read' function processes value '[0..127]'. Inspect the third argument. Check lines: 1012, 1060.hcd-dwc2.c 1012

Warning N5

V575 The 'strerror_s' function processes '0' elements. Inspect the second argument. commands-win32.c 1642

void qmp_guest_set_time(bool has_time, int64_t time_ns, 
                        Error **errp)
{
  ....
  if (GetLastError() != 0) {
    strerror_s((LPTSTR) & msg_buffer, 0, errno);
    ....
  }
}

The strerror_s function returns a textual description of the system error code. Its signature looks like this:

errno_t strerror_s( char *buf, rsize_t bufsz, errno_t errnum );

The first parameter is a pointer to the buffer where the text description will be copied, the second parameter is the buffer size, the third is the error code. In the code, 0 is passed as the size of the buffer, this is clearly an erroneous value. By the way, it is possible to know in advance how many bytes should be allocated: you just need to call strerrorlen_s , which returns the length of the text description of the error. This value can be used to allocate a buffer of sufficient size.

Warning N6

V595 The 'blen2p' pointer was utilized before it was verified against nullptr. Check lines: 103, 106.dsound_template.h 103

static int glue (
    ....
    DWORD *blen1p,
    DWORD *blen2p,
    int entire,
    dsound *s
    )
{
  ....
  dolog("DirectSound returned misaligned buffer %ld %ld\n",
        *blen1p, *blen2p);                         // <= 1
  glue(.... p2p ? *p2p : NULL, *blen1p,
                            blen2p ? *blen2p : 0); // <= 2
....
}

In this code, the value of the blen2p argument is first used (comment 1) and then checked for nullptr (comment 2). This highly suspicious place looks like you just forgot to insert the check before first use (comment 1). As a fix, just add a check:

dolog("DirectSound returned misaligned buffer %ld %ld\n",
      *blen1p, blen2p ? *blen2p : 0);

There is still a question about the blen1p argument . Probably, it can also be a null pointer, and here you will also need to add a check. Several more similar positives:

V595 The 'ref' pointer was utilized before it was verified against nullptr. Check lines: 2191, 2193.uri.c 2191
V595 The 'cmdline' pointer was utilized before it was verified against nullptr. Check lines: 420, 425.qemu-io.c 420
V595 The 'dp' pointer was utilized before it was verified against nullptr. Check lines: 288, 294. onenand.c 288
V595 The 'omap_lcd' pointer was utilized before it was verified against nullptr. Check lines: 81, 87. omap_lcdc.c 81

Warning N7

V597 The compiler could delete the 'memset' function call, which is used to flush 'op_info' object. The RtlSecureZeroMemory () function should be used to erase the private data. virtio-crypto.c 354

static void virtio_crypto_free_request(VirtIOCryptoReq *req)
{
  if (req) {
    if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
      ....
      /* Zeroize and free request data structure */
      memset(op_info, 0, sizeof(*op_info) + max_len); // <= 1
      g_free(op_info);
    }
    g_free(req);
  }
}

In this code fragment, the memset function is called for the op_info object (comment 1), after which the op_info is immediately deleted, that is, in other words, after cleaning this object is not modified anywhere else. This is exactly the case when the compiler can remove the memset call during the optimization process . To eliminate this potential behavior, you can use special functions that the compiler never removes. See also the article " Clearing Private Data Safely ".

N8

V610 Unspecified behavior warning . Check the shift operator '>>'. The left operand is negative ('number' = [-32768..2147483647]). cris.c 2111

static void
print_with_operands (const struct cris_opcode *opcodep,
         unsigned int insn,
         unsigned char *buffer,
         bfd_vma addr,
         disassemble_info *info,
         const struct cris_opcode *prefix_opcodep,
         unsigned int prefix_insn,
         unsigned char *prefix_buffer,
         bfd_boolean with_reg_prefix)
{
  ....
  int32_t number;
  ....
  if (signedp && number > 127)
    number -= 256;            // <= 1
  ....
  if (signedp && number > 32767)
    number -= 65536;          // <= 2
  ....
  unsigned int highbyte = (number >> 24) & 0xff;
  ....
}

Since the variable number can be negative, the bitwise right shift is unspecified behavior. To make sure that the variable in question can take a negative value, see comments 1 and 2. To eliminate the differences in the behavior of your code on different platforms, such cases should be avoided.

More warnings:

V610 Undefined behavior. Check the shift operator '<<'. The left operand is negative ('(hclk_div - 1)' = [-1..15]). aspeed_smc.c 1041
V610 Undefined behavior. Check the shift operator '<<'. The left operand '(target_long) - 1' is negative. exec-vary.c 99
V610 Undefined behavior. Check the shift operator '<<'. The left operand is negative ('hex2nib (words [3] [i * 2 + 2])' = [-1..15]). qtest.c 561

There are also several warnings of the same type, only -1 is used as the left operand .

V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. hppa.c 2702

int print_insn_hppa (bfd_vma memaddr, disassemble_info *info)
{
  ....
  disp = (-1 << 10) | imm10;
  ....
}

Other similar warnings:

V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. hppa.c 2718
V610 Undefined behavior. Check the shift operator '<<'. The left operand '-0x8000' is negative. fmopl.c 1022
V610 Undefined behavior. Check the shift operator '<<'. The left operand '(intptr_t) - 1' is negative. sve_helper.c 889

Warning N9

V616 The 'TIMER_NONE' named constant with the value of 0 is used in the bitwise operation. sys_helper.c 179

#define HELPER(name) ....

enum {
  TIMER_NONE = (0 << 30),        // <= 1
  ....
}

void HELPER(mtspr)(CPUOpenRISCState *env, ....)
{
  ....
  if (env->ttmr & TIMER_NONE) {  // <= 2
    ....
  }
}

You can easily verify that the value of the TIMER_NONE macro is zero (comment 1). Further, this macro is used in a bitwise operation, the result of which will always be 0. As a result, the body of the conditional statement if (env-> ttmr & TIMER_NONE) will never be executed.

Warning N10

V629 Consider inspecting the 'n << 9' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. qemu-img.c 1839

#define BDRV_SECTOR_BITS   9
static int coroutine_fn convert_co_read(ImgConvertState *s, 
                  int64_t sector_num, int nb_sectors, uint8_t *buf)
{
  uint64_t single_read_until = 0;
  int n;
  ....
  while (nb_sectors > 0) {
    ....
    uint64_t offset;
    ....
    single_read_until = offset + (n << BDRV_SECTOR_BITS);
    ....
  }
  ....
}

In this code fragment , a shift operation is performed on the variable n , which has a 32-bit signed type, then this 32-bit signed result is expanded to a 64-bit signed type, and then, as an unsigned type, is added to the unsigned 64-bit variable offset . Suppose that at the time the expression is executed, the variable n has some significant most significant 9 bits. We are performing a 9-bit shift operation ( BDRV_SECTOR_BITS), and this, in turn, is undefined behavior, then as a result we can get the set bit in the most significant bit. Recall that this bit in the signed type is responsible for the sign, that is, the result can become negative. Since n is a signed variable, the sign will be taken into account when expanding. The result is then added to the offset variable . From these considerations, it is easy to see that the result of executing the expression may differ from the intended one. One of the possible solutions is to replace the type of the variable n with a 64-bit unsigned type, that is, with uint64_t .

Here are some more similar triggers:

V629 Consider inspecting the '1 << refcount_order' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. qcow2.c 3204
V629 Consider inspecting the 's->cluster_size << 3' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. qcow2-bitmap.c 283
V629 Consider inspecting the 'i << s->cluster_bits' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. qcow2-cluster.c 983
V629 Consider inspecting the expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. vhdx.c 1145
V629 Consider inspecting the 'delta << 2' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. mips.c 4341

Warning N11

V634 The priority of the '*' operation is higher than that of the '<<' operation. It's possible that parentheses should be used in the expression. nand.c 310

static void nand_command(NANDFlashState *s)
{
  ....
  s->addr &= (1ull << s->addrlen * 8) - 1;
  ....
}

It's just a suspicious place. It is not clear what the programmer wanted to do in the beginning: shift or multiplication. Even if there is no mistake here, you still need to look at the code again and place the brackets correctly. This is just one of those places that developers should look at to make sure their algorithm is correct. Other such places:

V634 The priority of the '*' operation is higher than that of the '<<' operation. It's possible that parentheses should be used in the expression. exynos4210_mct.c 449
V634 The priority of the '*' operation is higher than that of the '<<' operation. It's possible that parentheses should be used in the expression. exynos4210_mct.c 1235
V634 The priority of the '*' operation is higher than that of the '<<' operation. It's possible that parentheses should be used in the expression. exynos4210_mct.c 1264

Warning N12

V646 Consider inspecting the application's logic. It's possible that 'else' keyword is missing. pl181.c 400

static void pl181_write(void *opaque, hwaddr offset,
                        uint64_t value, unsigned size)
{
  ....
  if (s->cmd & PL181_CMD_ENABLE) {
    if (s->cmd & PL181_CMD_INTERRUPT) {
      ....
    } if (s->cmd & PL181_CMD_PENDING) { // <= else if
      ....
    } else {
      ....
    }
    ....
  }
  ....
}

In this code, judging by the formatting, the use of else if instead of if directly suggests itself . Perhaps they forgot to add else here . Then the correction option could be like this:

} else if (s->cmd & PL181_CMD_PENDING) { // <= else if

However, there is a chance that everything is in order with this code, and there is incorrect formatting of the program text, which is confusing. Then the code might look like this:

if (s->cmd & PL181_CMD_INTERRUPT) {
  ....
}
if (s->cmd & PL181_CMD_PENDING) { // <= if
  ....
} else {
  ....
}

Warning N13

V773 The function was exited without releasing the 'rule' pointer. A memory leak is possible. blkdebug.c 218

static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
{
  ....
  struct BlkdebugRule *rule;
  ....
  rule = g_malloc0(sizeof(*rule));                   // <= 1
  ....
  if (local_error) {
    error_propagate(errp, local_error);
    return -1;                                       // <= 2
  }
  ....
  /* Add the rule */
  QLIST_INSERT_HEAD(&s->rules[event], rule, next);   // <= 3
  ....
}

In this code, the rule object (comment 1) is selected and added to the list for later use (comment 3), but in case of an error, the function returns without deleting the previously created rule object (comment 2). Here you just need to correctly handle the error: delete the previously created object, otherwise there will be a memory leak.

Warning N14

V781 The value of the 'ix' index is checked after it was used. Perhaps there is a mistake in program logic. uri.c 2110

char *uri_resolve_relative(const char *uri, const char *base)
{
  ....
  ix = pos;
  if ((ref->path[ix] == '/') && (ix > 0)) {
  ....
}

Here the analyzer has detected a potential out of bounds array. First, the element of the ref-> path array at index ix is read , and then ix is checked for correctness ( ix> 0 ). The correct solution here is to reverse these actions:

if ((ix > 0) && (ref->path[ix] == '/')) {

There were several such places:

V781 The value of the 'ix' index is checked after it was used. Perhaps there is a mistake in program logic. uri.c 2112
V781 The value of the 'offset' index is checked after it was used. Perhaps there is a mistake in program logic. keymaps.c 125
V781 The value of the 'quality' variable is checked after it was used. Perhaps there is a mistake in program logic. Check lines: 326, 335.vnc-enc-tight.c 326
V781 The value of the 'i' index is checked after it was used. Perhaps there is a mistake in program logic. mem_helper.c 1929

Warning N15

V784 The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. cadence_gem.c 1486

typedef struct CadenceGEMState {
  ....
  uint32_t regs_ro[CADENCE_GEM_MAXREG];
}
....
static void gem_write(void *opaque, hwaddr offset, uint64_t val,
        unsigned size)
{
  ....
  val &= ~(s->regs_ro[offset]);
  ....
}

This code performs a bitwise operation on objects of different types. The left operand is the argument val , which is a 64-bit unsigned type. The received value of the array element s-> regs_ro at the offset index , which has a 32-bit unsigned type, is used as the right operand . The result of the operation on the right-hand side (~ (s-> regs_ro [offset])) is a 32-bit unsigned type, and before bitwise multiplication it will expand to 64-bit type with zeros, that is, after evaluating the entire expression, all high-order bits of the val variable will be zeroed . Such places always look suspicious. Here we can only recommend the developers to revise this code again. More similar:

V784 The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. xlnx-zynq-devcfg.c 199
V784 The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. soc_dma.c 214
V784 The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. fpu_helper.c 418

Warning N16

V1046 Unsafe usage of the 'bool' and 'unsigned int' types together in the operation '& ='. helper.c 10821

static inline uint32_t extract32(uint32_t value, int start, int length);
....
static ARMVAParameters aa32_va_parameters(CPUARMState *env, uint32_t va,
                                          ARMMMUIdx mmu_idx)
{
  ....
  bool epd, hpd;
  ....
  hpd &= extract32(tcr, 6, 1);
}

In this code fragment, a bitwise AND operation is performed on the hpd variable of type bool and the result of executing the extract32 function , which has the type uint32_t . Since the bit value of a boolean variable can only be 0 or 1, the result of the expression will always be false if the least significant bit returned by the extract32 function is zero. Let's look at this with an example. Suppose that hpd is true and the function returns 2, that is, in binary representation, the operation will look like 01 & 10 = 0, and the result of the expression will be false . Most likely, the programmer wanted to set the value to trueif the function returns something nonzero. Apparently, the code needs to be corrected so that the result of the function is cast to the bool type , for example, like this:

hpd = hpd && (bool)extract32(tcr, 6, 1);

Conclusion

As you can see, the analyzer found many suspicious places. Perhaps the potential problems found have not yet manifested themselves, but their presence cannot but be alarming, since they are capable of firing at the most unexpected moment. Viewing all suspicious places in advance and fixing them is better than fixing an endless stream of bugs. Obviously, for complex projects like this, static analysis can bring tangible benefits, especially if you organize a regular project review. If you want to try PVS-Studio for your project, you can download the analyzer and get a free trial key on this page.

If you want to share this article with an English-speaking audience, please use the translation link: Evgeniy Ovsannikov. Checking QEMU using PVS-Studio .

Checking QEMU with PVS-Studio

About analysis

Conclusion

More articles: