Continuation: disappointing for opinions about static code analyzers

image1.png


It was planned that having written the article "It's a shame for opinions about static code analyzers", we will speak out and calmly let go of the topic. But unexpectedly, this article caused a stormy response. Unfortunately, the discussion went wrong, and now we will make a second attempt to explain our vision of the situation.



Anecdote-analogy



So, it all started with the article " It's a shame for opinions about static code analyzers ". She began to be actively discussed on some resources and this discussion is very similar to the following old anecdote.
We bought a Japanese chainsaw for some harsh Siberian lumberjacks.

The lumberjacks gathered in a circle and decided to test it.

They brought her in, gave her a tree.

"Zipper," said the Japanese saw.

"Oh, fuck ..." - said the lumberjacks.

They slipped her a thicker tree. "Vzh-zh-zhik!" - said the saw.

"Wow, shit!" - said the lumberjacks.

They slipped a thick cedar into her. "VZH-ZH-ZH-ZH-ZH-ZH-ZHIK !!!" - said the saw.

"Wow, fuck !!" - said the lumberjacks.

They slipped her a scrap of iron. "CRACK!" - said the saw.

"Yeah, fuck !!!" - Stern Siberian lumberjacks said reproachfully! And they left to cut the forest with axes ...
One to one story. People looked at the code:



if (A[0] == 0)
{
  X = Y;
  if (A[0] == 0)
    ....
}


And they began to invent situations when it might be justified, which means that the warning of the PVS-Studio analyzer is false-positive. The reasoning went into the course about the change in memory between two checks, arising from:



  • work of parallel streams;
  • signal / interrupt handlers;
  • variable X is a reference to element A [0] ;
  • hardware, such as performing DMA operations;
  • etc.


And having discussed that not all situations the analyzer can understand, they left to chop the forest with axes. That is, they found an excuse why it is possible to continue not to use a static code analyzer in their work.



Our vision of the situation



This approach is counterproductive. An imperfect tool may well be useful, and its use is economically viable.



Yes, any static analyzer generates false positives. And nothing can be done about it. However, this misfortune is greatly exaggerated. In practice, static analyzers can be configured and used in various ways to suppress and work with false positives (see 1 , 2 , 3 , 4 ). Plus, here it is appropriate to recall the article " False positives are our enemies, but may still be your friends ".



However, even this is not the main thing. It makes no sense to consider special cases of exotic code at all!Can you confuse the analyzer with a complex code? Yes, you can. However, for one such case, there will be hundreds of useful analyzer triggers. Many mistakes can be found and corrected at a very early stage. And one or two false alarms can be safely suppressed and no longer pay attention to them.



And again PVS-Studio is right



Here the article could be finished. However, some may consider the previous section not rational considerations, but attempts to hide the weaknesses and shortcomings of the PVS-Studio tool. So you have to continue.



Consider concrete compiled code that includes variable declarations:



void SetSynchronizeVar(int *);

int foo()
{
    int flag = 0;
    SetSynchronizeVar(&flag);

    int X, Y = 1;

    if (flag == 0)
    {
        X = Y;
        if (flag == 0)
            return 1;
    }
    return 2;
}


The PVS-Studio analyzer reasonably issues a warning: V547 Expression 'flag == 0' is always true.



And the analyzer is absolutely right. If someone starts ranting that a variable can change in another thread, in a signal handler, and so on, then he simply does not understand the C and C ++ languages. You can't write that way.



For optimization purposes, the compiler has the right to throw out the second check and will be absolutely right. From a language point of view, a variable cannot change. Its background change is nothing more than Undefined behavior.



For the check to remain in place, the variable must be declared volatile :



void SetSynchronizeVar(volatile int *);

int foo()
{
    volatile int flag = 0;
    SetSynchronizeVar(&flag);
    ....
}


The PVS-Studio analyzer knows about this and no longer issues a warning for such a code .



Here we go back to what was discussed in the first article . There is no problem. But there is criticism or misunderstanding why the analyzer has the right to issue a warning.



Note for the most meticulous readers



Some readers may return to the synthetic example from the first article:



char get();
int foo(char *p, bool arg)
{
    if (p[1] == 1)
    {
        if (arg)
            p[0] = get();
        if (p[1] == 1)          // Warning
            return 1;
    }
    // ....
    return 3;
}


And add volatile :



char get();
int foo(volatile char *p, bool arg)
{
    if (p[1] == 1)
    {
        if (arg)
            p[0] = get();
        if (p[1] == 1)          // Warning :-(
            return 1;
    }
    // ....
    return 3;
}


After that, it is fair to say that the analyzer still issues the warning V547 Expression 'p [1] == 1' is always true.



Hooray, it was finally shown that the analyzer is still wrong :). This is a false positive!



As you can see, we are not hiding any flaws. When parsing the data stream for array elements, this ill-fated volatile got lost. The flaw has already been found and fixed. The fix will be available in the next version of the analyzer. There will be no false positives.



Why wasn't this bug identified earlier? Because in fact, this is again unreal code that is not found in real projects. Actually, we haven’t come across such code so far, although we have checked many open source projects .



Why is the code unrealistic? First, in practice, there will be some kind of timing or delay function between the two checks. Secondly, no one in their right mind, unless absolutely necessary, creates arrays consisting of volatile elements. Working with such an array is a colossal drop in performance.



Let's summarize. It is easy to create examples where the parser fails. But from a practical point of view, the identified flaws practically do not affect the quality of the code analysis and the number of real errors detected. After all, the code of real applications is just code that is understandable by the analyzer and the person at the same time, and not a rebus or puzzle. If the code is a puzzle, then there is no time for analyzers :).



Thank you for attention.





Additional links









If you want to share this article with an English-speaking audience, please use the translation link: Andrey Karpov. Part 2: Upsetting Opinions about Static Analyzers .



All Articles