Comparing C# and C++ Performance in Image Processing Tasks

There is an opinion that C# has no place in computational tasks, and it is a reasonable one: the JIT compiler has to compile and optimize code on the fly, during program execution and with minimal delay. It simply cannot afford to spend more computational resources on generating more efficient code, unlike a C++ compiler, which can take minutes or even hours to do so.





However, in recent years the JIT compiler has become noticeably more efficient, and a number of useful features have been added to the framework itself, for example intrinsics.





So I wondered: is it possible in 2020, using .NET 5.0, to write code whose performance is not much worse than that of C++? It turns out that it is.





Motivation

I develop image processing algorithms, and at a fairly low level. That is, not assembling ready-made building blocks in Python, but developing something new and, preferably, fast. Python code takes unacceptably long to run, while using C++ slows development down. For such tasks, the best balance between productivity and performance is achieved with C# and Java. The Fiji project supports this claim.





Previously, I used C# for prototyping and rewrote the finished, performance-critical algorithms in C++, put them into a native library, and called that library from C#. But portability suffered, and the code was inconvenient to debug.





But that was a long time ago. .NET has come a long way since then, and I wondered whether I could abandon the native C++ library and switch entirely to C#.





Scenario

I will compare the languages on basic image processing methods: image sum, rotation, convolution, and median filtering. These are the methods that most often have to be written in C++. The running time of convolution is especially critical.
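To make the cost of convolution concrete, here is a minimal C++ sketch of a naive convolution. It is not the author's implementation: the `Image` struct, the `convolve_naive` name, and the zero-filled border are assumptions made for illustration. Each output pixel costs k*k multiply-adds (49 for a 7x7 kernel), which is why this method tends to dominate the running time.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal image type: row-major float pixels.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0.0f) {}

    float& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
    float at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Naive convolution with a square kernel of odd size k (k*k row-major weights).
// The kernel is applied without flipping (correlation form), which gives the
// same result as convolution for symmetric kernels.
// Border pixels where the kernel does not fit are left at zero.
Image convolve_naive(const Image& src, const std::vector<float>& kernel, int k) {
    Image dst(src.width, src.height);
    const int r = k / 2;
    for (int y = r; y < src.height - r; ++y) {
        for (int x = r; x < src.width - r; ++x) {
            float acc = 0.0f;
            for (int ky = 0; ky < k; ++ky)
                for (int kx = 0; kx < k; ++kx)
                    acc += src.at(x + kx - r, y + ky - r) *
                           kernel[static_cast<std::size_t>(ky) * k + kx];
            dst.at(x, y) = acc;
        }
    }
    return dst;
}
```

Every pixel access here goes through `at(x, y)`, which recomputes the index each time; that is the kind of overhead a pointer-based version avoids.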





For each method except median filtering, three implementations were made in both C# and C++:





  • Naive implementation using methods like GetPixel(x, y) and SetPixel(x, y, value);





  • Optimized implementation that works with raw pointers at a low level;





  • Implementation based on intrinsics (AVX).





For median filtering, a single implementation based on the standard sort (Array.Sort, std::sort) was made.
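A sort-based median filter of the kind that the mention of std::sort suggests might look like the following C++ sketch. The function name `median3x3`, the flat row-major `std::vector<float>` layout, and the choice to copy border pixels unchanged are all assumptions for illustration, not the code that was benchmarked.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Naive 3x3 median filter over a row-major float image (hypothetical layout).
// Border pixels are copied from the source unchanged.
std::vector<float> median3x3(const std::vector<float>& src, int width, int height) {
    std::vector<float> dst(src);
    std::array<float, 9> window{};
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            // Gather the 3x3 neighbourhood around (x, y).
            std::size_t n = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    window[n++] = src[static_cast<std::size_t>(y + dy) * width + (x + dx)];
            // Sort the window and take the middle of the 9 values.
            std::sort(window.begin(), window.end());
            dst[static_cast<std::size_t>(y) * width + x] = window[4];
        }
    }
    return dst;
}
```

A full sort per pixel does more work than finding the median strictly requires (`std::nth_element` would be enough), but it keeps the sketch close to the standard-sort approach named above.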





, , C# unmanaged - . - , C++ UB , C# - .





The code is available on GitHub. As an example, here is the image sum in C#:





[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public static void Sum_ThisProperty(NativeImage<float> img1, NativeImage<float> img2, NativeImage<float> res)
{
    for (var j = 0; j < res.Height; j++)
    for (var i = 0; i < res.Width; i++)
        res[i, j] = img1[i, j] + img2[i, j];
}

[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public static void Sum_Optimized(NativeImage<float> img1, NativeImage<float> img2, NativeImage<float> res)
{
    var w = res.Width;

    for (var j = 0; j < res.Height; j++)
    {
        var p1 = img1.PixelAddr(0, j);
        var p2 = img2.PixelAddr(0, j);
        var r = res.PixelAddr(0, j);

        for (var i = 0; i < w; i++)
            r[i] = p1[i] + p2[i];
    }
}

[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public static void Sum_Avx(NativeImage<float> img1, NativeImage<float> img2, NativeImage<float> res)
{
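    // Largest multiple of 8 not exceeding the width: a 256-bit AVX register holds 8 floats.
    var w8 = res.Width / 8 * 8;

    for (var j = 0; j < res.Height; j++)
    {
        var p1 = img1.PixelAddr(0, j);
        var p2 = img2.PixelAddr(0, j);
        var r = res.PixelAddr(0, j);

        // Vectorized part: 8 pixels per iteration.
        // The aligned load/store require every row to start at a 32-byte-aligned address.
        for (var i = 0; i < w8; i += 8)
        {
            Avx.StoreAligned(r, Avx.Add(Avx.LoadAlignedVector256(p1), Avx.LoadAlignedVector256(p2)));

            p1 += 8;
            p2 += 8;
            r += 8;
        }

        // Scalar tail: the remaining Width % 8 pixels.
        for (var i = w8; i < res.Width; i++)
            *r++ = *p1++ + *p2++;
    }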
    }
}
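For comparison, a C++ counterpart of the first two C# variants above could be sketched as follows. The `Image` struct is a hypothetical stand-in for `NativeImage<float>`, and `sum_naive` / `sum_optimized` are illustrative names, so this should be read as an approximation of the benchmarked code rather than a copy of it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the article's NativeImage<float>.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    Image(int w, int h) : width(w), height(h), data(static_cast<std::size_t>(w) * h) {}

    float& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
    float at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }

    float* row(int y) { return data.data() + static_cast<std::size_t>(y) * width; }
    const float* row(int y) const { return data.data() + static_cast<std::size_t>(y) * width; }
};

// Naive variant: every pixel goes through the (x, y) accessor.
void sum_naive(const Image& a, const Image& b, Image& res) {
    for (int j = 0; j < res.height; ++j)
        for (int i = 0; i < res.width; ++i)
            res.at(i, j) = a.at(i, j) + b.at(i, j);
}

// Pointer variant: the row address is computed once per row,
// and the inner loop only indexes into the row.
void sum_optimized(const Image& a, const Image& b, Image& res) {
    const int w = res.width;
    for (int j = 0; j < res.height; ++j) {
        const float* p1 = a.row(j);
        const float* p2 = b.row(j);
        float* r = res.row(j);
        for (int i = 0; i < w; ++i)
            r[i] = p1[i] + p2[i];
    }
}
```

With optimizations enabled, a C++ compiler may inline the accessor and vectorize both loops on its own, so the two variants can end up with very similar machine code.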

      
      







The results are given below. Times (in units of 1/10… s) are for a 256x256 image of 32-bit floats.









Method               | dotnet build -c Release | g++ 10.2.0 -O0 | g++ 10.2.0 -O1 | g++ 10.2.0 -O2 | g++ 10.2.0 -O3 | clang 11.0.0 -O2 | clang 11.0.0 -O3
Sum (naive)          | 115.8                   | 757.6          | 124.4          | 36.26          | 19.51          | 20.14            | 19.81
Sum (opt)            | 40.69                   | 255.6          | 36.07          | 24.48          | 19.60          | 20.11            | 19.81
Sum (avx)            | 21.15                   | 60.41          | 20.00          | 20.18          | 20.37          | 20.23            | 20.20
Rotate (naive)       | 90.29                   | 500.3          | 87.15          | 36.01          | 14.49          | 14.04            | 14.16
Rotate (opt)         | 34.99                   | 237.1          | 35.11          | 34.17          | 14.55          | 14.10            | 14.27
Rotate (avx)         | 14.83                   | 51.04          | 14.14          | 14.25          | 14.37          | 14.22            | 14.72
Median 3x3           | 4163                    | 26660          | 2930           | 1607           | 2508           | 2301             | 2330
Median 5x5           | 11550                   | 10090          | 8240           | 5554           | 5870           | 5610             | 6051
Median 7x7           | 23540                   | 24470          | 17540          | 13640          | 12620          | 12920            | 13510
Convolve 7x7 (naive) | 5519                    | 30900          | 3240           | 3694           | 2775           | 3047             | 2761
Convolve 7x7 (opt)   | 2913                    | 11780          | 2759           | 2628           | 2754           | 2434             | 2262
Convolve 7x7 (avx)   | 709.2                   | 3759           | 729.8          | 669.8          | 684.2          | 643.8            | 638.3
Convolve 7x7 (avx*)  | 505.6                   | 2984           | 523.4          | 511.5          | 507.8          | 443.2            | 443.3





: Convolve 7x7 (avx*) - , , .





Core i7-2600K @ 4.0 GHz.
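The article does not describe how the timings were collected. One common way to time a short kernel in C++ is to run it several times and keep the fastest run, as in the sketch below. The helper name `best_time_us`, the use of `std::chrono::steady_clock`, and the choice of microseconds are assumptions, not the author's benchmark harness.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Runs fn() `repeats` times and returns the fastest single run in microseconds.
// Taking the minimum reduces the influence of cache warm-up and scheduler noise.
template <typename Fn>
double best_time_us(Fn&& fn, int repeats) {
    using clock = std::chrono::steady_clock;
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; ++i) {
        const auto t0 = clock::now();
        fn();
        const auto t1 = clock::now();
        const double us = std::chrono::duration<double, std::micro>(t1 - t0).count();
        best = std::min(best, us);
    }
    return best;
}
```

When measuring a function whose result is never used, the compiler may remove the call entirely at higher optimization levels, so the measured code should write to memory that is read afterwards.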





:





  • (avx), C#, , C++. , C# !





  • C# , C# , C++ .





  • C# C++ 2 6 . .





Yes, you can write computational code in C# that has performance parity with C++. But to get there you have to optimize the code by hand: what the C++ compiler does automatically, you have to do yourself in C#. So if you are not tied to C#, keep writing in C++.





P.S. .NET has one killer feature: the ability to generate code at runtime. If the image processing pipeline is not known in advance (for example, it is defined by the user), then in C++ you have to assemble it from building blocks and possibly even use virtual functions, while in C# you can get better performance simply by generating a method.
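The C++ side of this trade-off can be illustrated with a small sketch of a pipeline assembled at run time from building blocks behind virtual functions. The `Stage`, `Scale`, `Offset`, and `run` names are hypothetical and chosen only for this example.

```cpp
#include <memory>
#include <vector>

// One building block of the pipeline: a per-pixel operation behind a virtual call.
struct Stage {
    virtual ~Stage() = default;
    virtual float apply(float v) const = 0;
};

struct Scale : Stage {
    float factor;
    explicit Scale(float f) : factor(f) {}
    float apply(float v) const override { return v * factor; }
};

struct Offset : Stage {
    float delta;
    explicit Offset(float d) : delta(d) {}
    float apply(float v) const override { return v + delta; }
};

// A pipeline whose composition is only known at run time.
using Pipeline = std::vector<std::unique_ptr<Stage>>;

// Applies every stage to every pixel, in order.
void run(const Pipeline& pipeline, std::vector<float>& pixels) {
    for (float& v : pixels)
        for (const auto& stage : pipeline)
            v = stage->apply(v);  // one indirect call per stage per pixel
}
```

A pipeline that scales by 2 and then adds 1 would be built with `pipeline.push_back(std::make_unique<Scale>(2.0f))` followed by `pipeline.push_back(std::make_unique<Offset>(1.0f))`. The indirect call inside the inner loop is what prevents the compiler from fusing and vectorizing the stages, and a method generated at runtime for the specific pipeline avoids exactly that.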











