Anyone can make a Rockstar mistake (and so can I.)



A few months ago, an amazing article [translations to Habré: one and second ] about Grand Theft Auto Online surfaced in the news .



I advise you to read the entire article, but in short, GTA Online suddenly had quadratic performance when parsing a large JSON blob (due to multiple calls strlen



); after fixing this error, loading time decreased by almost 70%.



This sparked a lively debate : Is C to blame for this? Or maybe "web shit" ? Or capitalism and its incentives ?



However, everyone agreed on one thing : they would never have written such nonsense.



( Can you already feel what is coming? )




One of my side projects is a high performance 3D model viewer called Erizo .





Thanks to clever code, it opens a 97MB STL binary on a 2013 Macbook Pro in just 165 milliseconds. This is amazing speed.



For compatibility reasons, I wrote a small parser for ASCII STL as well .



ASCII STL is a poorly specified plain text format that looks like this:



solid cube_corner
          facet normal 0.0 -1.0 0.0
            outer loop
              vertex 0.0 0.0 0.0
              vertex 1.0 0.0 0.0
              vertex 0.0 0.0 1.0
            endloop
          endfacet
          facet normal 0.0 0.0 -1.0
            outer loop
              vertex 0.0 0.0 0.0
              vertex 0.0 1.0 0.0
              vertex 1.0 0.0 0.0
            endloop
          endfacet
          ...
endsolid
      
      





I wrote an extremely robust parser with a comment like this:



/*     ASCII STL:  , 
 *   'vertex',         float. */
      
      





Loading the ASCII STL always seemed a little slow, but I assumed it was due to an inefficient text format.



(The clouds are gathering. )






Several events happened in a few days:





Here are the download logs of the 1.5MB ASCII STL timestamped (in seconds):



[erizo] (0.000000) main.c:10      | Startup!
[erizo] (0.162895) window.c:91    | Created window
[erizo] (0.162900) window.c:95    | Made context current
[erizo] (0.168715) window.c:103   | Initialized GLEW
[erizo] (0.178329) window.c:91    | Created window
[erizo] (0.178333) window.c:95    | Made context current
[erizo] (1.818734) loader.c:109   | Parsed ASCII STL
[erizo] (1.819471) loader.c:227   | Workers have deduplicated vertices
[erizo] (1.819480) loader.c:237   | Got 5146 vertices (7982 triangles)
[erizo] (1.819530) loader.c:240   | Waiting for buffer...
[erizo] (1.819624) loader.c:326   | Allocated buffer
[erizo] (1.819691) loader.c:253   | Sent buffers to worker threads
[erizo] (1.819883) loader.c:258   | Joined worker threads
[erizo] (1.819887) loader.c:279   | Loader thread done
[erizo] (1.821291) instance.c:32  | Showed window
      
      





It took more than 1.8 seconds from the start to display the window!



Taking a fresh look at the ASCII parser, I saw that the reason is obvious:



    /*  The most liberal ASCII STL parser:  Ignore everything except
     *  the word 'vertex', then read three floats after each one. */
    const char VERTEX_STR[] = "vertex ";
    while (1) {
        data = strstr(data, VERTEX_STR);
        if (!data) {
            break;
        }

        /* Skip to the first character after 'vertex' */
        data += strlen(VERTEX_STR);

        for (unsigned i=0; i < 3; ++i) {
            SKIP_WHILE(isspace);
            float f;
            const int r = sscanf(data, "%f", &f);
            ABORT_IF(r == 0 || r == EOF, "Failed to parse float");
            if (buf_size == buf_count) {
                buf_size *= 2;
                buffer = (float*)realloc(buffer, buf_size * sizeof(float));
            }
            buffer[buf_count++] = f;

            SKIP_WHILE(!isspace);
        }
    }
      
      





You can notice that in the code there is one sscanf



that reads one float value from the beginning of the data stream and each time checks the length of the entire string .



Yes, I made the same mistake as the programmers who worked on GTA Online: I suddenly wrote a quadratic parser!



Replacing a call sscanf



with a call strtof



reduced the loading time by almost 10 times: from 1.8 seconds to 199 milliseconds.



[erizo] (0.000000) main.c:10      | Startup!
[erizo] (0.178082) window.c:91    | Created window
[erizo] (0.178086) window.c:95    | Made context current
[erizo] (0.184226) window.c:103   | Initialized GLEW
[erizo] (0.194469) window.c:91    | Created window
[erizo] (0.194472) window.c:95    | Made context current
[erizo] (0.196126) loader.c:109   | Parsed ASCII STL
[erizo] (0.196866) loader.c:227   | Workers have deduplicated vertices
[erizo] (0.196871) loader.c:237   | Got 5146 vertices (7982 triangles)
[erizo] (0.196921) loader.c:240   | Waiting for buffer...
[erizo] (0.197013) loader.c:326   | Allocated buffer
[erizo] (0.197082) loader.c:253   | Sent buffers to worker threads
[erizo] (0.197303) loader.c:258   | Joined worker threads
[erizo] (0.197306) loader.c:279   | Loader thread done
[erizo] (0.199328) instance.c:32  | Showed window
      
      








It's a perfect reminder that even if you've been programming for many years, there are always traps . In sscanf



not filled its time complexity, so it is very cunning gun shot himself in the foot, and it seems to me that no one I wandered in the darkness of ignorance.



You may not come across a reminder like this yourself, but whenever you read an awesome story about bad code, remember - it can happen to you too!



(Obviously, the moral of the story is this: don't use sscanf



single tokens from the beginning of a line to pars repeatedly; I'm sure you'll be fine if you just avoid it.)






Advertising



VDSina offers powerful and affordable VPS with daily pay. The Internet channel for each server is 500 Megabits, protection against DDoS attacks is included in the tariff, the ability to install Windows, Linux or the OS in general from your image, and also a very convenient proprietary server control panel . Try it!






All Articles