How I cut GTA Online load times by 70%

GTA Online. A multiplayer game notorious for slow loading. I recently came back to complete some heists - and I was shocked that it loads just as slowly as it did on the day it was released, 7 years ago.



It's time to get to the bottom of it.



Reconnaissance



First I wanted to check whether someone had already solved the problem. But I only found stories about how the game is so complex that it has to load this slowly, stories about how the p2p network architecture is garbage (although it is not), some elaborate ways of loading into story mode first and then into a solo session, and a couple of mods that skip the R* logo video at startup. After reading the forums a little more, I found out that you can save a whopping 10-30 seconds if you use all of these methods together!



Meanwhile, on my computer ...



Benchmark



Story mode load time: ~1m 10s
Online load time: ~6m
Startup menu disabled; measured from the R* logo to gameplay (Social Club login time not counted).

Old but decent CPU: AMD FX-8350
Cheap SSD: KINGSTON SA400S37120G
Gotta have RAM: 2x Kingston 8192 MB (DDR3-1337) 99U5471
Decent GPU: NVIDIA GeForce GTX 1070


I know my hardware is out of date, but what the heck could make loading 6x slower in online mode? I could not measure any difference when loading from story mode into online, as others have done. Even if it works, the difference is small.



I'm not alone



According to this survey, the problem is widespread enough to be at least mildly annoying for over 80% of players. Seven years have passed!







I did a little digging into the ~20% of lucky players who load in less than three minutes, and found several benchmarks on top-end gaming PCs with an online load time of about two minutes. I would kill (well, hack) for a computer like that! It really does look like a hardware problem, but something doesn't add up ...



Why does their story mode still take about a minute to load? (By the way, the benchmark that booted from an M.2 NVMe drive did not count the logo videos.) In addition, it takes them only about a minute to load from story mode into online, while it takes me about five. I know that their hardware is much better, but not five times better.



High precision measurements



Armed with a powerful tool like Task Manager, I set out to find the bottleneck.







It takes almost a minute to load the shared resources that are needed for both story mode and online (almost on a par with top-end PCs), and then GTA maxes out a single CPU core for four minutes while doing nothing else.



Disk usage? None! Network usage? There is a little, but after a few seconds it drops to practically zero (apart from loading the rotating info banners). GPU usage? Zero. Memory? Nothing happening at all ...



What is it, Bitcoin mining or something? I can smell the code here. Very bad code.



Single thread



My old AMD processor has eight cores, and it's still great, but it's an old model. It was made back when AMD's single thread performance was much lower than Intel's. This is probably the main reason for such differences in load times.



What's strange is that only the CPU is being used. I was expecting a huge amount of disk reads or a ton of network requests to negotiate sessions in the p2p network. But this? This is probably a bug.



Profiling



A profiler is a great way to find CPU bottlenecks. There is only one problem: most of them rely on instrumenting the source code to get a perfect picture of what is happening in the process. And I don't have the source code. I also don't need microsecond-perfect readings; I have a 4-minute bottleneck.



So, welcome to stack sampling. For closed-source applications, this is the only option. Dump the running process's stack and the current instruction pointer location at set intervals to build a call tree. Then overlay the samples to get statistics on what is going on. I only know of one profiler that can do this on Windows, and it hasn't been updated in over ten years. It's Luke Stackwalker! Someone please give Luke some love :)







Normally Luke would group the same functions, but I have no debug symbols, so I had to look at nearby addresses to look for common places. And what do we see? Not one, but two bottlenecks!



Down the rabbit hole



After borrowing a perfectly legitimate copy of the industry-standard disassembler from a friend of mine (no, I really can't afford it ... someday I'll learn Ghidra), I went to take GTA apart.







That looks completely wrong. Yes, most top games have built-in protection against reverse engineering to keep them safe from pirates, cheaters and modders. Not that it ever stopped them ...



It looks like some kind of obfuscation / encryption has been applied here, replacing most of the instructions with gibberish. Don't worry, we just need to dump the game's memory while it is executing the part we want to look at. The instructions have to be deobfuscated one way or another before they run. I had Process Dump lying around, so I used it, but there are many other tools for this kind of task.



Problem 1: is that ... strlen?!



Further analysis of the dump revealed that one of the addresses has a label that comes from somewhere: strlen! Going down the call stack, the previous address is labeled vscan_fn, and after that the labels end, although I'm pretty sure it is sscanf.



How could we do without a graph?



It's parsing something. But what? Untangling the disassembly would take ages, so I decided to dump some samples from the running process using x64dbg. After some debug stepping, it turns out that this is ... JSON! It parses JSON. A whopping ten megabytes of JSON with some 63,000 items.



...,
{
    "key": "WP_WCT_TINT_21_t2_v9_n2",
    "price": 45000,
    "statName": "CHAR_KIT_FM_PURCHASE20",
    "storageType": "BITFIELD",
    "bitShift": 7,
    "bitSize": 1,
    "category": ["CATEGORY_WEAPON_MOD"]
},
...
      
      





What is it? Judging by some of the references, this is the data for the "net shop catalog". I assume it contains a list of all the possible items and upgrades that you can buy in GTA Online.



To clear up some confusion: I believe these are items bought with in-game money, not directly related to microtransactions.



10 megabytes? That's not much, in principle. And although sscanf is not being used in the most optimal way, surely it can't be that bad? Well ...







Yes, a procedure like that will take some time ... To be honest, I had no idea that most sscanf implementations call strlen, so I can't really blame the developer who wrote this. I would have assumed it just scans byte by byte and can stop at a NUL terminator.



Problem 2: let's use a hash ... array?



It turns out that the second culprit is called right after the first one. Even in the same if statement, as you can see from this ugly decompilation:







All the labels are mine and I have no idea what the functions / parameters are actually called.



The second problem? Immediately after an item is parsed, it is stored in an array (or an inlined C++ list? Not sure). Each entry looks something like this:



struct {
    uint64_t *hash;
    item_t   *item;
} entry;
      
      





But before storing it? It checks the entire array, comparing the hash of each existing element to see whether the item is already in the list. With 63 thousand entries, this is approximately (n^2+n)/2 = (63000^2+63000)/2 = 1984531500 checks, if I'm not mistaken in my calculations. And these are mostly useless checks. You have unique hashes, so why not use a hash table?







During reverse engineering, I named it hashmap, but it's obviously not_a_hashmap. And then it gets even more interesting. This hash-array-list is empty before loading the JSON. And all elements in the JSON are unique! They don't even need to check whether an item is in the list or not! They even have a function for inserting an element directly! Just use it! Seriously, guys, what the fuck!?



Proof of concept



All of this is great, but no one will take me seriously until I write actual code that speeds up loading, so that I can put a clickbait title on the post.



The plan is as follows. 1. Write a .dll, 2. inject it into GTA, 3. hook some functions, 4. ???, 5. profit. Everything is extremely simple.



The JSON problem is non-trivial: I can't realistically replace their parser. It seems more realistic to replace sscanf with a version that does not depend on strlen. But there is an even easier way.



  • hook strlen

  • wait for a long string

  • "Cache" start and length

  • if another call comes within the range of the string, return the cached value


Something like this:



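// MH_DisableHook is part of MinHook; strlen_addr and builtin_strlen
// are assumed to be defined elsewhere in the PoC.
size_t strlen_cacher(char* str)
{
  static char* start;
  static char* end;
  size_t len;
  const size_t cap = 20000;

  // if we have a "cached" string and the pointer is within it
  if (start && str >= start && str <= end) {
    // calculate the new strlen
    len = end - str;

    // if we are near the end of the string, disable the hook
    // so that we don't break anything else
    if (len < cap / 2)
      MH_DisableHook((LPVOID)strlen_addr);

    // fast return!
    return len;
  }

  // measure the real length:
  // we need at least one measurement of the big JSON,
  // and a normal strlen for every other string
  len = builtin_strlen(str);

  // if it was a really long string,
  // remember it as our cache
  if (len > cap) {
    start = str;
    end = str + len;
  }

  // slow, boring return
  return len;
}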
      
      







As for the hash array problem, we just skip all the checks entirely and insert the elements directly, since we know the values are unique.



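char __fastcall netcat_insert_dedupe_hooked(uint64_t catalog, uint64_t* key, uint64_t* item)
{
  // the structure was not fully reversed; the list sits at offset 88
  uint64_t not_a_hashmap = catalog + 88;

  // call the function at offset 48 in the item's vtable, like the
  // original does, and bail out if it returns 0
  if (!(*(uint8_t(__fastcall**)(uint64_t*))(*item + 48))(item))
    return 0;

  // insert directly, without the duplicate check
  netcat_insert_direct(not_a_hashmap, key, &item);

  // when this specific hash is seen, remove the hook
  // and unload the .dll, we are done here :)
  if (*key == 0x7FFFD6BE) {
    MH_DisableHook((LPVOID)netcat_insert_dedupe_addr);
    unload();
  }

  return 1;
}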
      
      





The complete PoC source code is here.



Results



So how does it work?



Original online load time: about 6m
Time with only the duplicate check patch: 4m 30s
Time with only the JSON parser patch: 2m 50s
Time with both patches together: 1m 50s

(6 * 60 - (1 * 60 + 50)) / (6 * 60) = 69.4% improvement in load time (nice!)


Yes, damn it, it worked! :))



This probably won't fix everyone's load times - there may be other bottlenecks on different systems, but it's such a gaping hole that I have no idea how R* missed it over the years.



Summary



  • There is a single-threaded CPU bottleneck when starting up GTA Online

  • It turns out GTA struggles to parse a 10MB JSON file

  • The JSON parser itself is poorly built / naive

  • After parsing, there is a slow procedure for removing duplicates


R*, please fix



If this somehow reaches Rockstar's engineers: the problem can be solved within a few hours by a single developer. Please, guys, do something about this :<



You could either switch to a hash table for the deduplication, or skip deduplication on startup entirely as a quick fix. For the JSON parser, just swap the library for a more performant one. I don't think there is an easier way out.



ty <3


