Unicorn Developer

Posted on Jan 13

What's C++ like in gamedev?

#cpp #gamedev #programming

We invite you to read an article on how C++ is used in modern game development and why the industry is still not ready to move away from it. The author explores how C++ works at different levels of game engines and how performance requirements, legacy code, and platform constraints make an impact on the industry.

We published and translated this article with the copyright holder's permission. The author is Sergey Kushnirenko.

I intended to write a follow-up to the article "Useful reading for a game developer" about using C++ in game engines, but my thoughts wandered off in a different direction.

With the recent evolution of C++, its newer standards (C++20/23) will likely reach game development only after a significant delay: around five years, right alongside the next console generation, if they are adopted at all. C++ in gamedev is now stuck somewhere between the C++14 and C++17 standards: Sony has only just rolled out a compiler version with full C++17 support, and considering how slowly game studios react to changes in core pipelines, they will adopt something new only in new projects. Changing horses (that is, the compiler) in the middle of game development is like shooting not only yourself in the foot but your teammates as well: if it works, don't fix it.

"If changing the compiler and standard doesn't guarantee a performance boost of more than 5%, then I won't approve the budget or people." (c)

The codebases of large engines give a sense of how much code lives in production and tools. As they say in the industry, such large codebases have become "too big to fail." So writing something new on par with engines like Unity/Unreal/Dagor in another language is practically impossible, even if that language were far safer and faster, though developers still attempt to come up with new engines. The longer we support existing C++ projects, the fewer choices we have left.

All attempts to add scripts, a second language virtual machine, visual script editors, and blueprints only show how cumbersome the core mechanism has become. Games are sold perfectly well on the current technology stack. So justifying a migration to a new stack with mythical refactoring, tech debt, and new technologies fails. Thus, the mice keep crying and munching on their C++ cactus.

The existing codebase for game editors and engines isn't the only reason for this situation. Here are a few more reasons why studios can't choose something else.

Platform vendors (Sony, Microsoft, Nintendo) provide APIs in C/C++. The size of their OS and SDK codebases is much larger than that of game engines. Using anything alternative simply won't work—the cost of reworking would bury even Nintendo with its unlimited budgets.
Porting games between platforms is only possible in C/C++ languages. I wrote the reason above—there is no other common language between platforms.
C++ compilers have been optimized for decades. To achieve comparable performance, another language would have to walk the same path. That would take, if not decades (the foundation already exists), then definitely years. Writing relatively high-level, fast platform code in anything other than C/C++ is simply not feasible right now.
Legacy is inherited code. We can't escape it; we have to maintain it, edit it, and fix bugs. We also need to figure out what this code does. Sometimes it's easier to rewrite a certain part from scratch, but we don't always have the people or time for it.
Language pain accurately describes why the industry won't get off C++ for at least the next ten years. Vendor lock is justified not only by using hardware from a specific manufacturer but also by the programming paradigm of the chosen language. Each manufacturer has its own one. No one will give up even 1% of the market, and having our own programming language only increases the vendor presence. Considering that losing 1% means tens of billions in lost profits, the cost of developing such a language, even at 10% of that profit, is more than justified.

Shaders. I'm putting these languages in a separate category, although they are very close to C. They are part of the platform, and we can't make a game without them: without shaders, we simply can't render anything on screen. While C++ is a sort of "Philosopher's Stone" that can transform general ideas into working code for any platform, there is no such common component for low-level high-performance code. Most likely, there never will be. For some time, OpenGL filled part of that role, but through collective efforts, it has been almost eradicated everywhere.

Most interestingly, the main language for game engine development has become heterogeneous—we can divide it into low-level, mid-level, and high-level C++, each with its own features.

Hardware/Baremetal/Hardcore C++

It's used for number crunchers and working with large amounts of computations.

This code example isn't even the toughest:

void frustum_for_box_occluder(
                         const TMatrix &to_box_space,
                         const Point3 box_corners[8],
                         const Point3 &eye,
                         plane3f out_frustum_planes[BOX_OCCLUDER_PLANES_MAX],
                         int *out_planes_count)
{
  Point3 box_eye = to_box_space * eye;

  G_ASSERT(to_box_space.det() > 0);

  unsigned index = unit_segment_classify(box_eye.x) * 1
                 + unit_segment_classify(box_eye.y) * 3
                 + unit_segment_classify(box_eye.z) * 9;
  G_ASSERT(index < 27);

  {
    // Rare case near_box, when the point is located very close to the cube.
    // Then the plane is chosen based on the closest face to the eye.
    bool near_box = likely_inside_m0505(box_eye.x)
                 && likely_inside_m0505(box_eye.y)
                 && likely_inside_m0505(box_eye.z);

    if (near_box)
    {
      float abs_x = fabsf(box_eye.x), 
            abs_y = fabsf(box_eye.y), 
            abs_z = fabsf(box_eye.z);

      int i0 = abs_x < abs_y, i1 = abs_y < abs_z, i2 = abs_z < abs_x;

      float max_coord = box_eye[gComparisonsToMaxCoordIndex[i0][i1][i2]];
      const BoxPointClassificationForOcclusion &cl =
        gBoxPointClassificationForOcclusion[
              gNearCubeFrontPlaneForOcclusion[i0][i1][i2][max_coord < 0]];

      *out_planes_count = 1;
      Plane3 p(box_corners[cl.mFrontPlane[0]], 
               box_corners[cl.mFrontPlane[1]], 
               box_corners[cl.mFrontPlane[2]]);

      out_frustum_planes[0] = v_ldu(&p.n.x);
      return;
    }
  }

  {
    // Common case. Planes are constructed based on index,
    // obtained from unit_segment_classify for x,y,z.
    const BoxPointClassificationForOcclusion &cl =
          gBoxPointClassificationForOcclusion[index];

    *out_planes_count = cl.mSidePlanesCount + 1;
    Plane3 p(box_corners[cl.mFrontPlane[0]],
             box_corners[cl.mFrontPlane[1]],
             box_corners[cl.mFrontPlane[2]]);
    out_frustum_planes[0] = v_ldu(&p.n.x);
    for (int i = 0; i < cl.mSidePlanesCount; ++i)
    {
      Plane3 p_(Plane3(eye, box_corners[cl.mSidePlanes[i][0]],
                            box_corners[cl.mSidePlanes[i][1]]));
      out_frustum_planes[i + 1] = v_ldu(&p_.n.x);
    }
  }
}

Here are some examples of "this kind of C++": physics simulation subsystems, scene rendering, collisions, load balancing systems (Tasks/Workers) when used in multi-core systems, character animation, water and particle calculations (https://github.com/NVIDIA-Omniverse/PhysX).

This kind of C++ is also of help when it comes to handling platform (hardware) specifics and operating with concepts like cache locality, branch prediction, data packing, and structure layout. The code of these systems looks like it's written in pure C, with minimal use of C++ features like function overloading or inheritance. That is, even regular C++ speed isn't enough here, and we have to significantly limit capabilities to squeeze out another percent of performance.
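To make "data packing and structure layout" a little more concrete, here is a minimal sketch. The `ParticleLoose` and `ParticleTight` types and the 64-byte cache line are assumptions made up for illustration; they don't come from any particular engine. The sketch only shows that member order alone can change how many records fit into a cache line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical particle record with members declared in a "convenient" order.
// The compiler inserts alignment padding after the small members.
struct ParticleLoose {
  std::uint8_t  alive;     // 1 byte, then padding up to the alignment of double
  double        lifetime;
  std::uint8_t  material;  // 1 byte, then padding up to the alignment of float
  float         radius;
  std::uint16_t emitter;   // 2 bytes, then tail padding
};

// The same members, ordered from the largest alignment to the smallest.
struct ParticleTight {
  double        lifetime;
  float         radius;
  std::uint16_t emitter;
  std::uint8_t  alive;
  std::uint8_t  material;
};

// Assumed cache line size; real hardware may differ.
constexpr std::size_t kAssumedCacheLine = 64;

// How many whole records of type T fit into one assumed cache line.
template <typename T>
constexpr std::size_t records_per_cache_line() {
  return kAssumedCacheLine / sizeof(T);
}
```

On a typical 64-bit target, the loose layout is likely to take 32 bytes and the tight one 16, but the exact numbers depend on the ABI, so check `sizeof` on your own platform.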

Everything that can execute in-place is inlined, even if it repeats thousands of times and could be moved to a function call—minimal function calls, lots of wrappers to reduce branching. It's very inconvenient: with the same syntax, the code is twisted so much that not every developer can grasp it, let alone read it. But of course, it's written in C++.
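As an illustration of "wrappers to reduce branching," here is a hedged guess at what a helper like `unit_segment_classify` from the listing above could look like. The real implementation isn't shown in the article, so `classify_unit_segment`, its thresholds, and its return values are assumptions. The idea is that comparisons yield 0 or 1, so the class is computed arithmetically instead of through an if/else chain.

```cpp
#include <cassert>

// Hypothetical stand-in for unit_segment_classify:
// returns 0 if x < -0.5, 1 if x is inside [-0.5, 0.5], 2 if x > 0.5.
// Each comparison yields 0 or 1, so no if/else is written in the source.
inline unsigned classify_unit_segment(float x) {
  return static_cast<unsigned>(x >= -0.5f) + static_cast<unsigned>(x > 0.5f);
}

// Combines three per-axis classes into one base-3 index in [0, 26],
// mirroring the index computation in the listing.
inline unsigned box_region_index(float x, float y, float z) {
  return classify_unit_segment(x) * 1u
       + classify_unit_segment(y) * 3u
       + classify_unit_segment(z) * 9u;
}
```

Whether the compiler actually emits branch-free machine code depends on the target and optimization flags, which is exactly why such code gets checked in the disassembly.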

Letting a less experienced programmer work on this code is a bad idea. This isn't a job for a regular middle developer, or even most seniors. To work here, we need more than just understanding—we need to know specific tools used inside and how long the author has been working on this system.

In one of the GDC talks on Uncharted, the developers presented tests showing that the game spends 80% of its time in such code and only 20% in general code. This low-level code is tens of times faster than regular code. If architecture and some rules of writing perfect code hinder speed, then both the architecture and the rules can go to hell... Let me rephrase the expression about the capitalist and 300% profit: a rendering developer will simply break half of your editor for a 3% performance gain, and that will be your problem, not theirs.

Such low-level, not-quite-C++ code is imperfect, inconvenient, riddled with every possible anti-pattern, walks the line of UB, and is well seasoned with the personal tricks of individual developers. But it's fast, and that's enough to put it in production. It remains highly questionable whether any language aspiring to be a "better C" can actually generate faster code. Because of niceties, syntactic sugar, checks, and restrictions, such code loses up to half its performance. Want to shoot yourself in the foot at machine-gun speed? Be my guest. Oh, I forgot one more thing: this code will most likely compile and run on another platform.

This is a bad example, don't do this (I learned about it from colleagues' stories)

In one engine, texture streaming leaked a little memory, so a play session could last only 2-3 hours. Fixing it seemed impossible because this was legacy code, and attempts to repair it led to stuttering during the game. In the end, they "fixed" it like this: when the game approached the OOM boundary, the save file's creation date was changed to 2039, which made Steam treat it as an error and show a system message. Later, the developers fixed it properly. Users were, of course, unhappy, but they blamed network issues, Steam, or their PCs rather than the game.

Another reason for using "this kind of C++" is that it allows for control over the resulting code performance where needed, as we can roughly imagine what constructs will compile in ASM.

Middleware/Common C++/Templates

Moving up the architecture layers, we reach the level of "regular" C++. This code uses the classic features and algorithms invented as the language developed. About 80% of the code in a product lives here. Hundreds of libraries in different languages expose their capabilities through a "C interface": various interfaces to the core language of the OS, for example, Java JNI, Objective-C++, and virtual machines for scripting languages.

Here, the language reveals itself as a high-level design tool: not so much a language for writing code as a tool for describing application architecture (OOD, DOD, DDD). It lets us squeeze all the juice out of the hardware while disregarding every rule of decent code, and it equally lets us write high-quality code that is resistant to errors, leaks, and out-of-bounds access, and protected from juniors. Unfortunately, remnants of the "roaring" 2000s, when C++ was used extensively for writing game logic, still linger in many game engines. You can notice this, for example, in the available source code of Unreal or Dagor, where core logic related to the player is partially present at a very low level of objects.

And, of course, the language provides access to library APIs. Using some hacks like privablic access, we access most of the functionality hidden from the end user. If you think this is the real C++, you're wrong. The ghosts of "plain C" still live: here and there, we can see deliberately simplified functionality so that as many people as possible can use this layer.

The chart above shows the approximate computational performance depending on the technology level used. With the regular C++ we use less than 10% of the hardware capabilities. So, it's no surprise when developers are willing to trade productivity in man-hours for performance.

"We would happily sacrifice 10% productivity to get 10% additional performance."

Tim Sweeney (c)

Figure 1 — A reminder of how the quotation author looks like

As a result, virtual machines for second and third-level languages appear in the engine. They enable writing fast algorithms at the engine level and shielding designers from C++ in favor of something slower, more convenient, and understandable. First, devs would drag in scripting languages like Lua/Js/Squirrel/"Write your own". A bit later, visual programming arrived. Scripts and visual scripts (blueprints) are also not an invention of gamedev. They came from the world of robotics, where the cost of an error is much higher—any error can lead to equipment damage, let alone just a crash to desktop. The downside of this approach: what we can write in 10 lines of code will take 1000 lines due to writing wrappers, checks, tools, and so on.

No need to mention the performance drop—even the most advanced Lua VM (no matter what its developers claim) typically degrades performance by at least half. Perhaps on some synthetic tests, the performance drop is ten percent or less, but in a real game, the code from such a test executes 0.1% of the time. It's not as critical as it seems because it's compensated by the growth in memory, processor, and graphics card speeds. The performance drop isn't just measured in teraflops; the Lua language itself is much simpler than C++. Developers and designers also start thinking and writing within the paradigm of a simplified language, as they don't need to write more complex code, and sometimes they can't.

In my experience, code rewritten from scripting languages back to C++ will be 5+ times faster. This usually happens when profiling identifies slow sections of the game. Other scripting languages aren't far ahead of Lua, which has been the focus of the development attention for at least ten years. During that time, it has been significantly optimized. Since the language appeared in 1993, the performance of the virtual machine, independent of hardware performance, has grown almost tenfold. The chart below shows benchmarks of algorithm implementations between different versions of Lua virtual machines; for reference, the red line shows the benchmark time for the same algorithm in C.

The need to create bindings from C++ to a scripting language is another bottleneck when using C++ <-> scripts, as we have to copy data between representation levels. Performance loss is allowed to enable everyone—from artists to AI designers and systems mechanics—to program, make mistakes and write complete nonsense, without crashing the editor with their mischievous hands.
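To show where the copying and the extra checks come from, here is a minimal hand-written binding. Everything in it is hypothetical: `ScriptValue` stands in for whatever dynamically typed value a script VM uses, and `apply_damage` is an invented engine function. Real VMs such as Lua have their own stack-based APIs, which this sketch does not try to reproduce.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical dynamically typed value, as a script VM might represent it.
struct ScriptValue {
  enum class Type { Number, String } type;
  double number;
  std::string text;

  static ScriptValue from_number(double v) { return {Type::Number, v, {}}; }
  static ScriptValue from_string(std::string s) {
    return {Type::String, 0.0, std::move(s)};
  }
};

// Native engine function that we want to expose to scripts.
inline double apply_damage(double health, double damage) {
  double left = health - damage;
  return left < 0.0 ? 0.0 : left;
}

// The binding validates arguments, copies them out of the script
// representation, calls native code, and copies the result back.
inline ScriptValue bind_apply_damage(const std::vector<ScriptValue>& args) {
  if (args.size() != 2)
    throw std::invalid_argument("apply_damage expects 2 arguments");
  for (const ScriptValue& arg : args)
    if (arg.type != ScriptValue::Type::Number)
      throw std::invalid_argument("apply_damage expects numbers");
  return ScriptValue::from_number(apply_damage(args[0].number, args[1].number));
}
```

The native function is three lines; the wrapper around it is already longer, and this is before error reporting, object lifetimes, or editor tooling enter the picture.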

Of course, the main benefit that makes game engine developers accept significant slowdown is the possibility to hot reload game logic. This doesn't come out of the box. Moreover, it requires reworking half of the existing code, but it allows speeding up game development dozens of times. Judge for yourself: editing code in the IDE, compilation, restarting the level, creating the game situation for working—all this takes minutes of real time. In turn, script hot reload takes seconds, while a developer and designer don't lose the context of the game situation.
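The core idea behind hot reload can be sketched roughly like this: game state stays on the engine side, while logic sits behind a replaceable table of handlers. `GameState` and `LogicRegistry` are names invented for this sketch, and a real engine would also have to watch files, recompile scripts, and migrate data, none of which is shown here.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Game state lives on the engine side and survives a logic reload.
struct GameState {
  int player_health = 100;
};

// Hypothetical registry that maps handler names to replaceable logic.
class LogicRegistry {
public:
  using Handler = std::function<void(GameState&)>;

  // Called at startup and again whenever a script is reloaded:
  // a handler with the same name simply replaces the old one.
  void install(const std::string& name, Handler handler) {
    handlers_[name] = std::move(handler);
  }

  // Runs the current version of a handler; returns false if it is missing.
  bool run(const std::string& name, GameState& state) const {
    auto it = handlers_.find(name);
    if (it == handlers_.end()) return false;
    it->second(state);
    return true;
  }

private:
  std::unordered_map<std::string, Handler> handlers_;
};
```

Installing a new `on_hit` handler while the level is running changes the behavior of the next hit, and the player's health and position stay as they were. That preserved context is what saves the minutes of restarting and recreating the game situation.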

Unity and Unreal have gone even further in this regard, providing capabilities for visual scripting and editing objects and logic directly during simulation, which reduces requirements for basic development knowledge and programming. This is probably how games should be developed—when we simply change the game state right during the game. Just as with the transition from native code to scripts, and from scripts to visual programming—this further slows down the overall game code but provides even more protection against errors for the team. Now, scripts and VMs act as the lower-level framework. As for the visual scripting level, we are 95% protected from crashing the game, while still having access to all engine functionality—from shaders to animations and NPC behavior.

However, this doesn't guarantee that development will be easier. I'd say the opposite—development becomes more complex, but this complexity is spread across hundreds and thousands of game elements. Of course, we can mess up worse and much faster than in code. This horror is from a real project, let's call this complexity WTF/s(1). Frankly, no one will review this—they'll approve it without looking, just pray that this game designer brings their monster to release.

Figure 2 — WTF/s (n)

Figure 3 — WTF/s (n^2)

Figure 4 — Don't do this! WTF/s (80lvl)

Meta/Highlevel C++

Now we approach the core. Besides ordinary C++ code, there are small parts of a game engine that require the most advanced language features: RTTI, reflection, compile-time calculations, and code generation tools, where game code grows from a set of configs according to given rules.

For obvious reasons, RTTI is disabled in 99% of cases, but the need to cast to the required type remains, so almost everyone writes their own little system.
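One common shape for such a "little system" is sketched below. It is an illustration under assumptions, not the scheme of any specific engine: `type_id_of`, `Entity`, `Vehicle`, `Tree`, and `entity_cast` are all invented names. Each type gets a unique id from the address of a static variable, and a virtual `is_a` walks up the hierarchy, so neither `typeid` nor `dynamic_cast` is needed.

```cpp
#include <cassert>

using TypeId = const void*;

// One static variable per type T; its address serves as the type id.
template <typename T>
TypeId type_id_of() {
  static char tag;
  return &tag;
}

struct Entity {
  virtual ~Entity() {}
  // True if the object is of the type identified by id or derives from it.
  virtual bool is_a(TypeId id) const { return id == type_id_of<Entity>(); }
};

struct Vehicle : Entity {
  bool is_a(TypeId id) const override {
    return id == type_id_of<Vehicle>() || Entity::is_a(id);
  }
};

struct Tree : Entity {
  bool is_a(TypeId id) const override {
    return id == type_id_of<Tree>() || Entity::is_a(id);
  }
};

// Checked downcast: returns nullptr when the object is not a T.
template <typename T>
T* entity_cast(Entity* e) {
  return (e && e->is_a(type_id_of<T>())) ? static_cast<T*>(e) : nullptr;
}
```

In practice, the repeated `is_a` overrides are usually hidden behind a macro, which is one of the places where the "black magic" starts.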

As there is no reflection in the language, every second studio "invents" it as best they can. There is no ready-made, proven reflection scheme or technology—each framework offers its own methods for annotating code, serialization, and bindings.

The code and types are generated from configs so that scripts can process them and both the engine and the game have access to these types. Usually, this task is solved with macros, templates, and black magic, which ultimately results in quite non-trivial code or even a separate virtual machine with its own language.
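A small taste of the macro approach is the so-called X-macro: the field list is written once (or generated from a config), and both the struct and its description table are expanded from it. The `Weapon` type, its fields, and the `FieldInfo` layout below are assumptions for illustration only; real engines add much more, such as nested types, annotations, and versioning.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The single source of truth for the fields of the hypothetical Weapon type.
#define WEAPON_FIELDS(X) \
  X(float, damage)       \
  X(float, range)        \
  X(int, ammo)

// Expansion 1: the struct itself.
struct Weapon {
#define DECLARE_FIELD(type, name) type name;
  WEAPON_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD
};

// Runtime description of a single field.
struct FieldInfo {
  const char* type_name;
  const char* name;
  std::size_t offset;
  std::size_t size;
};

// Expansion 2: a table that serializers or script bindings can iterate over.
static const FieldInfo kWeaponFields[] = {
#define DESCRIBE_FIELD(type, name) \
  {#type, #name, offsetof(Weapon, name), sizeof(type)},
  WEAPON_FIELDS(DESCRIBE_FIELD)
#undef DESCRIBE_FIELD
};

static const std::size_t kWeaponFieldCount =
    sizeof(kWeaponFields) / sizeof(kWeaponFields[0]);

// Looks a field up by name; returns nullptr if there is no such field.
inline const FieldInfo* find_field(const char* name) {
  for (std::size_t i = 0; i < kWeaponFieldCount; ++i)
    if (std::strcmp(kWeaponFields[i].name, name) == 0) return &kWeaponFields[i];
  return nullptr;
}
```

Adding a field means touching one line in `WEAPON_FIELDS`, and both the struct and the table pick it up. The price is code that debuggers and IDEs handle poorly.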

Among the known "decent" code generators, I can highlight the following ones.

A data schema in a separate portable language (flatbuffers).
A separate language to generate data and the code to work with it (Racket at Naughty Dog) https://www.gdcvault.com/play/211/Adventures-in-Data-Compilation-and https://www.youtube.com/watch?v=oSmqbnhHp1c.
CppHeaderParser is a single-file Python library that can read headers. It's very simple, doesn't follow #include, skips macros, works very quickly, and allows easy integration into the pipeline.
RTTR allows creating and modifying types, classes, methods, and object properties in C++ during runtime. This can be useful for various purposes, such as serialization, scripting, generating user interfaces, and more.

Afterthoughts...

After watching examples from the new language standards on YouTube or CppCon (where a lambda wrapped in memfunction glides over coroutines), we return to the real world. Again, after a sleepless night, staring at the debugger and my notes, I discover some strange line of code that makes me wonder how any of this code worked at all. For the hundredth time, I think: if someone managed to write this in C++11, imagine how intricately they could do it the new way, and how long it would take to find that bug. Games are written for a purpose, so simply rewriting code back and forth for the sake of refactoring is a bad idea. Maybe it's good that we live in our own little C++ world guarded by the holy trinity of Sony, Microsoft, and Nintendo, which don't let the dragons from the committee in here?

Want to learn more?

The PVS-Studio team values the game developer community and doesn't miss an opportunity to talk more about how to improve workflows using static code analyzers. Useful resources:
