Learning DirectX 12 - Part 2 Initialization Theory

#programming #gamedev #cpp #graphics

WTF is Direct3D?
Direct3D is essentially a middleman between our code and the GPU. It's the interface we use to interact with GPUs. Each GPU, as I'm sure you know, is different, so each GPU has its own way of doing things. This is where DirectX comes in and essentially acts as a universal translator: we write our code against DirectX 12 (in C++), and the GPU vendor's driver turns those calls into commands the GPU can actually execute. This of course means that Nvidia, Intel and AMD (manufacturers of GPUs) work hand in hand with Microsoft (who make DirectX) so that things work the way they should and in a somewhat universal way.

Component Object Model:
The Component Object Model, often shortened to COM, is what allows DirectX to be programming language independent. The inner workings of a COM object are hidden from us when we write code; we usually don't interact with or change its internals directly. We manipulate COM objects through pointers to their interfaces. We DON'T create new COM objects with the C++ new keyword; we get them from special API functions instead, and we don't delete them ourselves either. If you have ever taken a C++ programming course this is pretty basic stuff, as your teacher might have said (as mine did) "Don't touch code unless you have to" but with a strong Eastern European accent. COM objects are reference counted: when we are done with one, we call its Release method (important to note that all COM interfaces inherit from a universal IUnknown COM interface, which provides this Release method). When a COM object isn't referenced anywhere anymore, its memory is freed (although calling Release is something we have to do manually).

Thankfully we have the Windows Runtime C++ Template Library (WRL) to help with these COM pointers. Its ComPtr class essentially works as a smart pointer for COM objects (Thank god), so Release gets called automatically when the ComPtr goes out of scope (Oh how I love you smart pointer).
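
To make the reference counting idea a bit more concrete, here is a minimal sketch in plain C++. To be clear, this is NOT the real COM or WRL API: ToyUnknown, ToyComPtr and DemoRefCounting are names made up for illustration, and the toy uses new only because it has no factory function to hand objects out the way the real API does.

```cpp
#include <cassert>

// Toy stand-in for a COM object. Assumption: this only mimics the idea behind
// IUnknown (AddRef/Release), it is not the real interface.
class ToyUnknown {
public:
    explicit ToyUnknown(bool* destroyedFlag) : destroyed_(destroyedFlag) {
        assert(destroyedFlag != nullptr);
    }
    unsigned long AddRef() { return ++refCount_; }
    unsigned long Release() {
        const unsigned long remaining = --refCount_;
        if (remaining == 0) delete this;  // the object frees itself
        return remaining;
    }

private:
    ~ToyUnknown() { *destroyed_ = true; }  // private: callers must go through Release()
    unsigned long refCount_ = 1;           // whoever created us holds the first reference
    bool* destroyed_;
};

// Toy smart pointer in the spirit of ComPtr: it calls Release() when it goes out of scope.
template <typename T>
class ToyComPtr {
public:
    explicit ToyComPtr(T* raw) : ptr_(raw) {}  // takes over the caller's reference
    ToyComPtr(const ToyComPtr& other) : ptr_(other.ptr_) {
        if (ptr_) ptr_->AddRef();              // a copy is one more reference
    }
    ToyComPtr& operator=(const ToyComPtr&) = delete;  // kept out to keep the sketch short
    ~ToyComPtr() {
        if (ptr_) ptr_->Release();
    }

private:
    T* ptr_;
};

// Returns true if the object died exactly when the last reference went away.
bool DemoRefCounting() {
    bool destroyed = false;
    {
        ToyComPtr<ToyUnknown> a(new ToyUnknown(&destroyed));
        {
            ToyComPtr<ToyUnknown> b = a;  // reference count is now 2
        }                                 // b goes out of scope, count back to 1
        if (destroyed) return false;      // must still be alive here
    }                                     // a goes out of scope, count 0, object deleted
    return destroyed;
}
```

Calling DemoRefCounting() from your own main should return true: nobody ever calls delete directly, the object cleans itself up once the last smart pointer is gone.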

Small note:
I won't cover code in this part; I will do that in a later part. Right now I'm just trying to get you (and me) familiar enough with these concepts so that when we introduce the code, we will have an overall understanding of it. I should mention that I will repeat things like crazy in this series. Such is the consequence of writing this devlog, because how the fuck am I supposed to keep track? So if you don't get a concept, that's probably because I don't really understand it and haven't explained it well. We will return to it later!

Texture Formats
A texture is a 2D image mapped onto a 3D object to give it a surface look. You know, you can have a cube, but in order to make it a wooden box we gotta give it a wood texture. Think of a grass block in Minecraft: each side has a 2D image projected onto its square face. Now you can do loads of fun and more technical stuff with textures, but we will cover that later!

The Swap Chain
When rendering, it's best to draw the next frame off-screen first and then present the finished image; what you see on screen is whatever was last presented. Think of it this way: we have a front buffer and a back buffer. The front buffer is what's currently on the screen. You have your Minecraft pig beautifully drawn on that front buffer and you see it. On the back buffer we have drawn a Minecraft creeper, boo! Now we swap the two: the back buffer becomes the front buffer and the front buffer becomes the back buffer, so what we see now is the Minecraft creeper! For the next frame we want to draw a Minecraft cow, so we render that cow onto the (new) back buffer and swap again to display the cow!

This is called double buffering, since you have a front and a back buffer. Triple buffering exists as well, but y'know, just cuz it exists doesn't mean I have to cover it. I think you can figure it out.
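
The pig, creeper and cow dance above can be sketched in a few lines of plain C++. This is only a toy model of the idea, not the real swap chain API: ToySwapChain and DemoSwapChain are made-up names, and a "buffer" here is just a string saying what was drawn into it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Toy double-buffered swap chain (hypothetical, for illustration only).
class ToySwapChain {
public:
    // The buffer we are allowed to draw into right now.
    std::string& BackBuffer() { return buffers_[backIndex_]; }
    // The buffer that is currently "on screen".
    const std::string& FrontBuffer() const { return buffers_[1 - backIndex_]; }
    // Swap roles: the back buffer becomes the front buffer and vice versa.
    void Present() { backIndex_ = 1 - backIndex_; }

private:
    std::array<std::string, 2> buffers_{};
    std::size_t backIndex_ = 1;
};

// Returns what is visible on screen at the end of the sequence.
std::string DemoSwapChain() {
    ToySwapChain chain;
    chain.BackBuffer() = "pig";
    chain.Present();               // the pig is now on screen
    chain.BackBuffer() = "creeper";
    chain.Present();               // the creeper is now on screen
    chain.BackBuffer() = "cow";    // drawn, but not presented yet
    return chain.FrontBuffer();
}
```

Note that the cow has been drawn but DemoSwapChain() still reports the creeper as visible, because the swap hasn't happened yet. That is the whole point: the viewer never sees a half-drawn frame.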

Depth Buffering:
Depth buffering is a fairly fun concept in DirectX 12. Essentially, every pixel on our screen has a depth value between 0.0f and 1.0f stored for it: 0 denotes the closest an object can be and 1 denotes the farthest away an object can be. Why is this relevant? Because we have to keep track of what to draw in front of what. Let's say we have a box in front of another box. The pixels belonging to the front box are the ones that should end up on screen; the pixels of the back box that are hidden behind it shouldn't. So depth buffering is essentially a way to decide which pixel gets drawn based on each pixel's depth. Important to note: thanks to this, it does not matter in what order we draw (opaque) stuff in DirectX 12, as long as we have properly working depth buffering. Also important to note: a depth buffer is a texture, so it must be created with certain data formats in DirectX 12.
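
Here is a minimal sketch of that depth test in plain C++, assuming a "frame" that is just a list of pixels with one depth value and one character of color each. ToyFrame, DrawPixel and DemoDepthTest are hypothetical names, not anything from Direct3D.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy frame: one depth value and one "color" per pixel (illustration only).
struct ToyFrame {
    std::vector<float> depth;  // starts at 1.0f, the farthest possible value
    std::vector<char> color;

    explicit ToyFrame(std::size_t pixelCount)
        : depth(pixelCount, 1.0f), color(pixelCount, '.') {}

    // Only overwrite the pixel if the new fragment is closer than what is stored.
    void DrawPixel(std::size_t index, float z, char c) {
        assert(index < depth.size());
        if (z < depth[index]) {
            depth[index] = z;
            color[index] = c;
        }
    }
};

// Draws a near box ('N') and a far box ('F') over the same pixel in both orders.
// Returns true if the near box wins regardless of draw order.
bool DemoDepthTest() {
    ToyFrame nearFirst(1);
    nearFirst.DrawPixel(0, 0.2f, 'N');
    nearFirst.DrawPixel(0, 0.8f, 'F');  // rejected: farther than what is stored

    ToyFrame farFirst(1);
    farFirst.DrawPixel(0, 0.8f, 'F');
    farFirst.DrawPixel(0, 0.2f, 'N');   // accepted: closer than what is stored

    return nearFirst.color[0] == 'N' && farFirst.color[0] == 'N';
}
```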

Resources and Descriptors:
If you read my LearningOpenGL series, I covered what a StateMachine is. Essentially it means you have to be very explicit with the API. Imagine you're trying to describe a painting to someone, and your goal is to get that person to paint the painting you have in mind. So you say: pick up the pencil -> make that pencil red -> paint on the paper from here to here. As you can see, these instructions are essentially different states that the person finds themselves in. DirectX 12 is very similar to this: you have to tell it explicitly and methodically each step every time you want to do something. Oh, so you want to paint with a blue pencil now? Well, time to go through each one of these steps again. I exaggerate, but at the same time I don't. This explicitness and heavy hand-holding is what makes DirectX 12 so open to performance optimization: absolute control.

During rendering, the GPU writes to resources (the back buffer and the depth buffer, for example) and reads from resources that store things such as textures, 3D positions etc. Whenever we draw something, we might need to change which resources we draw from. Resources are not referenced directly; they are referenced through descriptor objects. The reason for this is that the descriptor tells the GPU what the actual fuck we are going to use this resource for. It is essentially an abstraction layer to make our lives easier (Thank god). This also means that sometimes we don't have to bind the ENTIRE resource, just a section of it, so we do not waste computing power.

Descriptors have types, these are allegedly the types we will use.

CBV/SRV/UAV descriptors describe constant buffers, shader resources and unordered access view resources.
Sampler descriptors describe sampler resources (used in texturing).
RTV descriptors describe render target resources (as far as I can tell, the things we draw into, like the back buffer).
DSV descriptors describe our depth and stencil resources (Hey I know that)

These 4 are what Frank D. Luna brings up in his book. Honestly, I do not really understand these different types exactly at this point, but I am sure I will understand them better as we delve deeper into the code in the future.
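
One way to picture a descriptor is as a small struct that sits between the GPU and a resource and says "this is what the data is for, and this is the part of it I mean". The sketch below is just that mental model in plain C++, not the Direct3D 12 descriptor API. ToyViewType, ToyResource, ToyDescriptor, SumThroughView and DemoDescriptor are all made-up names, and samplers are left out to keep it small.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// What we intend to use the data for (loosely mirrors some of the descriptor kinds above).
enum class ToyViewType { ConstantBuffer, ShaderResource, RenderTarget, DepthStencil };

// The resource itself is just a blob of data with no idea how it will be used.
struct ToyResource {
    std::vector<int> data;
};

// The descriptor adds the missing context: the intended use and which slice we mean.
struct ToyDescriptor {
    ToyViewType type;
    const ToyResource* resource;
    std::size_t firstElement;
    std::size_t elementCount;
};

// "Use" a resource through a descriptor: only the described slice is touched.
int SumThroughView(const ToyDescriptor& view) {
    assert(view.resource != nullptr);
    assert(view.firstElement + view.elementCount <= view.resource->data.size());
    const auto begin =
        view.resource->data.begin() + static_cast<std::ptrdiff_t>(view.firstElement);
    const auto end = begin + static_cast<std::ptrdiff_t>(view.elementCount);
    return std::accumulate(begin, end, 0);
}

// One resource, two descriptors: one covers everything, one covers only the tail.
std::pair<int, int> DemoDescriptor() {
    const ToyResource buffer{{1, 2, 3, 4, 5, 6}};
    const ToyDescriptor wholeThing{ToyViewType::ShaderResource, &buffer, 0, 6};
    const ToyDescriptor lastTwo{ToyViewType::ConstantBuffer, &buffer, 4, 2};
    return std::make_pair(SumThroughView(wholeThing), SumThroughView(lastTwo));
}
```

The same buffer gives 21 through the first descriptor and 11 through the second, which is the "we don't have to bind the ENTIRE resource" idea in miniature.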

DirectX Graphics Infrastructure:
DirectX Graphics Infrastructure (DXGI) is essentially an API you use alongside Direct3D. Its purpose is to handle the more universal aspects of rendering. In 2D rendering you also need a swap chain to get smooth animation, for example. For our purposes it's worth noting that DXGI is what handles display adapters (graphics cards). A graphics card can have multiple display outputs (monitors) connected to it, so DXGI is also how you figure out which graphics card renders to which screen. We typically don't assign the displays manually (thank god), but DXGI is useful for finding out what our hardware actually is. A system can also have a software display adapter that essentially works as an emulator for graphics functionality (in this way we can fake the rendering capabilities on the CPU).

This hardware information that we have access to is useful so that we can properly set important values, for example refresh rate and resolution. These have to match what the display supports, so why not use this universal, stable way of gathering that information?

Residency:
Obviously it's completely pointless to have resources loaded on the GPU that aren't needed for the current scene. As the book's example goes, the GPU doesn't need the cave resources if the player finds themself in a deep forest. Therefore it's important to make resources resident in GPU memory when they are needed and evict them when they aren't. This is typically done at things like level changes or drastic, huge landscape shifts (there is overhead in constantly swapping resources in and out like this, so you only do it at select moments).

Heap vs Stack:
The heap is just a dedicated chunk of memory. In C++, things are typically created on the heap with the new keyword. An object created with new lives on even after the pointer variable that refers to it goes out of scope. In other words, if you create for example a player character like
Player* ptr_Player = new Player("Guyman"); then that piece of memory will stay allocated; it will not get destroyed until we deliberately delete that player.

Creating things on the stack essentially means the memory is assigned for us when the variable is created and gets freed automatically when the variable goes out of scope.
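
A small sketch of the difference, building on the Player example above. The Player struct here is a minimal stand-in (the source never shows its definition), "Stackman" and "Smartman" are made-up names, and the alive counter exists only so the lifetimes can be observed.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Minimal stand-in for the Player class (assumption: the real one is not shown).
struct Player {
    explicit Player(std::string n) : name(std::move(n)) { ++alive; }
    Player(const Player&) = delete;             // no copies, so the counter stays honest
    Player& operator=(const Player&) = delete;
    ~Player() { --alive; }

    std::string name;
    static int alive;  // how many Player objects currently exist
};
int Player::alive = 0;

// Returns how many players are still alive right after the inner scope ends.
int DemoLifetimes() {
    Player* ptr_Player = nullptr;
    {
        Player stackPlayer("Stackman");    // lives on the stack
        ptr_Player = new Player("Guyman"); // lives on the heap
        assert(Player::alive == 2);
    }   // stackPlayer is destroyed here automatically, Guyman is not
    const int aliveAfterScope = Player::alive;

    delete ptr_Player;  // heap memory only goes away when we say so

    {
        // A smart pointer gives heap storage with stack-like automatic cleanup.
        auto smartPlayer = std::make_unique<Player>("Smartman");
        assert(smartPlayer->name == "Smartman");
    }   // smartPlayer deletes its Player here
    return aliveAfterScope;
}
```

After the inner scope only the heap player is left, so DemoLifetimes() returns 1, and once the function has finished every Player has been cleaned up again.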

Resources:
Buffers: Arrays of data elements of a certain type. I like to think of vertices here: a list of vertices that make up the beautiful Shrek model we are loading.

Textures: Image data. Storing a grid of image data is how Frank D. Luna describes textures at this point.

Committed Resources:
A single API call creates the heap at the same time as it creates the resource. In this way the two are bound together, simple right? Well, this is quite expensive and ideally shouldn't be done willy-nilly. It is useful for things that are created once during initialization rather than every frame.

Placed Resources:
Create a large heap once, then place resources into that heap. It is a way of separating allocating memory from creating resources in that memory. Because the memory already exists, placed resources are much cheaper to create and destroy, so this is great for things that come and go often, even per frame.
Think of it this way: it would be really annoying and time consuming to have to make a new flag every time you want to wave one around. So instead, someone tells you the location of a flag you made previously. Now every time you want to wave a flag, you just go and get your flag.
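
The "allocate once, place many times" idea can be sketched as a tiny bump allocator in plain C++. This is a simplified model with made-up names (ToyHeap, Place, Reset, kDoesNotFit, DemoPlacedResources), not the Direct3D 12 heap API, and it ignores the alignment rules a real heap has.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel returned when a resource does not fit (hypothetical, for this sketch only).
constexpr std::size_t kDoesNotFit = static_cast<std::size_t>(-1);

// Toy heap: the expensive allocation happens once, in the constructor.
class ToyHeap {
public:
    explicit ToyHeap(std::size_t sizeInBytes) : memory_(sizeInBytes) {}

    // "Placing" a resource is just handing out the next free offset: cheap.
    std::size_t Place(std::size_t sizeInBytes) {
        if (sizeInBytes > memory_.size() - nextFree_) return kDoesNotFit;
        const std::size_t offset = nextFree_;
        nextFree_ += sizeInBytes;
        return offset;
    }

    // Throw away all placements but keep the memory for reuse.
    void Reset() { nextFree_ = 0; }

private:
    std::vector<unsigned char> memory_;
    std::size_t nextFree_ = 0;
};

// Returns true if the placements land where we expect them to.
bool DemoPlacedResources() {
    ToyHeap heap(256);                           // one big allocation up front
    const std::size_t flag = heap.Place(100);    // offset 0
    const std::size_t banner = heap.Place(100);  // offset 100
    const std::size_t tooBig = heap.Place(100);  // only 56 bytes left, does not fit
    heap.Reset();                                // "destroy" everything, keep the heap
    const std::size_t reused = heap.Place(100);  // offset 0 again, no new allocation
    return flag == 0 && banner == 100 && tooBig == kDoesNotFit && reused == 0;
}
```

A committed resource would be the equivalent of constructing a brand new ToyHeap for every single resource, which is why you want to keep that for initialization time.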

Reserved Resources:
Imagine you have a huge world map. It would be pretty stupid to have the entire world map loaded in memory if the user is, say, zoomed in on just Stockholm, Sweden. So just load the parts necessary for the current operation. Reserved resources are essentially big resources chopped up into pieces, where each piece is only backed by memory when it is actually needed.
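
As a rough sketch of the world map example, assuming the map is cut into a grid of tiles and only the tiles we ask for are loaded. ToyTiledMap and DemoReservedResource are hypothetical names and the tile coordinates for "Stockholm" are made up; this models the idea, not the Direct3D 12 API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Toy tiled map: the full grid is "reserved", but only loaded tiles cost memory.
class ToyTiledMap {
public:
    ToyTiledMap(int tilesWide, int tilesHigh) : wide_(tilesWide), high_(tilesHigh) {
        assert(tilesWide > 0 && tilesHigh > 0);
    }

    // Back one tile with memory. Returns false for coordinates outside the map.
    bool LoadTile(int x, int y) {
        if (x < 0 || y < 0 || x >= wide_ || y >= high_) return false;
        loaded_.insert(std::make_pair(x, y));
        return true;
    }
    void UnloadTile(int x, int y) { loaded_.erase(std::make_pair(x, y)); }

    std::size_t LoadedTileCount() const { return loaded_.size(); }
    std::size_t TotalTileCount() const {
        return static_cast<std::size_t>(wide_) * static_cast<std::size_t>(high_);
    }

private:
    int wide_;
    int high_;
    std::set<std::pair<int, int>> loaded_;
};

// Loads a 2x2 block of tiles out of a 100x100 world and returns how many are loaded.
std::size_t DemoReservedResource() {
    ToyTiledMap world(100, 100);
    for (int y = 40; y < 42; ++y) {
        for (int x = 60; x < 62; ++x) {
            world.LoadTile(x, y);  // pretend these four tiles cover Stockholm
        }
    }
    assert(world.TotalTileCount() == 10000);
    return world.LoadedTileCount();
}
```

Four tiles loaded out of ten thousand: the map still "exists" at full size, but we only pay for the part the user is looking at.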

CPU / GPU Interaction
Both the CPU and the GPU do a lot of work in graphics. They do different things, but they work together to render our beautifully disguised triangles. The GPU has a command queue, and the CPU submits commands to it through the Direct3D API. A newly submitted command goes to the back of the queue and is not executed until the GPU has worked through the commands ahead of it and it reaches the front. If the command queue is empty, the GPU will idle. This of course is not ideal, as we want both our GPU and CPU to always be working on something to get the most performance we can out of our software.
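
A minimal sketch of that first-in, first-out behaviour in plain C++. ToyCommandQueue and DemoCommandQueue are made-up names, the "commands" are just lambdas, and everything runs on one thread, so this only shows the ordering, not the real parallelism between CPU and GPU.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Toy command queue (illustration only, not the Direct3D 12 command queue).
class ToyCommandQueue {
public:
    // CPU side: add a command to the back of the queue.
    void Submit(std::function<void()> command) { pending_.push(std::move(command)); }

    // GPU side: run the command at the front. Returns false if there is nothing
    // to do, which is the "GPU idles" case.
    bool ExecuteNext() {
        if (pending_.empty()) return false;
        std::function<void()> command = std::move(pending_.front());
        pending_.pop();
        command();
        return true;
    }

private:
    std::queue<std::function<void()>> pending_;
};

// Submits three commands and returns the order in which they actually ran.
std::string DemoCommandQueue() {
    ToyCommandQueue queue;
    std::string log;
    queue.Submit([&log] { log += "clear,"; });
    queue.Submit([&log] { log += "draw,"; });
    queue.Submit([&log] { log += "present"; });
    assert(log.empty());              // submitting does not execute anything
    while (queue.ExecuteNext()) {}    // the "GPU" drains the queue front to back
    return log;
}
```

The commands run in exactly the order they were submitted, and nothing happens at submit time, which is the key thing to keep in mind for the synchronization part.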

CPU / GPU synchronization
Given that the CPU and GPU work in parallel, asynchronously, neither can be allowed to do something that the other would later contradict, like the CPU overwriting data that the GPU hasn't finished reading yet. One way to solve this is to force the CPU to pause until the GPU has finished all the commands up to a certain point in the queue. That point is called a fence point, and you specify it in code.
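
Here is a single-threaded sketch of a fence in plain C++. The big assumption: on real hardware the GPU runs on its own and the CPU would simply block; in this toy, "waiting" means stepping a fake GPU forward until the fence value is reached. ToyGpu, Signal, WaitForFence and DemoFence are hypothetical names, not the Direct3D 12 fence API.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Fake GPU with a command queue and a fence value it updates as it goes.
class ToyGpu {
public:
    void Submit(std::function<void()> work) { queue_.push_back(std::move(work)); }

    // Ask the GPU to set the fence to `value` once it reaches this point in the queue.
    void Signal(std::uint64_t value) {
        queue_.push_back([this, value] { completed_ = value; });
    }

    std::uint64_t CompletedValue() const { return completed_; }

    // Execute one queued item. Returns false when the queue is empty.
    bool Step() {
        if (queue_.empty()) return false;
        std::function<void()> work = std::move(queue_.front());
        queue_.pop_front();
        work();
        return true;
    }

private:
    std::deque<std::function<void()>> queue_;
    std::uint64_t completed_ = 0;
};

// CPU side: do not return until the GPU has passed the fence point.
void WaitForFence(ToyGpu& gpu, std::uint64_t value) {
    while (gpu.CompletedValue() < value && gpu.Step()) {
    }
}

// Returns the value the "GPU" read from data shared with the "CPU".
int DemoFence() {
    ToyGpu gpu;
    int sharedData = 1;
    int gpuSaw = 0;

    gpu.Submit([&] { gpuSaw = sharedData; });  // GPU will read sharedData later
    gpu.Signal(1);                             // fence point right after that read

    WaitForFence(gpu, 1);                      // without this, the GPU would read 2
    sharedData = 2;                            // now it is safe to overwrite

    assert(gpu.CompletedValue() == 1);
    return gpuSaw;
}
```

DemoFence() returns 1, the value the CPU wrote before the fence. Remove the WaitForFence call and the read would never have happened before the overwrite, which is exactly the kind of contradiction the fence is there to prevent.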

Resource Transitions:
Let's say the GPU is working on a resource. That resource is probably not safe to do a lot of other operations on while things about it are being changed. Marking the resource with a state, something like "being worked on", helps a lot in avoiding resource hazards. This is essentially what resource transitions are for: a way to mark the state of a resource so the API can prevent the problems that would occur if something wants to poke at what the GPU is currently working on. Now, this state isn't just a declaration of "being worked on or not", but more a description of how the resource is currently being used by the GPU.
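
A small state-tracking sketch of this in plain C++. The state names are loosely inspired by the kind of states Direct3D 12 has (presenting vs. being a render target), but ToyState, ToyTrackedResource, Transition, TryDraw and DemoTransitions are all made up for illustration and are not the real API.

```cpp
#include <cassert>

// How the resource is currently being used (toy version).
enum class ToyState { Present, RenderTarget, CopyDest };

struct ToyTrackedResource {
    ToyState state = ToyState::Present;  // a back buffer starts out presentable
    int drawCalls = 0;
};

// A transition names both the state we expect and the state we want.
// It fails (returns false) if the resource is not in the expected state.
bool Transition(ToyTrackedResource& resource, ToyState before, ToyState after) {
    if (resource.state != before) return false;
    resource.state = after;
    return true;
}

// Drawing is only allowed while the resource is in the render target state.
bool TryDraw(ToyTrackedResource& resource) {
    if (resource.state != ToyState::RenderTarget) return false;
    ++resource.drawCalls;
    return true;
}

// Walks a back buffer through one frame. Returns true if every step behaves as expected.
bool DemoTransitions() {
    ToyTrackedResource backBuffer;

    const bool drewTooEarly = TryDraw(backBuffer);  // still in Present: refused
    const bool toTarget =
        Transition(backBuffer, ToyState::Present, ToyState::RenderTarget);
    const bool drew = TryDraw(backBuffer);          // now allowed
    const bool wrongBefore =
        Transition(backBuffer, ToyState::CopyDest, ToyState::Present);  // wrong "before"
    const bool toPresent =
        Transition(backBuffer, ToyState::RenderTarget, ToyState::Present);

    return !drewTooEarly && toTarget && drew && !wrongBefore && toPresent &&
           backBuffer.drawCalls == 1;
}
```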

Final Words:
We are getting dangerously close to my brain exploding from all this information, but thankfully I think it's time to begin writing some code. In the next part we will finally cover the code needed to initialize Direct3D (Finally!).

DEV Community
