One of the core features of Windows 10 was the DirectX 12 API. And one of the more appealing aspects of the DX12 API is ray-tracing support, otherwise known as DXR. But that’s not all there is to it. DirectX 12 changes how games are developed and rendered in a multitude of ways.
Ever since games built atop the new API arrived, there’s been a lot of debate as to how it differs from DirectX 11 and whether it’s actually as big as the industry wants you to believe. In this post, we explore DirectX 12 and see what makes it such a step up from DirectX 11.
DirectX 11 vs DirectX 12: What Does it Mean for PC Gamers
There are three main advantages of the DirectX 12 API for PC gamers:
Better Scaling with Multi-Core CPUs (AMD Ryzen)
One of the core advantages of low-level APIs like DirectX 12 and Vulkan is improved CPU utilization. Traditionally, in DirectX 9 and 11 based games, most of the rendering work (draw-call submission in particular) ran on a single thread, so one core often became the bottleneck while the others sat mostly idle. With DirectX 12 that has changed. The load can be distributed more evenly across all cores, making multi-core CPUs more relevant for gamers. This is one reason why AMD’s Ryzen CPUs, with their higher core counts, tend to fare better in games built atop DX12 and Vulkan.
Maximum hardware utilization
Many of you might have noticed that in the beginning, AMD GPUs favored DirectX 12 titles more than rival NVIDIA parts. Why is that?
The reason is better utilization. Traditionally, NVIDIA has had better driver support, while AMD hardware has often suffered from the lack thereof. DirectX 12 adds technologies that improve utilization, such as asynchronous compute, which allows compute and graphics workloads to be executed simultaneously. This makes poor driver support a less pressing concern.
Closer to Metal Support
Another major advantage of DirectX 12 is that developers have more control over how their game utilizes the hardware. Earlier, this was more abstract and was mostly taken care of by the drivers and the API (although some engines like Frostbite and Unreal provided low-level tools as well).
Now the task falls to the developers. This is a double-edged sword: there are multiple GPU architectures out in the wild, and for indie devs it’s impractical to optimize a game for all of them. Luckily, third-party engines like Unreal, CryEngine, and Unity do this for them, so they only have to focus on the game itself.
How DirectX 12 Improves Hardware Utilization
Again, there are three main API advances that facilitate this gain:
Pipeline State Objects
DirectX 11 exposes the GPU pipeline as a wide range of separately set states and stages such as the vertex shader, hull shader, geometry shader, etc. This looks all well and good on paper, but when it comes to actual execution, it’s not efficient. Many times, these different state objects are inter-dependent. For example, many GPUs combine the pixel shader and output merger state into a single hardware representation.
However, in DX11 each of them needs to be set individually, and the driver cannot resolve the final hardware state until the entire pipeline state has been set, which in practice means draw time. This delays hardware setup and leaves the hardware under-utilized, resulting in extra overhead and fewer draw calls per frame.
DirectX 12 replaces the various states with Pipeline State Objects (PSOs), which are finalized upon creation itself. Because a PSO is immutable once created, the driver can convert it into hardware-native state up front instead of at draw time. The active PSO can then be switched dynamically by copying a small amount of precomputed data into the hardware registers.
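The difference can be caricatured in a few lines of Python. This is a toy model rather than the Direct3D API: `Dx11StyleContext`, `Dx12StyleDevice`, `PipelineStateObject`, and `resolve` are hypothetical names, and real DirectX 11 drivers cache far more aggressively than this sketch suggests. It only counts how often the combined state has to be resolved.

```python
from dataclasses import dataclass

# Toy model only: none of these names exist in Direct3D.
REQUIRED = ("vertex_shader", "pixel_shader", "blend", "depth")


def resolve(state):
    """Stand-in for the driver turning API state into hardware state."""
    missing = [key for key in REQUIRED if key not in state]
    if missing:
        raise ValueError(f"incomplete pipeline state: {missing}")
    return tuple(state[key] for key in REQUIRED)


class Dx11StyleContext:
    """State is set piece by piece, so it can only be resolved at draw time."""

    def __init__(self):
        self.state = {}
        self.resolves = 0

    def set_state(self, key, value):
        self.state[key] = value

    def draw(self):
        self.resolves += 1  # resolved again on every draw in this caricature
        return resolve(self.state)


@dataclass(frozen=True)
class PipelineStateObject:
    """Immutable bundle whose hardware state is computed once, at creation."""
    hw_state: tuple


class Dx12StyleDevice:
    def __init__(self):
        self.resolves = 0

    def create_pso(self, **state):
        self.resolves += 1  # the only place state is resolved
        return PipelineStateObject(resolve(state))


def run_demo(draws):
    base = {"vertex_shader": "vs_main", "pixel_shader": "ps_main", "depth": "less"}
    blends = ("opaque", "alpha")

    ctx = Dx11StyleContext()
    for key, value in base.items():
        ctx.set_state(key, value)
    for i in range(draws):
        ctx.set_state("blend", blends[i % 2])
        ctx.draw()

    device = Dx12StyleDevice()
    psos = [device.create_pso(blend=blend, **base) for blend in blends]
    for i in range(draws):
        current = psos[i % 2]  # switching is just picking a prebuilt object
        _ = current.hw_state
    return ctx.resolves, device.resolves
```

In this model `run_demo(1000)` returns `(1000, 2)`: a thousand draw-time resolutions on the DX11-style path against two creation-time resolutions on the PSO path.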
Command Queues and Lists
With DirectX 11, there’s only a single queue going to the GPU. This leads to uneven distribution of load across various CPU cores, essentially crippling multi-threaded CPUs.
This is somewhat alleviated by deferred contexts, but even then there’s only one stream of commands leading to the GPU at the final stage. DirectX 12 introduces a new model built on command lists that can be recorded independently on different threads, increasing multi-threading. The workload is divided into smaller batches of commands requiring different resources, which allows them to be prepared simultaneously. Asynchronous compute builds on the same idea: compute and graphics commands are placed in separate queues and executed concurrently.
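The recording pattern described above can be sketched as follows. This is a structural illustration, not Direct3D code: `CommandList`, `CommandQueue`, `record_chunk`, and `render_frame` are hypothetical names, and Python threads are used only to show the shape of the workflow (the interpreter lock means they will not show a real speed-up).

```python
from concurrent.futures import ThreadPoolExecutor


class CommandList:
    """Hypothetical stand-in for a command list: it records, it does not execute."""

    def __init__(self, name):
        self.name = name
        self.commands = []
        self.closed = False

    def draw(self, obj):
        if self.closed:
            raise RuntimeError("cannot record into a closed command list")
        self.commands.append(f"draw {obj}")

    def close(self):
        self.closed = True
        return self


class CommandQueue:
    """Hypothetical stand-in for a queue: consumes closed lists in submission order."""

    def __init__(self):
        self.executed = []

    def execute(self, lists):
        for command_list in lists:
            if not command_list.closed:
                raise RuntimeError(f"{command_list.name} was not closed")
            self.executed.extend(command_list.commands)


def record_chunk(index, objects):
    # Each worker owns its list, so no locking is needed while recording.
    command_list = CommandList(f"list-{index}")
    for obj in objects:
        command_list.draw(obj)
    return command_list.close()


def render_frame(scene, workers=4):
    size = -(-len(scene) // workers)  # ceiling division
    chunks = [scene[i * size:(i + 1) * size] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, whichever thread finishes first.
        lists = list(pool.map(record_chunk, range(workers), chunks))
    queue = CommandQueue()
    queue.execute(lists)  # a single, ordered submission
    return queue.executed
```

Recording happens in parallel while submission stays ordered, so `render_frame` returns the draws in the same order as the input scene regardless of which worker finished first.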
Resource Binding
In DirectX 11, resource binding was highly abstract and convenient but not the best in terms of hardware utilization. It left many of the hardware components unused or idle. Most game engines would use “view objects” to allocate resources and bind them to various shader stages of the GPU pipeline.
The objects would be bound to slots along the pipeline at draw time and the shaders would derive the required data from these slots. The drawback of this model is that when the game engine needs a different set of resources, the bindings are useless and must be re-allocated.
DirectX 12 replaces the resource views with descriptor heaps and tables. A descriptor is a small object that contains information about one resource. Descriptors are grouped together to form descriptor tables, each of which is a contiguous range inside a descriptor heap.
Ideally, a descriptor table covers one range of descriptors for one or more resource types, while a heap contains all the descriptors required to render one or more frames. The GPU pipeline accesses this data by indexing into the descriptor table.
As the descriptor heap already contains the required descriptor data, when a different set of resources is needed the descriptor table is simply switched, which is much more efficient than rebinding the resources from scratch.
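A minimal sketch of that idea, assuming a table is nothing more than an (offset, length) pair into the heap. `DescriptorHeap`, `allocate_table`, and `read` are hypothetical names for illustration, not Direct3D calls.

```python
class DescriptorHeap:
    """Toy descriptor heap: one flat array of slots, written once up front."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.used = 0
        self.writes = 0  # counts descriptor writes, the "expensive" part

    def allocate_table(self, resources):
        """Write descriptors contiguously and return the table as (offset, length)."""
        start = self.used
        if start + len(resources) > len(self.slots):
            raise MemoryError("descriptor heap is full")
        for i, resource in enumerate(resources):
            self.slots[start + i] = f"descriptor({resource})"
            self.writes += 1
        self.used += len(resources)
        return (start, len(resources))

    def read(self, table, index):
        """What a shader does conceptually: index relative to the bound table."""
        start, length = table
        if not 0 <= index < length:
            raise IndexError("index outside the descriptor table")
        return self.slots[start + index]


heap = DescriptorHeap(capacity=8)
rock = heap.allocate_table(["rock_albedo", "rock_normal"])
water = heap.allocate_table(["water_albedo", "water_normal"])

bound = rock   # "binding" a table only selects an offset into the heap
first = heap.read(bound, 0)
bound = water  # switching resource sets writes no new descriptors
second = heap.read(bound, 0)
```

After both tables are created, `heap.writes` stays at 4 however many times `bound` is switched, which is the saving the paragraph above describes.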
Other features that come with DirectX 12 are:
DirectX Raytracing (DXR): This is essentially the API support for real-time ray-tracing that NVIDIA so lovingly calls RTX.
Variable-rate shading (VRS): Variable-rate shading, which NVIDIA markets as Adaptive Shading, allows the GPU to concentrate shading work on the areas of the screen that are more “visible” and change more from frame to frame. In a shooter, this would be the space around the cross-hair. In contrast, the region around the border of the screen is mostly out of focus and can be shaded at a coarser rate (to some degree).
Multi-GPU Support: DirectX 12 supports two types of multi-GPU setups, namely implicit and explicit. Implicit is essentially SLI/CrossFire and leaves the job to the vendor driver. Explicit is more interesting and lets the game engine control how the two GPUs function in parallel. This allows for better scaling and for mixing and matching different GPUs, even ones from different vendors (including your dGPU and iGPU).
Another major advantage is that with explicit multi-GPU the VRAM contents of the two GPUs don’t have to be mirrored, so in principle the memory can be pooled instead of duplicated. This and a ton of other features make DirectX 12 a major upgrade to the software side of PC gaming. It’ll take a while to leverage all these features, but some are already apparent.
Some DirectX 12 titles like Ashes of the Singularity and Sniper Elite 4 achieve excellent multi-GPU scaling. Likewise, a lot of older AMD GPUs see a healthy boost in titles that enable async compute. The gains in the case of GeForce cards are relatively smaller, as they already utilized most of the resources quite well thanks to excellent drivers.
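To put a rough number on the shading-rate idea from the feature list above, here is a back-of-the-envelope sketch. It assumes a centred rectangle shaded at full rate and everything else shaded once per coarse block; `shading_invocations` and its parameters are hypothetical, and real hardware applies rates per screen tile or per draw rather than with one clean rectangle.

```python
def shading_invocations(width, height, focus_fraction=0.5, coarse=2):
    """Approximate pixel-shader invocations for one frame.

    A centred rectangle spanning focus_fraction of each dimension is shaded
    once per pixel; the rest is shaded once per coarse x coarse block.
    """
    focus_pixels = int(width * focus_fraction) * int(height * focus_fraction)
    outer_pixels = width * height - focus_pixels
    return focus_pixels + outer_pixels // (coarse * coarse)


full = shading_invocations(1920, 1080, coarse=1)    # every pixel at full rate
reduced = shading_invocations(1920, 1080, coarse=2)
print(full, reduced, reduced / full)
```

Under these assumptions a 1920x1080 frame drops from 2,073,600 to 907,200 shader invocations, roughly 44% of the original work, while the centre of the screen keeps full quality.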