This is somewhat alleviated by using a deferred context, but even then, there’s ultimately only one stream of commands leading to the GPU at the final stage. DirectX 12 introduces a new model built on command lists that can be recorded and executed independently, which improves multi-threading. The workload is divided into smaller sets of commands that require different resources, so they can execute simultaneously. This is also how Asynchronous Compute works: compute and graphics commands are placed in separate queues and executed concurrently.
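To make the command list idea more concrete, here is a minimal sketch in plain C++. It is a toy model, not the Direct3D 12 API: `CommandList`, `recordChunk`, and `buildFrame` are hypothetical names, and the "commands" are just strings. It only illustrates that each thread can record into its own list without locking, and that the finished lists are then handed to a queue in a fixed order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Toy model only: hypothetical stand-ins, not the real D3D12 interfaces.
struct CommandList {
    std::vector<std::string> commands;
    void record(const std::string& cmd) { commands.push_back(cmd); }
};

// Each worker records into its own list, so no locking is needed while recording.
inline void recordChunk(CommandList& list, int firstObject, int objectCount) {
    for (int i = 0; i < objectCount; ++i) {
        list.record("draw object " + std::to_string(firstObject + i));
    }
}

// Record `threadCount` command lists in parallel, then submit them in a fixed order.
inline std::vector<std::string> buildFrame(int objectCount, int threadCount) {
    assert(objectCount >= 0 && threadCount > 0);
    std::vector<CommandList> lists(static_cast<std::size_t>(threadCount));
    std::vector<std::thread> workers;
    const int perThread = (objectCount + threadCount - 1) / threadCount;  // round up

    for (int t = 0; t < threadCount; ++t) {
        const int first = t * perThread;
        const int count = std::max(0, std::min(perThread, objectCount - first));
        workers.emplace_back(recordChunk,
                             std::ref(lists[static_cast<std::size_t>(t)]),
                             first, count);
    }
    for (auto& worker : workers) worker.join();

    // "Submission": the queue consumes the finished lists one after another.
    std::vector<std::string> queue;
    for (const auto& list : lists) {
        queue.insert(queue.end(), list.commands.begin(), list.commands.end());
    }
    return queue;
}
```

Calling `buildFrame(10, 4)` records ten draw commands across four threads and returns them in submission order. In a real engine, the recording step would use the API's own command list objects, and a second queue could carry compute work alongside the graphics queue.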
In DirectX 11, resource binding was highly abstract and convenient, but it did not make the best use of the hardware and often left components idle. Most game engines would use “view objects” to describe resources and bind them to various shader stages of the GPU pipeline.
The objects would be bound to slots along the pipeline at draw time, and the shaders would read the required data from these slots. The drawback of this model is that when the game engine needs a different set of resources, the existing bindings are useless and the resources must be bound all over again.
DirectX 12 replaces the resource views with descriptor heaps and tables. A descriptor is a small object that contains information about one resource. These are grouped together to form descriptor tables that are stored in a heap.
Ideally, a descriptor table stores information about one type of resource, while a heap contains all the tables required to render one or more frames. The GPU pipeline accesses this data by referencing the descriptor table index.
As the descriptor heap already contains the required descriptor data, in case a different set of resources is needed, the descriptor table is switched, which is much more efficient than rebinding the resources from scratch.
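The difference between rebinding and switching tables can be sketched with a small toy model. The types below (`Descriptor`, `DescriptorTable`, `DescriptorHeap`) are hypothetical simplifications, not the actual Direct3D 12 interfaces: a descriptor is reduced to a resource name, and a table is assumed to be a contiguous range inside the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not the real D3D12 interfaces.
struct Descriptor {
    std::string resourceName;  // stands in for the resource's address and format
};

// A table is modeled as a contiguous range inside the heap.
struct DescriptorTable {
    std::size_t offset;
    std::size_t count;
};

struct DescriptorHeap {
    std::vector<Descriptor> slots;

    // Write descriptors once, up front, and return the range they occupy.
    DescriptorTable addTable(const std::vector<std::string>& resources) {
        DescriptorTable table{slots.size(), resources.size()};
        for (const auto& name : resources) slots.push_back(Descriptor{name});
        return table;
    }

    // What a shader would read at `index` within the currently bound table.
    const Descriptor& lookup(const DescriptorTable& table, std::size_t index) const {
        assert(index < table.count);
        return slots[table.offset + index];
    }
};
```

With a `terrain` table and a `character` table both written into the heap ahead of time, moving from one set of resources to the other is just `bound = character;`, which changes an offset. Nothing in the heap is rewritten, and that is the efficiency gain the text describes.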
Other features that come with DirectX 12 are:
DirectX Raytracing (DXR): This is essentially the API support for real-time ray-tracing that NVIDIA so lovingly calls RTX.
Variable Rate Shading: Variable Rate Shading allows the GPU to focus on areas of the screen that are more “visible” and affected per frame. In a shooter, this would be the space around the cross-hair. In contrast, the region near the screen’s border is mostly out of focus and can be ignored (to some degree).
It allows the developers to focus more on the areas that actually affect the apparent visual quality (the center of the frame in most cases) while reducing the shading in the peripheries.
VRS is of two types: Content Adaptive Shading and Motion Adaptive Shading:
CAS allows each 16×16 screen tile to be shaded at its own rate (tiled rendering), so the GPU can increase the shading rate in regions that stand out while reducing it in the rest.
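Motion Adaptive Shading adjusts the shading rate according to how quickly a region changes on screen. Regions that rush past from frame to frame are perceived as blurred anyway, so they can be shaded at a lower rate, while regions that stay relatively stable in the view keep the full rate. In the case of a racing game, the car, which stays near the center of the frame, will get full shading, while the scenery and off-road regions flying past will be given reduced priority.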
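As a rough illustration of per-tile shading rates, the sketch below builds a "rate image" with one entry per 16×16 tile and picks a coarser rate the farther a tile is from the screen center. Everything here is an assumption made for the example: the `ShadingRate` values, the `buildRateImage` function, and the 0.5 and 1.0 distance thresholds. A content-adaptive or motion-adaptive scheme would fill the same kind of image from frame analysis rather than from distance alone.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy model only: names and thresholds are illustrative assumptions.
enum class ShadingRate { Full1x1, Half2x2, Quarter4x4 };

struct RateImage {
    int tilesX = 0;
    int tilesY = 0;
    std::vector<ShadingRate> rates;  // one entry per screen tile, row-major

    ShadingRate at(int tx, int ty) const {
        assert(tx >= 0 && tx < tilesX && ty >= 0 && ty < tilesY);
        return rates[static_cast<std::size_t>(ty * tilesX + tx)];
    }
};

// Assign coarser shading rates to tiles farther from the screen center.
inline RateImage buildRateImage(int width, int height, int tileSize = 16) {
    assert(width > 0 && height > 0 && tileSize > 0);
    RateImage image;
    image.tilesX = (width + tileSize - 1) / tileSize;   // round up
    image.tilesY = (height + tileSize - 1) / tileSize;  // round up

    for (int ty = 0; ty < image.tilesY; ++ty) {
        for (int tx = 0; tx < image.tilesX; ++tx) {
            // Distance of the tile center from the screen center, normalized per axis.
            const double dx = ((tx + 0.5) * tileSize - width / 2.0) / (width / 2.0);
            const double dy = ((ty + 0.5) * tileSize - height / 2.0) / (height / 2.0);
            const double distance = std::sqrt(dx * dx + dy * dy);

            ShadingRate rate = ShadingRate::Quarter4x4;
            if (distance < 0.5) rate = ShadingRate::Full1x1;
            else if (distance < 1.0) rate = ShadingRate::Half2x2;
            image.rates.push_back(rate);
        }
    }
    return image;
}
```

For a 1920×1080 frame this yields a 120×68 grid of tiles, with full-rate shading around the center (where the cross-hair would sit in a shooter) and the coarsest rate in the corners.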
Multi-GPU Support: DirectX 12 supports two types of multi-GPU operation, namely implicit and explicit. Implicit is essentially SLI/CrossFire and leaves the job to the vendor driver. Explicit is more interesting and lets the game engine control how the two GPUs function in parallel. This allows for better scaling and for mixing and matching different GPUs, even ones from different vendors (including your dGPU and iGPU).
Another major advantage is that the contents of the two GPUs’ VRAM don’t have to be mirrored, so the memory can be pooled, potentially doubling the usable video memory. This and many other features make DirectX 12 a major upgrade to the software side of PC gaming.
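One way an engine might use explicit multi-GPU control is to split each frame between the GPUs in proportion to their speed. The sketch below is only an assumption about how such a split could look: `GpuInfo`, `RowRange`, `splitFrame`, and the `relativeSpeed` figure are hypothetical, and real engines may prefer other schemes such as alternating whole frames.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: hypothetical types for a split-frame scheme.
struct GpuInfo {
    std::string name;
    double relativeSpeed;  // assumed to be measured or estimated by the engine
};

struct RowRange {
    int firstRow;
    int rowCount;
};

// Give each GPU a band of scanlines proportional to its relative speed.
inline std::vector<RowRange> splitFrame(int height, const std::vector<GpuInfo>& gpus) {
    assert(height > 0 && !gpus.empty());
    double totalSpeed = 0.0;
    for (const auto& gpu : gpus) {
        assert(gpu.relativeSpeed > 0.0);
        totalSpeed += gpu.relativeSpeed;
    }

    std::vector<RowRange> ranges;
    int nextRow = 0;
    for (std::size_t i = 0; i < gpus.size(); ++i) {
        int rows = static_cast<int>(height * (gpus[i].relativeSpeed / totalSpeed));
        if (i + 1 == gpus.size()) rows = height - nextRow;  // last GPU takes the remainder
        ranges.push_back(RowRange{nextRow, rows});
        nextRow += rows;
    }
    return ranges;
}
```

With a discrete GPU assumed to be three times as fast as the integrated one, a 1080-row frame is split into 810 rows for the dGPU and 270 rows for the iGPU.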
DirectX 12 Ultimate: How is it Different from DirectX 12?
DirectX 12 Ultimate is an incremental upgrade over the existing DirectX 12 (tier 1.1). Its core advantage is cross-platform support: both the next-gen Xbox Series X and the latest PC games will leverage it. This not only simplifies cross-platform porting but also makes it easier for developers to optimize their games for the latest hardware.
By the time the Xbox Series X arrived last year, game developers had already had enough time with hardware using the same graphics API (NVIDIA’s Turing), simplifying the porting and optimization process. At the same time, this also allows for better utilization of the latest PC hardware, thereby improving performance. All in all, it’s another step by Microsoft to unify the Xbox and PC gaming platforms.
It also introduces DirectX Raytracing 1.1, Sampler Feedback, Mesh Shaders, and Variable Rate Shading. The last two were already supported by NVIDIA’s RTX Turing GPUs (and are explained above), but they should now be adopted more widely by developers in newer games.
Texture Sampler Feedback: TSF is something MS is really emphasizing. Simply put, it keeps track of which textures (MIP maps) are actually displayed in the game and which are not. Consequently, the unused ones can be evicted from memory, making VRAM usage up to 2.5x more efficient overall. In the above image, you can see that on the right (without TSF), the entire texture resources for the globe are loaded into the memory. With TSF on the left, only the part that’s actually visible on the screen is kept while the unused bits are removed, thereby saving valuable memory.
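The bookkeeping behind this can be sketched in a few lines. This is a deliberately simplified model, not the actual sampler feedback API: `StreamedTexture` and its methods are hypothetical, and feedback is tracked for a whole texture rather than per region. It assumes mip 0 is the most detailed level and that the coarsest mip always stays resident as a fallback.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Toy model only: hypothetical type, tracks feedback per texture, not per region.
struct StreamedTexture {
    int mipCount;
    std::set<int> residentMips;  // mip 0 is the most detailed level
    int finestMipSampled;        // equals mipCount when nothing was sampled

    explicit StreamedTexture(int mips) : mipCount(mips), finestMipSampled(mips) {
        assert(mips > 0);
        for (int m = 0; m < mips; ++m) residentMips.insert(m);
    }

    // Reset the feedback at the start of a frame.
    void beginFrame() { finestMipSampled = mipCount; }

    // Called whenever the renderer samples a given mip level.
    void recordSample(int mip) {
        assert(mip >= 0 && mip < mipCount);
        finestMipSampled = std::min(finestMipSampled, mip);
    }

    // Evict every mip more detailed than anything sampled; returns how many were evicted.
    int evictUnused() {
        const int keepFrom = std::min(finestMipSampled, mipCount - 1);
        int evicted = 0;
        for (int m = 0; m < keepFrom; ++m) {
            evicted += static_cast<int>(residentMips.erase(m));
        }
        return evicted;
    }
};
```

If a ten-mip texture is only ever sampled at mips 4 and 6 during a frame, `evictUnused()` drops mips 0 through 3, which are also the largest ones, and keeps the remaining six.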
This can be done across frames as well (temporally). In a relatively static image, objects in the distance can reuse shading over multiple frames, for example, every two to four frames or even longer. The graphics performance saved can be used to increase the quality of nearby objects or places that have a more apparent impact on quality.
DXR 1.1 is a minor upgrade over the existing 1.0 version:
- Raytracing is now fully GPU controlled and no longer needs the CPU to issue the calls that launch it, reducing the CPU overhead and improving performance.
- New raytracing shaders can be loaded as and when needed, depending upon the player’s location in the game world.
- Inline raytracing is one of the core additions to DirectX 12 Ultimate. It gives developers more control over the raytracing process. It’s available at any stage of the rendering pipeline and is feasible in cases where the shading complexity is minimal.