NVIDIA's RTX 40 Series RT Features Require Per Game Integration, DLSS 2->3 a Simple .dll File Replacement?

We recently spoke to NVIDIA’s technical marketing team about the advanced features of the next-gen Ada Lovelace architecture powering the RTX 4090 and got a few confirmations. For starters, it looks like it’ll be a while before the primary upgrades of the 3rd Gen RT Core come into play. One of the primary features of this fixed-function hardware is support for advanced micromaps, namely opacity and displacement micromaps.

The Opacity Micromap Engine on the 3rd Gen RT Cores optimizes BVH traversal and intersection (the first step in ray-tracing) by simplifying the evaluation of transparent textures/meshes. It leverages an opacity micromap, a virtual mesh of micro-triangles, each with an opacity state used to resolve ray intersections with non-opaque triangles.
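
To make the idea concrete, here is a minimal CPU-side sketch in Python of how an opacity micromap can resolve a ray hit. It is an illustration under stated assumptions, not NVIDIA's implementation: the row-major micro-triangle layout, the three-state `Opacity` enum, and the names `micro_triangle_index`, `resolve_hit`, and `alpha_test` are all hypothetical, and real hardware uses its own indexing and encoding. The point it shows is that only hits landing on an "unknown" micro-triangle need the expensive alpha lookup.

```python
from enum import Enum
from typing import Callable, Sequence


class Opacity(Enum):
    OPAQUE = "opaque"            # hit always counts, no shader needed
    TRANSPARENT = "transparent"  # hit never counts, no shader needed
    UNKNOWN = "unknown"          # must fall back to the alpha test


def micro_triangle_index(level: int, u: float, v: float) -> int:
    """Map barycentric coordinates (u, v) to a micro-triangle index.

    Assumption: a simple row-major layout chosen for readability.
    A triangle subdivided `level` times holds 4**level micro-triangles.
    """
    n = 2 ** level                      # subdivisions per edge
    row = min(int(v * n), n - 1)        # clamp hits on the far edge
    col = min(int(u * n), n - 1 - row)
    frac_u = u * n - col
    frac_v = v * n - row
    # Each grid cell holds a lower and an upper micro-triangle,
    # except the last cell in a row, which only has the lower one.
    upper = frac_u + frac_v > 1.0 and col < n - 1 - row
    # Rows before `row` contain 2*n*row - row*row micro-triangles in total.
    return 2 * n * row - row * row + 2 * col + int(upper)


def resolve_hit(
    states: Sequence[Opacity],
    level: int,
    u: float,
    v: float,
    alpha_test: Callable[[float, float], bool],
) -> bool:
    """Return True if the ray hit at (u, v) should be accepted."""
    state = states[micro_triangle_index(level, u, v)]
    if state is Opacity.OPAQUE:
        return True
    if state is Opacity.TRANSPARENT:
        return False
    # Only the ambiguous micro-triangles pay for the texture lookup.
    return alpha_test(u, v)
```

With a mostly opaque or mostly transparent alpha mask (foliage, fences), most hits in this model resolve from the stored state alone, which is where the traversal savings would come from.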

Like opacity micromaps, displaced micro-meshes are another way to optimize BVH structures and accelerate traversal for faster ray tracing. Where the former helps with highly detailed and/or transparent textures, the latter boosts performance with complex geometry and high-poly meshes without losing detail.

As you may have surmised, these micromaps must be generated explicitly in every game using extensions to the DXR and Vulkan RT APIs. The in-game assets, primarily meshes and alpha masks, need to be converted to the supported format using specific tools that NVIDIA will soon provide to developers.

We also asked about Shader Execution Reordering (SER), which allegedly boosts performance by up to 25% in advanced ray-tracing workloads. It does this by reordering threads so that those sharing the same resources are grouped into the same warps, reducing the stalls that divergent threads would otherwise cause. Like opacity and displacement micromaps, this, too, must be implemented per game by the developer. And like them, it will only work on the next-gen RTX 40 series GPUs.
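
The effect of reordering can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not the SER API: `shader_passes` and `reorder_by_shader` are hypothetical helpers, and the cost model (a warp runs one pass per distinct shader its threads need) is a deliberate simplification of how divergence hurts throughput. The 32-thread warp size is the one used by NVIDIA GPUs.

```python
from typing import List, Sequence

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def shader_passes(shader_ids: Sequence[int], warp_size: int = WARP_SIZE) -> int:
    """Count passes in a toy model where each warp runs once per
    distinct shader requested by its threads."""
    passes = 0
    for start in range(0, len(shader_ids), warp_size):
        passes += len(set(shader_ids[start:start + warp_size]))
    return passes


def reorder_by_shader(shader_ids: Sequence[int]) -> List[int]:
    """Return thread indices sorted so that threads needing the same
    shader sit next to each other (and so land in the same warps)."""
    return sorted(range(len(shader_ids)), key=lambda t: shader_ids[t])


# Hypothetical workload: 128 rays whose hits alternate between 4 shaders.
hits = [t % 4 for t in range(128)]
order = reorder_by_shader(hits)
coherent = [hits[t] for t in order]

before = shader_passes(hits)      # every warp sees all 4 shaders
after = shader_passes(coherent)   # each warp sees a single shader
```

In this model the incoherent ordering needs four passes per warp while the reordered one needs a single pass per warp. Real gains depend on the workload and on the cost of the reordering itself, which the sketch ignores.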

Lastly, there’s DLSS 3, the silver bullet that might win this round for NVIDIA (again). We already know that the DLSS 3 upgrade is relatively easy using the Streamline plugin. While we don’t know this for sure, NVIDIA hinted that a simple .dll file replacement and some edits to the graphics menu are the only two steps needed to enable it in a game that already supports DLSS 2. You can go through the interesting bits below:

Do the displacement and opacity maps require engine-level integration on behalf of the developer, like SER?
Yes, these features must be explicitly integrated into the game engine using extensions to DXR and Vulkan RT, and assets must be converted using specific tools for that purpose.
Is SER hardware dependent? If not, then will it be supported across existing RTX lineups?
We build SER specifically for Ada.
Is the DLSS 2->3 upgrade a simple .dll file replacement or more complex? If the latter, can you please elaborate?
DLSS 3 leverages the same integration points as DLSS 2 (color buffer, depth buffer, engine motion vectors, and output buffers) and NVIDIA Reflex, making upgrades from these existing SDKs easy via our DLSS 3 Streamline plugin.
NVIDIA