AMD Radeon RX 7900 XTX Packs 6,144 Cores (6900 XT has 5,120); AI Cores Not the Same as NVIDIA's Tensor Units

Like earlier generations, the rumor mill has spoiled another Radeon lineup courtesy of unrealistic performance expectations. As per the official figures laid out by AMD, the RX 7900 XTX is just 50-60% faster than the 6950 XT in ray-traced workloads, a rather disappointing upgrade considering the latter’s RT grunt. And these are the first-party benchmarks. The third-party numbers are bound to be worse.

In addition, the Radeon RX 7900 XTX features only 6,144 stream processors, a mild increment over the 5,120 found on the RX 6900 XT. All the rumors before the reveal had claimed a sheer count of twice as much. While the next-gen Compute Units are double pumped or composed of “dual-issue” SIMDs, they won’t be as effective as physical cores. This is similar to the Ada SM, where the FP32 throughput is twice as much as INT32. Ergo, the RDNA 3 CUs may be twice as fast in specific workloads; the average improvement will be lower.

With the RDNA architecture, AMD introduced the Work Group Processor (WGP) as a combination of two Compute Units for better use of resources and improved multi-threading. The latest microarchitecture undoes this change, leaving the additional resources of the WGP for a single CU with a 2x issue rate.

Additionally, the Dedicated AI Accelerators added to RDNA 3 aren’t quite as capable as NVIDIA’s Tensor cores. The fixed function accelerators on the latter are much wider, suited for matrix multiplication and super-fast mixed precision compute. Meanwhile, AMD’s new design doubles BF16 execution by only 2x over RDNA 2. NVIDIA’s Tensor cores have been doing that for a while, with each successive generation improving it further. Irrespective, this should be sufficient to boost neural net-based upscalers using DP4a like XeSS.