AMD will be significantly overhauling its next-gen graphics architecture, from the Compute Units right down to the process node. According to data mined from the latest Linux drivers, the RDNA 3 compute units will feature an unprecedented 128 cores per Compute Unit (CU) across four SIMDs. That is twice as many as existing RDNA/GCN designs, marking a significant shift in the Radeon scheduling/dispatch algorithm.
The SIMD32 and the Dual Compute Unit (DCU)/Work Group Processor (WGP) designs introduced with the 1st Gen RDNA core architecture will be retained. However, each CU and DCU/WGP will contain twice as many ALUs and SIMDs. We’re looking at x4 SIMDs per CU and x8 per WGP, and a total of 128 and 256 shaders per CU and WGP, respectively.
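The per-CU and per-WGP totals above follow directly from the SIMD counts. Here is a minimal Python sketch of that arithmetic, assuming every SIMD32 exposes 32 shader lanes (the function and constant names are ours, purely for illustration):

```python
# Illustrative arithmetic only; names are hypothetical, figures come from the leak.
SIMD_WIDTH = 32  # lanes (shaders) per SIMD32


def shaders(simd_count: int, simd_width: int = SIMD_WIDTH) -> int:
    """Total shader lanes for a block containing `simd_count` SIMDs."""
    return simd_count * simd_width


rdna1_cu = shaders(2)   # existing RDNA: 2 SIMD32s per CU
rdna3_cu = shaders(4)   # rumored RDNA 3: 4 SIMD32s per CU
rdna3_wgp = shaders(8)  # rumored RDNA 3: 8 SIMD32s per WGP

print(rdna1_cu, rdna3_cu, rdna3_wgp)  # 64 128 256
```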
The primary philosophy of the RDNA graphics architecture was the ability to do more with less. This is why the Radeon RX 5700 XT was competitive with the Radeon VII despite packing far fewer cores. This was achieved by switching from the longer (but less efficient) 64-thread wavefront model to a shorter wave 32 dispatch. Overall, each CU was still 64 ALUs wide, but the SIMDs (the smallest unit of scheduling) were fattened up to 32 cores, up from 16 in GCN. The basic unit of scheduling was changed from the CU to the WGP/DCU, making up for the reduced thread count per dispatch.
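To see why the narrower wavefront on a wider SIMD helps, consider a deliberately simplified model in which a wavefront needs one cycle per SIMD-width slice of its threads. This is only a sketch: it ignores memory latency, instruction dependencies and the real scheduler, and the function name is hypothetical.

```python
import math


def issue_cycles(wave_size: int, simd_width: int) -> int:
    """Cycles to push one instruction of a wavefront through a SIMD
    (simplified model: one pass per `simd_width` threads)."""
    return math.ceil(wave_size / simd_width)


gcn_wave64 = issue_cycles(64, 16)   # GCN: 64 threads on a 16-wide SIMD
rdna_wave32 = issue_cycles(32, 32)  # RDNA: 32 threads on a 32-wide SIMD
rdna_wave64 = issue_cycles(64, 32)  # RDNA running in wave 64 mode

print(gcn_wave64, rdna_wave32, rdna_wave64)  # 4 1 2
```

Under this model, a GCN wavefront occupies its SIMD for four cycles per instruction, while an RDNA wave 32 issues in a single cycle.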
With RDNA 3, we still have 32-core SIMDs and wave 32 scheduling, but there are twice as many both on the CU and WGP front. This should retain the efficiency of the RDNA core but allow for significantly higher core counts and higher per-core bandwidth by pairing each WGP with more cache, texture data, and registers.
According to Greymon55, the Radeon RX 7900 XT will have a boost clock exceeding the 3GHz mark. This isn’t surprising as AMD was already able to push 2.5GHz with the 6900 XT LC, so a further 500MHz from a major node upgrade is all but expected.
With a 3GHz boost, the Radeon RX 7900 XT would have a single-precision compute throughput of roughly 92 TFLOPS. This assumes the GPU features over 15K FP32 ALUs across two GCDs, paired with a 256-bit memory bus. Here’s our previous coverage on the RX 7900 XT:
The Navi 31 GPU powering the Radeon RX 7900 XT will consist of seven dies: x2 Graphics Compute Dies or GCDs fabbed on TSMC’s 5nm process, x4 Memory Complex Dies or MCDs fabbed on the 6nm node, as well as the I/O die. According to MLID, the 5nm dies are internally referred to as the GCDs while the 6nm dies are called the MCDs. I’m not sure what will be 3D stacked upon what but I’d bet on the MCDs being placed on the GCDs.
It’s unclear whether AMD will go all out with the Infinity Cache and equip the Navi 31-powered RX 7900 XT with 512MB of L3 or a more reasonable 256 MB. The Radeon RX 6900 XT, in comparison, comes with just 128MB of IC. It’s highly probable that the LLC blocks will be paired with the memory controllers across multiple 6nm MCDs. At least, that’s what I expect. Finally, there’s the matter of the TGP. Unlike NVIDIA, AMD is still aiming for a very respectable 450W via two 8-pin or a single 12-pin power connector.
Navi 32, which will power the Radeon RX 7800 XT, will also feature a chiplet design, albeit with fewer dies. You can expect two GCDs at the very least, with two to five MCDs. 3D stacking isn’t confirmed for this SKU, but if the layout is the same as Navi 31, you can expect the MCDs to sit atop the GCDs. In terms of performance, picture a 50-70% uplift over the Radeon RX 6800 XT, and a 300-350W TGP.
Both the Radeon RX 7900 XT (Navi 31) and the RX 7800 XT (Navi 32) will feature x16 PCIe gen 5 lanes. The former will launch in the holiday season following the Navi 33-powered RX 7700 XT, and the latter is slated for an early 2023 release.
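For reference, the 92 TFLOPS figure quoted earlier can be reproduced with the usual peak-throughput formula: ALUs × FLOPs per ALU per clock × clock speed. The sketch below assumes two FP32 operations per ALU per clock (one fused multiply-add) and takes "over 15K" to mean 15,360 ALUs; both values are our assumptions rather than confirmed specifications, and the function name is hypothetical.

```python
def fp32_tflops(alus: int, clock_ghz: float, flops_per_alu_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Defaults to 2 FLOPs per ALU per clock (one fused multiply-add).
    """
    return alus * flops_per_alu_per_clock * clock_ghz / 1000


ASSUMED_ALUS = 15_360    # assumption: our reading of "over 15K" FP32 ALUs
ASSUMED_BOOST_GHZ = 3.0  # rumored boost clock

navi31_peak = fp32_tflops(ASSUMED_ALUS, ASSUMED_BOOST_GHZ)
print(f"{navi31_peak:.2f} TFLOPS")  # 92.16 TFLOPS
```

This is a theoretical peak; sustained throughput in games depends on clocks actually held, occupancy and memory bandwidth.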