AMD’s Radeon RX 7900 XT is slated to land by the end of the year. Featuring a chiplet design with two graphics dies (GCDs) and one bridge interconnect paired with 256MB of Infinity Cache, the next-gen RDNA 3 flagship is reportedly going to be up to 3x faster than its predecessor. This info was shared by RedGamingTech the other day. As per the YouTuber’s sources, the FP32 compute throughput (in TFLOPs) is likely going to be thrice that of the RX 6900 XT, while the actual gaming performance uplift will be in the 2-2.5x range.
Going by these figures, the Radeon RX 7900 XT will have a single-precision throughput of around 75 TFLOPs, up from roughly 25 TFLOPs on the 6900 XT. Furthermore, it turns out that there won’t be a separate ML chiplet for AI upscaling. Rather, that hardware will likely be paired with the cache/bridge interconnect.
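The quoted TFLOPs figures follow from the standard FP32 formula (shaders x 2 ops per clock for a fused multiply-add x clock speed). A minimal sketch, where the ~2.44 GHz clock is an assumption back-solved from the article’s 75/25 TFLOPs numbers rather than a confirmed spec:

```python
# FP32 throughput estimate: shaders * 2 ops/clock (FMA) * clock (GHz).
# The 2.44 GHz clock is back-solved from the article's TFLOPs figures,
# not a confirmed specification.

def fp32_tflops(shaders: int, clock_ghz: float) -> float:
    """FP32 TFLOPs = shaders x 2 (fused multiply-add) x clock in GHz / 1000."""
    return shaders * 2 * clock_ghz / 1000

print(round(fp32_tflops(15360, 2.44), 1))  # ~75 TFLOPs (rumored RX 7900 XT)
print(round(fp32_tflops(5120, 2.44), 1))   # ~25 TFLOPs (RX 6900 XT, same assumed clock)
```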
The Radeon RX 7900 XT will pack up to a total of 15,360 shaders across 60 Work Group Processors (WGPs), paired with 16GB of GDDR6 memory across a 256-bit bus. Each Graphics Die (GCD) features three Shader Engines, each made up of two Shader Arrays. In turn, each Shader Array packs five WGPs containing eight SIMD units (vs four on RDNA 2). The two dies are connected by a bridge interconnect paired with 256MB of L3 “Infinity” Cache. According to the source, the GCDs will be fabbed on TSMC’s 5nm (N5) node while the MCD (the cache/bridge die) will be fabbed on the older 6nm (N6) node. Each die should come with a 128-bit bus (divided into eight controllers), resulting in an overall bus width of 256-bit and the same external bandwidth of 448GB/s as the RX 6800 XT/6900 XT.
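The rumored configuration is internally consistent, which a quick cross-check shows. All inputs below are the article’s rumored figures; the 14 Gbps GDDR6 pin speed is an assumption that reproduces the quoted 448GB/s:

```python
# Cross-check of the rumored Navi 31 layout described above.
gcds = 2            # graphics dies
shader_engines = 3  # per GCD
shader_arrays = 2   # per Shader Engine
wgps = 5            # per Shader Array
simds = 8           # per WGP (vs four on RDNA 2)
lanes = 32          # stream processors per SIMD (wave32)

total_wgps = gcds * shader_engines * shader_arrays * wgps
total_shaders = total_wgps * simds * lanes
print(total_wgps, total_shaders)  # 60 15360

# Bandwidth: 128-bit bus per die -> 256-bit total.
# 14 Gbps per pin is an assumption matching the quoted 448 GB/s.
bus_bits = gcds * 128
bandwidth_gbs = bus_bits * 14 / 8
print(bandwidth_gbs)  # 448.0
```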
AMD’s RDNA 3 graphics architecture is expected to get a major overhaul at the front-end, with redesigned Work Group Processors replacing Compute Units, or rather Dual Compute Units. On RDNA 1 and 2, the WGP was already the basic unit of workload scheduling (taking over from the CU on GCN/Vega), but it looks like the organization is going to change again with Navi 3x. Dual Compute Units are being discarded in favor of wider Work Group Processors, packing as many as 256 stream processors across eight 32-wide SIMDs. This means that the wave32 scheduling format will be retained, but the number of overall active waves will increase.
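The widening of the WGP comes straight from the SIMD count: with wave32 retained, doubling the SIMDs per WGP doubles the stream processors per WGP. A trivial sketch of that arithmetic, using only the figures from the paragraph above:

```python
# Stream processors per WGP = SIMDs per WGP * wave width (32 lanes).
def sps_per_wgp(simds: int, wave_width: int = 32) -> int:
    return simds * wave_width

print(sps_per_wgp(8))  # 256: rumored RDNA 3 WGP (eight 32-wide SIMDs)
print(sps_per_wgp(4))  # 128: RDNA 1/2 WGP (dual compute unit, four SIMDs)
```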