One of the primary advantages of NVIDIA’s RTX 30 series “Ampere” GPUs is memory bandwidth. The RTX 3080 delivers a massive 760 GB/s thanks to a wider 320-bit bus and GDDR6X memory. However, AMD’s RDNA 2 graphics cards may just avoid the drawbacks of lower memory bandwidth with a better cache hierarchy.
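As a rough check on the 760 GB/s figure, peak memory bandwidth is the bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below assumes a 19 Gbps per-pin rate for the RTX 3080's GDDR6X, a figure that does not appear in the text above; the function name `peak_bandwidth_gb_s` is only illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    bus_width_bits: width of the memory bus in bits (320 for the RTX 3080).
    data_rate_gbps: per-pin data rate in Gbit/s (assumed 19 for its GDDR6X).
    """
    # Each pin moves data_rate_gbps gigabits per second; divide by 8 for bytes.
    return bus_width_bits * data_rate_gbps / 8


rtx_3080 = peak_bandwidth_gb_s(320, 19.0)
print(f"RTX 3080 peak bandwidth: {rtx_3080} GB/s")
```

This is a theoretical peak only; sustained bandwidth in real workloads is lower.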
As per tests done by Twitter user vyor, the RX 5700, based on 1st Gen Navi, loses only around 20-25% performance on average when the memory speeds are halved. This can be explained by the cache hierarchy of the RDNA GPUs:
- For starters, the two CUs in a DCU share the local data cache (L0) for better cache hit rates and lower latency. In comparison, the older GCN and rival GeForce cards have separate data caches for each CU or SM.
- Furthermore, Navi includes another level of caching in the form of an L1 cache, which services the DCUs in a shader array.
- Lastly, the L2 cache has also been roughly doubled with RDNA 2, which further improves hit rates and masks the lower memory bandwidth.
- Even the GPRs are much wider in Navi. Each vector general-purpose register (vGPR) contains 32 lanes that are 32 bits wide (for FP32), and a SIMD contains a total of 1,024 vGPRs, four times as many registers as in GCN.
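The register figures in the last bullet can be turned into a capacity with simple arithmetic. The sketch below uses the Navi numbers from the text (1,024 vGPRs of 32 lanes, 32 bits per lane) and, as an assumption not stated above, 256 vGPRs of 64 lanes per SIMD for GCN. The helper `vgpr_file_bytes` is a hypothetical name used only for illustration.

```python
BYTES_PER_LANE = 4  # one 32-bit (FP32) value per lane


def vgpr_file_bytes(num_vgprs: int, lanes: int) -> int:
    """Size of one SIMD's vector register file in bytes."""
    return num_vgprs * lanes * BYTES_PER_LANE


# Navi figures come from the text above.
navi_bytes = vgpr_file_bytes(num_vgprs=1024, lanes=32)
# GCN figures are an assumption: 256 vGPRs of 64 lanes per SIMD.
gcn_bytes = vgpr_file_bytes(num_vgprs=256, lanes=64)

print(f"Navi SIMD register file: {navi_bytes // 1024} KiB")
print(f"GCN SIMD register file:  {gcn_bytes // 1024} KiB")
print(f"Register count ratio:    {1024 // 256}x")
print(f"Capacity ratio:          {navi_bytes // gcn_bytes}x")
```

Under these assumptions, the fourfold increase refers to the number of registers. Because each Navi register has half as many lanes as a GCN register, the capacity per SIMD works out to roughly double rather than quadruple.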