NVIDIA’s next-gen Ada Lovelace GPUs will “easily” offer twice the performance of their Ampere predecessors. According to @kopite7kimi, the RTX 4090 won’t have any difficulty beating the RTX 3090 by that margin, even with tame core clocks and a partially disabled AD102 die. This means that the next-gen GeForce flagship should be roughly twice as fast as even the RTX 3090 Ti. A 4090 Ti, if released at all, would arrive near the end of the product cycle with the full AD102 die and higher boost clocks.
The AD102 die will feature 12 GPCs, roughly 70% more than the GA102 in the RTX 3090/3090 Ti and 50% more than the GA100/H100. The combined FP32+INT32 ALU (core) count per SM is rumored to grow by 50% over Ampere, resulting in a 33% boost in warp and thread counts. The ROP count per GPC is also slated to double, while the L1 cache will be extended to 192KB (+50%). Finally, the L2 cache is set to get a massive upgrade, going from just 6MB on the RTX 3090/3090 Ti to 96MB on the RTX 4090/4090 Ti, a 16x increase.
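For readers who want to check the percentages quoted above, here is a minimal Python sketch of the arithmetic. The AD102 figures are the rumored ones from this article. The Ampere baselines for GPC count (7 on GA102, 8 on GA100) and L1 size (128KB per SM) are not stated in the leak and are filled in here as assumptions. The helper name `pct_increase` is purely illustrative.

```python
def pct_increase(old: float, new: float) -> float:
    """Return the percentage increase when going from `old` to `new`."""
    return (new - old) / old * 100.0


# (baseline, rumored AD102 value). Baselines for GPCs and L1 are assumptions;
# the L2 figures (6MB -> 96MB) are the ones quoted in the article.
comparisons = {
    "GPCs vs GA102": (7, 12),
    "GPCs vs GA100": (8, 12),
    "L1 cache per SM (KB)": (128, 192),
    "L2 cache (MB)": (6, 96),
}

for label, (old, new) in comparisons.items():
    gain = pct_increase(old, new)
    print(f"{label}: {old} -> {new} ({gain:+.0f}%, {new / old:.1f}x)")
```

Under these assumed baselines, the GPC and L1 numbers line up with the rounded percentages in the rumor, and the L2 jump works out to 16x.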
The sheer increase in ALUs and cache alone shows how NVIDIA could achieve the 2x performance gain over Ampere. Add a finer process node (TSMC N4), microarchitecture improvements, and higher boost clocks to the mix, and the RTX 4090 might offer the largest generational jump in performance we’ve seen in quite a while.