
NVIDIA RTX 4090 Final Specifications Leak Out: 16,128 Cores, 24GB GDDR6X, 450W, 2x Faster than the RTX 3090 [Report]

The specifications of NVIDIA’s Lovelace flagship have reportedly been finalized. The GeForce RTX 4090, built on the AD102 die, will feature a total of 16,128 FP32 cores across 126 SMs, 63 TPCs, and 11 GPCs. This massive die will be paired with 24GB of 21Gbps GDDR6X memory across a 384-bit bus, the same configuration as the RTX 3090 Ti. Lovelace will likely borrow some features from Hopper, especially Thread Block Memory Sharing, which should drastically boost SM utilization, while the massive 96MB of L2 cache should do the same for effective bandwidth.
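As a quick sanity check on these figures, the core count and peak memory bandwidth can be derived from the numbers quoted above. This is a back-of-the-envelope sketch: the 128 FP32 cores per SM is an assumption carried over from Ampere (it is consistent with 16,128 cores across 126 SMs), and the function names are purely illustrative.

```python
# Back-of-the-envelope check of the leaked RTX 4090 figures.
# Assumption: 128 FP32 cores per SM, as on Ampere.
FP32_CORES_PER_SM = 128


def fp32_cores(sm_count: int, cores_per_sm: int = FP32_CORES_PER_SM) -> int:
    """Total FP32 cores for a given number of enabled SMs."""
    return sm_count * cores_per_sm


def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) x bus width (bits) / 8."""
    return data_rate_gbps * bus_width_bits / 8


cores = fp32_cores(126)                     # 126 SMs from the leak
sms_per_tpc = 126 // 63                     # 63 TPCs from the leak
bandwidth = memory_bandwidth_gbs(21, 384)   # 21Gbps GDDR6X, 384-bit bus

print(f"FP32 cores: {cores}")
print(f"SMs per TPC: {sms_per_tpc}")
print(f"Peak memory bandwidth: {bandwidth} GB/s")
```

The result is roughly 1TB/s of raw bandwidth, which matches the RTX 3090 Ti and hints at why a much larger L2 cache would be needed to feed the additional cores.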

In case you missed the Hopper whitepaper, here’s a short primer on Thread Block Clusters and Distributed Shared Memory (DSM). To make scheduling on GPUs with over 100 SMs more efficient, Hopper and Lovelace will group every two thread blocks in a GPC into a cluster. The primary aim of Thread Block Clusters is to improve multithreading and SM utilization. These clusters run concurrently across the SMs in a GPC.

Thanks to an SM-to-SM network between the two thread blocks in a cluster, data can be shared efficiently between them. This is going to be one of the key features promoting scalability on Hopper and Lovelace, which is essential when you’re increasing the core/ALU count by over 50%.
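To make the idea a little more concrete, here is a toy Python model of the grouping described above. It is not CUDA code and says nothing about how the hardware actually schedules work: the `ThreadBlock` class, the list standing in for shared memory, and the helper names are all hypothetical, and the cluster size of two simply follows the description in this article.

```python
from dataclasses import dataclass, field

# Assumption taken from the article: two thread blocks per cluster.
CLUSTER_SIZE = 2


@dataclass
class ThreadBlock:
    block_id: int
    # Stand-in for the block's on-SM shared memory.
    shared_mem: list = field(default_factory=list)

    @property
    def cluster_id(self) -> int:
        # Consecutive blocks land in the same cluster.
        return self.block_id // CLUSTER_SIZE

    @property
    def rank_in_cluster(self) -> int:
        return self.block_id % CLUSTER_SIZE


def build_clusters(blocks):
    """Group blocks by cluster id."""
    clusters = {}
    for block in blocks:
        clusters.setdefault(block.cluster_id, []).append(block)
    return clusters


def cluster_sum(cluster):
    """Sum over the shared memory of every block in the cluster.

    In the DSM picture, each block reads its peer's shared memory over the
    SM-to-SM network instead of round-tripping through global memory.
    """
    return sum(sum(peer.shared_mem) for peer in cluster)


blocks = [ThreadBlock(i, shared_mem=[i, i + 10]) for i in range(4)]
clusters = build_clusters(blocks)
totals = {cid: cluster_sum(members) for cid, members in clusters.items()}
print(totals)
```

The point of the model is only that blocks 0 and 1 (and likewise 2 and 3) can combine their partial results without leaving the cluster.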

Lastly, let’s not forget that the RTX 4090 won’t feature the full-fat AD102 die, yet it is still expected to offer twice the performance of its predecessor. The TGP will reportedly end up at “just” 450W, a far cry from the previously rumored 600-900W abominations. The RTX 4090 Ti, which may launch later in the cycle with the fully enabled AD102 die, is more likely to come with a 600W TGP.
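For a rough sense of what these power figures imply for efficiency, the sketch below divides the rumored 2x performance uplift by the increase in TGP. It assumes the RTX 3090’s 350W TGP as the baseline and takes the 2x claim at face value; the 600W case is a hypothetical comparison, not a reported configuration for the RTX 4090.

```python
def perf_per_watt_gain(perf_ratio: float, new_tgp_w: float, old_tgp_w: float) -> float:
    """Relative performance-per-watt: performance uplift / power uplift."""
    return perf_ratio / (new_tgp_w / old_tgp_w)


RTX_3090_TGP_W = 350   # assumed baseline
PERF_RATIO = 2.0       # "2x faster" claim from the report

gain_450 = perf_per_watt_gain(PERF_RATIO, 450, RTX_3090_TGP_W)
gain_600 = perf_per_watt_gain(PERF_RATIO, 600, RTX_3090_TGP_W)  # hypothetical

print(f"450W: {gain_450:.2f}x perf/W")
print(f"600W: {gain_600:.2f}x perf/W")
```

At 450W, a 2x uplift would work out to roughly a 1.5x gain in performance per watt, whereas the same uplift at 600W would leave efficiency only marginally ahead of Ampere.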

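| GPU | GA102 | AD102 | RTX 4090 | AD103 | RTX 4080 | AD104 | RTX 4070 |
|---|---|---|---|---|---|---|---|
| Arch | Ampere | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ada Lovelace |
| Process | Samsung 8nm LPP | TSMC 5nm | TSMC 5nm | TSMC 5nm | TSMC 5nm | TSMC 5nm | TSMC 5nm |
| GPC | 7 | 12 | 11 | 7 | 7 | 5 | 5 |
| TPC | 42 | 72 | 64 | 42 | 40 | 30 | 30 |
| SMs | 84 | 144 | 128 | 84 | 80 | 60 | 60 |
| Shaders | 10,752 | 18,432 | 16,384 | 10,752 | 9,728 | 7,680 | 7,680 |
| Throughput (TP) | 37.6 TFLOPs | ~100 TFLOPs? | 83 TFLOPs | ~50 TFLOPs | 47 TFLOPs? | ~35 TFLOPs | 35 TFLOPs? |
| Memory | 24GB GDDR6X | 48GB GDDR6X | 24GB GDDR6X | 16GB GDDR6X | 16GB GDDR6X | 12GB GDDR6X | 12GB GDDR6X |
| L2 Cache | 6MB | 96MB | 72MB | 64MB | 64MB | 48MB | 48MB |
| Bus Width | 384-bit | 384-bit | 384-bit | 256-bit | 256-bit | 160/192-bit | 160/192-bit |
| TGP | 350W | 600W | 450W | 450W | 285-340W | 300W | 285W |
| Launch | Sep 2020 | Sept 22? | Sept 22? | Sept 22? | Sept 22? | Q1 2023? | Q1 2023? |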
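The throughput figures in the table can be cross-checked against the shader counts. Peak FP32 throughput is conventionally 2 FLOPs (one fused multiply-add) per shader per clock, so each pair of values implies a boost clock. The sketch below assumes the GA102 entry of 37.6 is in TFLOPs like the others; the resulting clocks are derived estimates, not leaked specifications.

```python
def implied_clock_ghz(tflops: float, shaders: int) -> float:
    """Clock (GHz) implied by peak FP32 TFLOPs = 2 x shaders x clock."""
    return tflops * 1e3 / (2 * shaders)


# (TFLOPs, shader count) pairs taken from the table above.
table = {
    "GA102": (37.6, 10752),
    "RTX 4090": (83, 16384),
    "RTX 4080": (47, 9728),
    "RTX 4070": (35, 7680),
}

clocks = {gpu: implied_clock_ghz(tp, shaders) for gpu, (tp, shaders) in table.items()}

for gpu, ghz in clocks.items():
    print(f"{gpu}: ~{ghz:.2f} GHz")
```

If the table is accurate, the Lovelace parts would be clocked well above 2 GHz, a sizeable jump over the roughly 1.75 GHz implied for GA102.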

Source:

Areej

Computer hardware enthusiast, PC gamer, and almost an engineer. Former co-founder of Techquila (2017-2019), a fairly successful tech outlet. Been working on Hardware Times since 2019, an outlet dedicated to computer hardware and its applications.