NVIDIA May Announce its Next-Gen Ada Lovelace GPU Architecture at GTC 2022 (21st March)

NVIDIA is expected to launch its next-gen RTX 40 series graphics cards later this year (most likely September 2022). Based on the Ada Lovelace microarchitecture and TSMC’s N5 (5nm) process node, these GPUs are slated to offer more than twice the performance of the preceding RTX 30 “Ampere” family. All this will, of course, come at a cost, not only for consumers but for NVIDIA as well. According to the company’s Q4 earnings report, by the end of the quarter Team Green had spent up to $9 billion on inventory purchases and prepayments for future products.

Although the hard launch of the RTX 40 series parts won’t come any sooner than Q3 2022, the first GPU featuring the Ada Lovelace microarchitecture may be announced at GTC 2022. We’re talking about the successor to the A100. It’s worth noting that the A100 was fabbed on TSMC’s 7nm node, whereas the RTX 30 series leveraged Samsung’s 8nm process.

| Data Center GPU | NVIDIA Tesla P100 | NVIDIA Tesla V100 | NVIDIA A100 |
|---|---|---|---|
| GPU Codename | GP100 | GV100 | GA100 |
| GPU Architecture | NVIDIA Pascal | NVIDIA Volta | NVIDIA Ampere |
| GPU Board Form Factor | SXM | SXM2 | SXM4 |
| FP32 Cores / SM | 64 | 64 | 64 |
| FP32 Cores / GPU | 3584 | 5120 | 6912 |
| FP64 Cores / SM | 32 | 32 | 32 |
| FP64 Cores / GPU | 1792 | 2560 | 3456 |
| INT32 Cores / SM | NA | 64 | 64 |
| INT32 Cores / GPU | NA | 5120 | 6912 |
| Tensor Cores / SM | NA | 8 | 4 |
| Tensor Cores / GPU | NA | 640 | 432 |
| GPU Boost Clock | 1480 MHz | 1530 MHz | 1410 MHz |

Looking at the GV100 and the GA100, it’d be fair to assume that the AD100 will feature a different SM floorplan as well, and possibly increase the SM count to over 130. As with its predecessors, the ratio of FP32 to FP64 cores should remain 2:1 (64 FP32 and 32 FP64 cores per SM), with Tensor cores and sparse matrix multiplication getting special attention. Furthermore, unlike the RTX 40 series, it’ll be paired with HBM2e memory (over 100GB of it).
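The per-SM and per-GPU core counts in the table above imply how many SMs are enabled on each part, which gives some context for the rumored 130+ figure. Here is a minimal sketch in Python that uses only the table's numbers; the dictionary and function names are illustrative, not anything from NVIDIA.

```python
# FP32 cores per GPU and per SM, copied from the table above.
FP32_CORES = {
    "P100 (GP100)": (3584, 64),
    "V100 (GV100)": (5120, 64),
    "A100 (GA100)": (6912, 64),
}


def enabled_sms(cores_per_gpu: int, cores_per_sm: int) -> int:
    """Return the SM count implied by the per-GPU and per-SM core figures."""
    sms, remainder = divmod(cores_per_gpu, cores_per_sm)
    if remainder:
        raise ValueError("core count is not a multiple of cores per SM")
    return sms


SM_COUNTS = {name: enabled_sms(*cores) for name, cores in FP32_CORES.items()}

for name, sms in SM_COUNTS.items():
    print(f"{name}: {sms} SMs")
```

By this arithmetic the A100 ships with 108 SMs enabled, so a part with more than 130 would be an increase of over 20%.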

The RTX 4080, on the other hand, should feature 16GB of GDDR6X memory running at around 21Gbps, while the RTX 4090 should pack somewhere between 20GB and 30GB of GDDR6X memory. In terms of specifications, we’re looking at an FP32 core count of up to 18,432. The AD102 flagship is rumored to feature 144 SMs distributed across 12 GPCs. That results in a raw compute gain of roughly 2.4x (90 TFLOPs) over the GA102, provided the core runs at around 2.4GHz.
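The TFLOPs estimate can be sanity-checked with the usual peak-throughput formula: FP32 cores × 2 operations per clock (one fused multiply-add) × clock speed. Below is a rough sketch, assuming 128 FP32 cores per SM (implied by 18,432 cores across 144 SMs) and treating the clock as a free parameter; the function and variable names are illustrative.

```python
def peak_fp32_tflops(fp32_cores: int, clock_ghz: float) -> float:
    """Peak FP32 throughput, counting one FMA (2 FLOPs) per core per clock."""
    return fp32_cores * 2 * clock_ghz / 1000.0


# Rumored AD102 configuration: 144 SMs with 128 FP32 cores each (assumption).
AD102_CORES = 144 * 128

# Throughput at a 2.0 GHz clock, and the clock needed to reach 90 TFLOPs.
at_2ghz = peak_fp32_tflops(AD102_CORES, 2.0)
clock_for_90 = 90 * 1000.0 / (AD102_CORES * 2)

print(f"{AD102_CORES} FP32 cores")
print(f"{at_2ghz:.1f} TFLOPs at 2.0 GHz")
print(f"{clock_for_90:.2f} GHz needed for 90 TFLOPs")
print(f"{90 / 37.6:.2f}x the 37.6 TFLOPs listed for Ampere")
```

These are theoretical peaks only; real-world gains depend on memory bandwidth, clocks under load, and the workload itself.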

| Arch | Turing | Ampere | Ada Lovelace | Hopper |
|---|---|---|---|---|
| Process | TSMC 12nm | Samsung 8nm LPP | TSMC 5nm | 3nm? |
| TFLOPs | 16.1 | 37.6 | 90? | 150+ |
| Bus Width | 384-bit | 384-bit | 384-bit | 512-bit |
| Launch | Sep 2018 | Sep 2020 | Aug-Sep 2022 | 2024 |

