
NVIDIA Launches Data Center GPUs based on GA102 Ampere Gaming Core (Same as RTX 3080/3090)

NVIDIA today launched two new Ampere-based data center GPUs (accelerators): the A10 and the A30. Of the two, the former is more interesting, as it is based on the 8nm GA102 core that powers the GeForce RTX 3080 and 3090 (and soon the 3080 Ti). This is a bit surprising, as GA102 chips have been in very limited supply. The fact that NVIDIA is now diverting part of its GA102 supply to the data center market means that the number of chips reserved for gamers will be even lower.

Like the gaming graphics cards, the A10 uses GDDR memory rather than HBM: 24GB of GDDR6, the same capacity as the 3090. The A30, by contrast, is based on an unspecified variant of the GA100 (likely a cut-down SKU) with the same amount of HBM2 memory. The NVIDIA A10 Tensor Core GPU leverages the GA102-890 core, which features 72 SMs or 9,216 FP32 cores. That’s fewer than the 82 SMs (10,496 cores) on the 3090 but more than the 68 SMs (8,704 cores) on the RTX 3080.
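The SM and core counts quoted above are related by a fixed ratio. The following is a minimal sketch of that arithmetic, assuming 128 FP32 cores per GA102 SM (the ratio implied by the article's own figures, 9,216 / 72); the constant and function names are illustrative only.

```python
# Assumption: each GA102 SM exposes 128 FP32 cores, which is the ratio
# implied by the figures quoted in the article (9,216 cores / 72 SMs).
FP32_CORES_PER_SM = 128


def fp32_cores(sm_count: int) -> int:
    """Return the FP32 core count for a given number of enabled SMs."""
    return sm_count * FP32_CORES_PER_SM


# Enabled SM counts as quoted in the article.
ENABLED_SMS = {"RTX 3080": 68, "A10": 72, "RTX 3090": 82}

for name, sms in ENABLED_SMS.items():
    print(f"{name}: {sms} SMs -> {fp32_cores(sms):,} FP32 cores")
```

Under that assumption, the A10 sits four SMs above the RTX 3080 and ten below the RTX 3090.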

This suggests that NVIDIA is salvaging GA102 dies that aren’t good enough for the RTX 3090 but would need additional working SMs disabled to be sold as a 3080. The A10 has a base clock of 885MHz and a boost clock of 1,695MHz. The lower boost clock is another contributing factor to the use of these parts for the Tensor line. The 24GB memory buffer uses standard GDDR6 rather than GDDR6X (likely as a result of shortages) on a 384-bit bus, resulting in a bandwidth of 600GB/s.
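As a rough cross-check of these figures, the sketch below derives the A10's peak FP32 throughput from its core count and boost clock, and the per-pin memory data rate implied by 600GB/s over a 384-bit bus. It assumes each FP32 core retires one fused multiply-add (2 FLOPs) per clock; the function names are hypothetical, and the per-pin rate is an inference from the quoted numbers, not a published specification.

```python
def peak_fp32_tflops(fp32_cores: int, boost_mhz: float) -> float:
    """Peak FP32 throughput in teraFLOPS.

    Assumption: one fused multiply-add (2 FLOPs) per core per clock.
    """
    flops_per_second = 2 * fp32_cores * boost_mhz * 1e6
    return flops_per_second / 1e12


def per_pin_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbit/s) implied by a total bandwidth in GB/s."""
    return bandwidth_gb_s * 8 / bus_width_bits


# A10 figures quoted in the article: 9,216 cores, 1,695MHz boost,
# 600GB/s over a 384-bit bus.
a10_tflops = peak_fp32_tflops(9216, 1695)
a10_pin_rate = per_pin_gbps(600, 384)

print(f"A10 peak FP32: {a10_tflops:.1f} teraFLOPS")
print(f"A10 implied memory data rate: {a10_pin_rate} Gbps per pin")
```

The FP32 result rounds to 31.2 teraFLOPS, and the implied memory data rate comes out to 12.5Gbps per pin.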

| Specification | A10 | A30 |
| --- | --- | --- |
| GPU | GA102-890 | GA100 |
| FP64 | - | 5.2 teraFLOPS |
| FP64 Tensor Core | - | 10.3 teraFLOPS |
| FP32 | 31.2 teraFLOPS | 10.3 teraFLOPS |
| TF32 Tensor Core | 62.5 teraFLOPS / 125 teraFLOPS* | 82 teraFLOPS / 165 teraFLOPS* |
| BFLOAT16 Tensor Core | 125 teraFLOPS / 250 teraFLOPS* | 165 teraFLOPS / 330 teraFLOPS* |
| FP16 Tensor Core | 125 teraFLOPS / 250 teraFLOPS* | 165 teraFLOPS / 330 teraFLOPS* |
| INT8 Tensor Core | 250 TOPS / 500 TOPS* | 330 TOPS / 661 TOPS* |
| INT4 Tensor Core | 500 TOPS / 1,000 TOPS* | 661 TOPS / 1,321 TOPS* |
| RT Core | 72 RT Cores | - |
| Encode/decode | 1 encoder, 2 decoders (+AV1 decode) | 1 optical flow accelerator (OFA), 1 JPEG decoder (NVJPEG), 4 video decoders (NVDEC) |
| GPU memory | 24GB GDDR6 | 24GB HBM2 |
| GPU memory bandwidth | 600GB/s | 933GB/s |
| Interconnect | PCIe Gen4: 64GB/s | PCIe Gen4: 64GB/s; third-gen NVLink: 200GB/s** |
| Form factors | Single-slot, full-height, full-length (FHFL) | Dual-slot, full-height, full-length (FHFL) |
| Max thermal design power (TDP) | 150W | 165W |
| Multi-Instance GPU (MIG) | - | 4 GPU instances @ 6GB each, 2 GPU instances @ 12GB each, or 1 GPU instance @ 24GB |
| vGPU software support | NVIDIA Virtual PC, NVIDIA Virtual Applications, NVIDIA RTX Virtual Workstation, NVIDIA Virtual Compute Server | NVIDIA AI Enterprise for VMware, NVIDIA Virtual Compute Server |

The A30 is based on the same GA100 GPU as the A100. NVIDIA hasn’t revealed the core count but we’re definitely looking at a cut-down SKU paired with 24GB of HBM2 memory running at 1,215MHz across a 3,072-bit wide bus (3 HBM stacks). The GPU core has a base clock of 930MHz and a boost of 1,440MHz, with the memory bandwidth pegged at 933GB/s.
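The 933GB/s figure follows from the memory clock and bus width quoted above. Here is a minimal sketch of the calculation, assuming HBM2 transfers data on both clock edges (double data rate); the function name and the `transfers_per_clock` parameter are illustrative.

```python
def memory_bandwidth_gb_s(
    memory_clock_mhz: float,
    bus_width_bits: int,
    transfers_per_clock: int = 2,  # assumption: double data rate
) -> float:
    """Peak memory bandwidth in GB/s from clock speed and bus width."""
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# A30 figures quoted in the article: 1,215MHz across a 3,072-bit bus
# made up of 3 HBM stacks.
a30_bandwidth = memory_bandwidth_gb_s(1215, 3072)
bits_per_stack = 3072 // 3

print(f"A30 memory bandwidth: {a30_bandwidth:.0f}GB/s")
print(f"Bus width per HBM stack: {bits_per_stack} bits")
```

With these inputs the result is about 933GB/s, with each of the three stacks contributing a 1,024-bit interface.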

Areej

Computer Engineering dropout (3 years), writer, journalist, and amateur poet. I started my first technology blog, Techquila, while in college to pursue my hardware passion. Although largely successful, it was a classic example of too many people trying out multiple different things but getting nothing done. I left in late 2019 and have been working on Hardware Times ever since.