NVIDIA Launches PCIe Variant of the A100 “Ampere” GPU Accelerator

NVIDIA has announced a PCIe variant of the A100 GPU accelerator, based on the new Ampere microarchitecture. The core specs and configuration are identical to the original SXM4-based A100 “Tensor Core GPU”, but the bus interface and power envelope have changed: the PCIe card supports PCIe 4.0 and has a TDP of 250W, down from 400W on the SXM4 module.

| Specification | A100 (PCIe) | A100 (SXM4) | V100 (PCIe) | P100 (PCIe) |
| --- | --- | --- | --- | --- |
| FP32 CUDA Cores | 6912 | 6912 | 5120 | 3584 |
| Boost Clock | 1.41GHz | 1.41GHz | 1.38GHz | 1.3GHz |
| Memory Clock | 2.4Gbps HBM2 | 2.4Gbps HBM2 | 1.75Gbps HBM2 | 1.4Gbps HBM2 |
| Memory Bus Width | 5120-bit | 5120-bit | 4096-bit | 4096-bit |
| Memory Bandwidth | 1.6TB/sec | 1.6TB/sec | 900GB/sec | 720GB/sec |
| VRAM | 40GB | 40GB | 16GB/32GB | 16GB |
| Single Precision | 19.5 TFLOPs | 19.5 TFLOPs | 14.1 TFLOPs | 9.3 TFLOPs |
| Double Precision | 9.7 TFLOPs (1/2 FP32 rate) | 9.7 TFLOPs (1/2 FP32 rate) | 7 TFLOPs (1/2 FP32 rate) | 4.7 TFLOPs (1/2 FP32 rate) |
| INT8 Tensor | 624 TOPs | 624 TOPs | N/A | N/A |
| FP16 Tensor | 312 TFLOPs | 312 TFLOPs | 112 TFLOPs | N/A |
| TF32 Tensor | 156 TFLOPs | 156 TFLOPs | N/A | N/A |
| Relative Performance (SXM Version) | 90% | 100% | N/A | N/A |
| Interconnect | NVLink 3, 6 Links? (300GB/sec?) | NVLink 3, 12 Links (600GB/sec) | NVLink 2, 4 Links (200GB/sec) | NVLink 1, 4 Links (160GB/sec) |
| GPU | GA100 (826mm²) | GA100 (826mm²) | GV100 (815mm²) | GP100 (610mm²) |
| Transistor Count | 54.2B | 54.2B | 21.1B | 15.3B |
| TDP | 250W | 400W | 250W | 300W |
| Manufacturing Process | TSMC 7N | TSMC 7N | TSMC 12nm FFN | TSMC 16nm FinFET |
| Interface | PCIe 4.0 | SXM4 | PCIe 3.0 | SXM |
| Architecture | Ampere | Ampere | Volta | Pascal |
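
As a rough sanity check, the headline FP32 and memory bandwidth figures in the table above can be approximately reproduced from the core counts, clocks and bus widths listed there. The Python sketch below assumes each FP32 CUDA core retires one fused multiply-add (2 FLOPs) per clock at the boost frequency, and that bandwidth is simply bus width times per-pin data rate. The function names are illustrative and not part of any NVIDIA tool.

```python
# Back-of-the-envelope check of the table's headline numbers.
# Assumption: one FMA (2 FLOPs) per FP32 CUDA core per clock at boost frequency.

def peak_fp32_tflops(cuda_cores, boost_ghz):
    """Theoretical peak FP32 throughput in TFLOPs."""
    return cuda_cores * 2 * boost_ghz / 1000.0

def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Theoretical memory bandwidth in GB/s (bus width x per-pin rate / 8 bits per byte)."""
    return bus_width_bits * data_rate_gbps / 8.0

# (FP32 cores, boost GHz, bus width in bits, memory data rate in Gbps), copied from the table
SPECS = {
    "A100": (6912, 1.41, 5120, 2.4),
    "V100": (5120, 1.38, 4096, 1.75),
    "P100": (3584, 1.30, 4096, 1.4),
}

for name, (cores, ghz, bus_bits, gbps) in SPECS.items():
    tflops = peak_fp32_tflops(cores, ghz)
    bandwidth = memory_bandwidth_gbs(bus_bits, gbps)
    print(f"{name}: {tflops:.1f} TFLOPs FP32, {bandwidth:.0f} GB/s")
```

Under these assumptions the A100 works out to roughly 19.5 TFLOPs and 1536 GB/s, which is consistent with the 19.5 TFLOPs and (rounded) 1.6TB/sec listed in the table. These are theoretical peaks, not measured throughput.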

According to NVIDIA’s official documentation, the PCIe 4.0 variant delivers 90% of the SXM4 version’s performance while cutting the TDP by 150W (250W versus 400W). The interconnect is the main differentiator between the two models: the PCIe variant tops out at 300GB/s over 6 NVLink 3 links, while the full-fledged SXM4 variant supports up to 600GB/s over 12 NVLink 3 links.
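
To put the 90% figure in context, here is a small Python sketch of the implied performance per watt. It takes NVIDIA's relative-performance claim and the two TDP ratings at face value. TDP is a power ceiling rather than measured draw, so treat the result as an estimate only. The helper name is hypothetical.

```python
# Implied performance per watt, using the quoted relative performance and TDP.
# Assumption: TDP stands in for actual power draw, which it only approximates.

def perf_per_watt(relative_perf, tdp_watts):
    """Relative performance delivered per watt of rated TDP."""
    return relative_perf / tdp_watts

pcie = perf_per_watt(0.90, 250)   # A100 PCIe: 90% performance at 250W
sxm4 = perf_per_watt(1.00, 400)   # A100 SXM4: 100% performance at 400W

ratio = pcie / sxm4
print(f"PCIe delivers {ratio:.2f}x the performance per watt of SXM4")
```

On these numbers the PCIe card comes out at about 1.44x the performance per watt of the SXM4 module, at the cost of half the NVLink bandwidth.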

Otherwise, the two GPUs are practically identical, with the same GPU core, memory bus width and HBM2 memory. The new PCIe 4.0 variant should significantly widen the target market for the Ampere-based A100 GPU accelerator. You can read more about the finer details of the A100 and the Ampere architecture powering it here:

NVIDIA Ampere Architectural Analysis: A Look at the A100 Tensor Core GPU

Areej Syed

Processors, PC gaming, and the past. I have been writing about computer hardware for over seven years with more than 5000 published articles. Started off during engineering college and haven't stopped since. Mass Effect, Dragon Age, Divinity, Torment, Baldur's Gate and so much more... Contact: areejs12@hardwaretimes.com.