
NVIDIA Launches PCIe Variant of the A100 “Ampere” GPU Accelerator

NVIDIA has announced the PCIe variant of the A100 GPU accelerator based on the new Ampere microarchitecture. While the core specs and configuration are identical to the original SXM4-based A100 “Tensor Core GPU”, the bus interface and power draw have changed. The PCIe version of the A100 supports up to PCIe 4.0 speeds and comes with a significantly reduced TDP of 250W, down from 400W on the SXM4 model.

| | A100 (PCIe) | A100 (SXM4) | V100 (PCIe) | P100 (PCIe) |
|---|---|---|---|---|
| FP32 CUDA Cores | 6912 | 6912 | 5120 | 3584 |
| Boost Clock | 1.41GHz | 1.41GHz | 1.38GHz | 1.3GHz |
| Memory Clock | 2.4Gbps HBM2 | 2.4Gbps HBM2 | 1.75Gbps HBM2 | 1.4Gbps HBM2 |
| Memory Bus Width | 5120-bit | 5120-bit | 4096-bit | 4096-bit |
| Memory Bandwidth | 1.6TB/sec | 1.6TB/sec | 900GB/sec | 720GB/sec |
| VRAM | 40GB | 40GB | 16GB/32GB | 16GB |
| Single Precision | 19.5 TFLOPs | 19.5 TFLOPs | 14.1 TFLOPs | 9.3 TFLOPs |
| Double Precision | 9.7 TFLOPs (1/2 FP32 rate) | 9.7 TFLOPs (1/2 FP32 rate) | 7 TFLOPs (1/2 FP32 rate) | 4.7 TFLOPs (1/2 FP32 rate) |
| INT8 Tensor | 624 TOPs | 624 TOPs | N/A | N/A |
| FP16 Tensor | 312 TFLOPs | 312 TFLOPs | 112 TFLOPs | N/A |
| TF32 Tensor | 156 TFLOPs | 156 TFLOPs | N/A | N/A |
| Relative Performance (SXM Version) | 90% | 100% | N/A | N/A |
| Interconnect | NVLink 3, 6 Links? (300GB/sec?) | NVLink 3, 12 Links (600GB/sec) | NVLink 2, 4 Links (200GB/sec) | NVLink 1, 4 Links (160GB/sec) |
| GPU | GA100 (826mm²) | GA100 (826mm²) | GV100 (815mm²) | GP100 (610mm²) |
| Transistor Count | 54.2B | 54.2B | 21.1B | 15.3B |
| TDP | 250W | 400W | 250W | 300W |
| Manufacturing Process | TSMC 7N | TSMC 7N | TSMC 12nm FFN | TSMC 16nm FinFET |
| Interface | PCIe 4.0 | SXM4 | PCIe 3.0 | SXM |
| Architecture | Ampere | Ampere | Volta | Pascal |
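The memory bandwidth figures in the table follow roughly from the bus width and the per-pin memory data rate listed alongside them. The short Python sketch below reproduces that arithmetic; the function name and the dictionary layout are illustrative assumptions, and the inputs are the rounded values from the table, so the results land slightly under the quoted marketing figures.

```python
def hbm2_bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) * per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, memory data rate in Gbps), taken from the table above.
cards = {
    "A100": (5120, 2.4),
    "V100": (4096, 1.75),
    "P100": (4096, 1.4),
}

bandwidth = {name: hbm2_bandwidth_gb_per_s(*spec) for name, spec in cards.items()}

for name, gb_per_s in bandwidth.items():
    print(f"{name}: {gb_per_s:.1f} GB/s")
```

With these inputs the A100 works out to about 1536GB/s, the V100 to about 896GB/s and the P100 to about 717GB/s, which match the table's 1.6TB/sec, 900GB/sec and 720GB/sec once rounded.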

According to NVIDIA’s official documentation, the PCIe 4.0 variant delivers 90% of the performance of the SXM4 version while cutting the TDP by 150W (250W versus 400W). Interconnect bandwidth is the main differentiating factor between the two models: the PCIe variant tops out at 300GB/s over 6 NVLink 3 links, while the full-fledged SXM4 variant supports up to 600GB/s over up to 12 NVLink 3 links.
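To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It treats TDP as a stand-in for actual power draw, which is an assumption (real consumption varies by workload), and the helper names are purely illustrative. The 6-link, 300GB/s figure for the PCIe card is the one quoted above and is not yet firmly confirmed.

```python
def perf_per_watt(relative_perf: float, tdp_watts: float) -> float:
    """Relative performance divided by TDP (TDP used as a proxy for power draw)."""
    return relative_perf / tdp_watts


def per_link_bandwidth(total_gb_per_s: float, links: int) -> float:
    """Aggregate NVLink bandwidth split evenly across the links."""
    return total_gb_per_s / links


# Relative performance and TDP as quoted in the article.
pcie_efficiency = perf_per_watt(0.90, 250)
sxm4_efficiency = perf_per_watt(1.00, 400)
efficiency_gain = pcie_efficiency / sxm4_efficiency

# NVLink 3 totals and link counts as quoted in the article.
pcie_link = per_link_bandwidth(300, 6)
sxm4_link = per_link_bandwidth(600, 12)

print(f"PCIe vs SXM4 performance per watt: {efficiency_gain:.2f}x")
print(f"Per-link bandwidth: PCIe {pcie_link:.0f} GB/s, SXM4 {sxm4_link:.0f} GB/s")
```

Under these assumptions the PCIe card comes out at roughly 1.44x the performance per watt of the SXM4 part, and both cards work out to the same 50GB/s per NVLink 3 link, so the bandwidth gap comes purely from the number of links.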

Other than this, both GPUs are practically identical, with the same GPU core, memory bus width and HBM2 memory. The new PCIe 4.0 variant should significantly widen the target market of the Ampere-based A100 GPU accelerator. You can read more about the finer details of the A100 and the Ampere architecture powering it here:

NVIDIA Ampere Architectural Analysis: A Look at the A100 Tensor Core GPU

Areej

Computer Engineering dropout (3 years), writer, journalist, and amateur poet. I started Techquila while in college to address my hardware passion. Although largely successful, it suffered from many internal weaknesses. I left and am now working on Hardware Times, a site purely dedicated to processor architectures and in-depth benchmarks. That's what we do here at Hardware Times!
