AMD Announces the MI200 Data Center GPU: 6nm Chiplet Design w/ 58 Billion Transistors and 14,000 Cores

Alongside the Milan-X and Zen 4 Epyc processors, AMD today also unveiled the MI200 data center GPU (accelerator). Packing a chiplet-based design with up to 14,080 stream processors across 220 CUs and two dies/chiplets, it is AMD's most advanced accelerator yet. Each chiplet features eight shader engines (16 in total) with 16 CUs each (capable of full-rate FP64 and packed FP32 compute), plus a 2nd Gen Matrix Engine for mixed-precision compute (FP16 and BF16).
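To make "full-rate FP64" and "packed FP32" concrete, here is a rough peak-throughput sketch for the full 14,080-core configuration. The ~1.7GHz boost clock and the 2-FLOPs-per-FMA convention are assumptions not stated in the article (1.7GHz is AMD's published MI250X boost clock):

```python
# Rough peak vector throughput for the full MI200 configuration.
# Assumption: ~1.7 GHz boost clock; one FMA per core per cycle = 2 FLOPs.
cores = 14_080
clock_ghz = 1.7

# Full-rate FP64: FP64 runs at the same rate as scalar FP32 (one FMA/cycle).
fp64_tflops = cores * clock_ghz * 2 / 1000

# Packed FP32: each FP64 lane can issue two FP32 operations per cycle.
fp32_packed_tflops = fp64_tflops * 2

print(round(fp64_tflops, 1), round(fp32_packed_tflops, 1))
```

Under these assumptions the sketch lands on roughly 47.9 TFLOPS FP64 and 95.7 TFLOPS packed FP32.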

In terms of memory and bandwidth, we’re looking at HBM2e. Eight 1,024-bit memory controllers connect to eight 16GB (8-Hi) HBM2e stacks, for an overall capacity of 128GB (vs 80GB on the NVIDIA A100). The memory runs at 3.2Gbps per pin, resulting in an overall bandwidth of 3.2TB/s, once again a good deal more than the 2TB/s possible with the A100 80GB.
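The capacity and bandwidth figures above follow directly from the stack count, bus width, and pin speed; a quick sanity check:

```python
# HBM2e capacity and bandwidth check for the MI200 configuration described above.
stacks = 8
gb_per_stack = 16          # 8-Hi HBM2e stacks
bus_bits_per_stack = 1024  # one 1,024-bit controller per stack
pin_rate_gbps = 3.2        # data rate per pin

capacity_gb = stacks * gb_per_stack                            # 128 GB
bandwidth_gb_s = stacks * bus_bits_per_stack * pin_rate_gbps / 8  # bits -> bytes

print(capacity_gb, round(bandwidth_gb_s, 1))  # 128 GB, 3276.8 GB/s (~3.2 TB/s)
```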

The MI200 lineup consists of three SKUs: the OAM-exclusive MI250 and MI250X, and the PCIe-based MI210. The MI250X is the top configuration, with 14,080 cores and a peak FP16/BF16 matrix throughput of 383 TFLOPS. The MI250 cuts the core count down to 13,312, resulting in a throughput of 362.1 TFLOPS. The exact configuration of the MI210 remains unknown.
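The two quoted throughput figures scale linearly with core count, consistent with a simple peak calculation. The ~1.7GHz boost clock and the 16 mixed-precision FLOPs per core per cycle are assumptions here (not stated in the article), matching AMD's published MI250X specifications:

```python
# Peak FP16/BF16 matrix throughput sketch for the two OAM SKUs.
# Assumptions: ~1.7 GHz boost clock, 16 FP16/BF16 FLOPs per core per cycle.
clock_ghz = 1.7
flops_per_core_per_cycle = 16

def peak_tflops(cores: int) -> float:
    """Peak mixed-precision matrix throughput in TFLOPS."""
    return cores * clock_ghz * flops_per_core_per_cycle / 1000

print(round(peak_tflops(14_080)))     # MI250X: ~383 TFLOPS
print(round(peak_tflops(13_312), 1))  # MI250:  ~362.1 TFLOPS
```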

AMD showed off some promising numbers compared to NVIDIA’s A100 SXM. We’re looking at gains of up to 3x in HPC workloads from AMG, Quicksilver, and HPL. It’ll be interesting to see how NVIDIA’s own MCM design “Hopper” holds up against the MI200.