AMD Instinct MI200 (Aldebaran) to Pack 110 Compute Units, 10 Fewer than MI100?

AMD’s upcoming Instinct MI200 chiplet-based accelerator will feature a total of 110 compute units, rather than the previously speculated 120. This is indicated by a recent update posted by AMD to the ROCmSoftwarePlatform/MIOpen GitHub repository. The code in that update defines Aldebaran (gfx90a) as a GPU with 110 compute units, while its predecessor, the MI100, has 120.

The primary difference is that the 110 CUs on the MI200 (Aldebaran) are spread across two dies (55 each), while the MI100 had all 120 units on a single die. Splitting the GPU this way should bring down costs and leave room for higher-end SKUs with even more dies in the future. Aldebaran will also support native full-rate execution of FP64 instructions and packed FP32 instructions, essentially doubling per-CU throughput per clock compared with the MI100.

CDNA	Arcturus / MI100	Aldebaran / MI200
Active CUs	120 CU	110 CU? (2x 55 CU?)
FP32: FP64 Rate (Arcturus FP32 == 1)	1: 0.5	2: 1?
FP32 Ops / clk ([CU] * [SP per CU] * [FP32 Rate] * 2 [Ops])	15360	28160? (2x 14080)?
FP64 Ops / clk ([CU] * [SP per CU] * [FP64 Rate] * 2 [Ops])	7680	14080? (2x 7040)?
Memory	HBM2 2.4Gbps	HBM2e
Memory Bus width	4096-bit	8192-bit? (2x 4096-bit)
Memory Size	32 GB	128 GB? (2x 64 GB?)
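
The ops-per-clock rows above follow directly from the formula given in the table's row labels. As a rough sanity check, here is a minimal Python sketch of that arithmetic. The function name `ops_per_clock` is ours, and the 64 stream processors per CU is an assumption inferred from the table's own totals (15360 / (120 * 1 * 2) = 64); none of this comes from AMD's code.

```python
# Assumption: 64 stream processors (SP) per CU, inferred from the table's totals.
SP_PER_CU = 64


def ops_per_clock(cus, rate, sp_per_cu=SP_PER_CU):
    """[CU] * [SP per CU] * [rate] * 2 ops, as in the table's row labels."""
    return int(cus * sp_per_cu * rate * 2)


# Rates are relative to Arcturus FP32 == 1, matching the table.
chips = {
    "MI100 (1x 120 CU)": {"cus": 120, "fp32": 1.0, "fp64": 0.5},
    "MI200 (2x 55 CU, speculated)": {"cus": 110, "fp32": 2.0, "fp64": 1.0},
}

for name, spec in chips.items():
    fp32 = ops_per_clock(spec["cus"], spec["fp32"])
    fp64 = ops_per_clock(spec["cus"], spec["fp64"])
    print(f"{name}: FP32 {fp32} ops/clk, FP64 {fp64} ops/clk")
```

Note that despite having 10 fewer CUs, the speculated MI200 figures come out higher per clock because the per-CU FP32 and FP64 rates are doubled. Actual throughput would also depend on clock speeds, which are not known here.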

Aldebaran is expected to be recognized by the OS as a single GPU, eliminating the need for multi-GPU optimizations and improving compatibility across different applications and platforms.

Via: Coelacanth’s Dream