AMD’s upcoming Instinct MI200 chiplet-based accelerator will feature a total of 110 compute units (CUs), rather than the 120 speculated earlier. This is indicated by a recent update posted by AMD to the ROCmSoftwarePlatform / MIOpen GitHub repository. The updated code defines Aldebaran (gfx90a) as a GPU with 110 compute units, while its predecessor (MI100) features 120.
The primary difference is that the 110 CUs on the MI200 (Aldebaran) are spread across two dies (55 each), while the MI100 had all 120 units on a single die. Smaller dies should be cheaper to manufacture, which could allow for higher-end SKUs with even more dies in the future. Aldebaran will also support native full-rate execution of FP64 instructions and packed FP32 instructions, essentially doubling FP64 and FP32 throughput per CU compared to the MI100.
|CDNA||Arcturus / MI100||Aldebaran / MI200|
|Active CUs||120 CU||110 CU? (2x 55 CU?)|
|FP32:FP64 Rate (Arcturus FP32 == 1)||1:0.5||2:1?|
|FP32 Ops / clk ([CU] * [SP per CU] * [FP32 Rate] * 2 [Ops])|
|FP64 Ops / clk ([CU] * [SP per CU] * [FP64 Rate] * 2 [Ops])|
|Memory Bus width||4096-bit||8192-bit?|
|Memory Size||32 GB||128 GB?|
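The per-clock formulas in the table, [CU] * [SP per CU] * [Rate] * 2 [Ops], can be worked through with a short sketch. This is only an illustration under stated assumptions: 64 stream processors per CU is assumed for both parts, and the Aldebaran rates of 2 (FP32) and 1 (FP64) are speculative values reflecting the doubled throughput described above, not confirmed specifications. The `Accelerator` class and `ops_per_clock` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

# Assumption: 64 stream processors (SP) per compute unit on both parts.
SP_PER_CU = 64


@dataclass(frozen=True)
class Accelerator:
    name: str
    compute_units: int
    fp32_rate: float  # relative rate, with Arcturus FP32 == 1
    fp64_rate: float  # relative rate on the same scale


def ops_per_clock(compute_units: int, rate: float, sp_per_cu: int = SP_PER_CU) -> int:
    """[CU] * [SP per CU] * [Rate] * 2 [Ops]; a fused multiply-add counts as two ops."""
    return int(compute_units * sp_per_cu * rate * 2)


# MI100 figures come from the table; the Aldebaran figures are speculative.
mi100 = Accelerator("Arcturus / MI100", 120, 1.0, 0.5)
mi200 = Accelerator("Aldebaran / MI200 (speculative)", 110, 2.0, 1.0)

for gpu in (mi100, mi200):
    fp32 = ops_per_clock(gpu.compute_units, gpu.fp32_rate)
    fp64 = ops_per_clock(gpu.compute_units, gpu.fp64_rate)
    print(f"{gpu.name}: FP32 {fp32} ops/clk, FP64 {fp64} ops/clk")
```

Under these assumptions the MI100 works out to 15,360 FP32 and 7,680 FP64 operations per clock, while the speculative 110 CU Aldebaran configuration would reach 28,160 and 14,080. Multiplying by the clock speed gives peak throughput, so the final figures depend on clocks that are not yet known.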
Aldebaran will be recognized by the OS as a single GPU, eliminating the need for multi-GPU optimizations and improving compatibility across different applications and platforms.
Via: Coelacanth’s Dream