
AMD Set to Announce Zen 3D Based Milan-X CPUs + Instinct MI200 GPUs: Watch the Live Event Here

AMD is all set to announce its next generation of data center offerings roughly 24 hours from now. We’re talking about the Zen 3D-based Milan-X processors, featuring 3D-stacked V-Cache, and the chiplet-based Instinct MI200 GPU accelerators. Milan-X will retain the Zen 3 core and the N7 process from TSMC, and as such, can be thought of as a special refresh or niche stack, much like the upcoming Sapphire Rapids-SP with on-package HBM memory.

| CPU Name | Cores/Threads | Base Clock | Boost Clock | L3 Cache (V-Cache + L3 Cache) | L2 Cache | TDP |
|---|---|---|---|---|---|---|
| AMD EPYC 7773X | 64/128 | 2.2 GHz | 3.5 GHz | 512 + 256 MB | 32 MB | 280W |
| AMD EPYC 7573X | 32/64 | 2.8 GHz | 3.6 GHz | 512 + 256 MB | 16 MB | 280W |
| AMD EPYC 7473X | 24/48 | 2.8 GHz | 3.7 GHz | 512 + 256 MB | 12 MB | 240W |
| AMD EPYC 7373X | 16/32 | 3.05 GHz | 3.8 GHz | 512 + 256 MB | 8 MB | 240W |

Looking at the specs, everything other than the crapton of L3 cache is basically identical to the vanilla Milan parts, including the base and boost clocks, the TDP, and the L2 cache. This means that performance gains (as already indicated earlier) will vary from application to application and won’t be pronounced in every workload.

You can watch the AMD Accelerated Data Center Keynote here

The exact specifications of the MI250X have been shared. It’ll consist of a total of 110 CUs with a boost clock of 1.7GHz. On the memory side, we’re likely looking at eight memory stacks, each featuring eight 2GB dies, for 128GB in total. This indicates a total bus width of 8,192 bits (1,024 bits x 8 controllers), resulting in an overall bandwidth of 3.68 TB/s, roughly the same as the HBM variants of Sapphire Rapids-SP.
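The memory arithmetic can be checked with a few lines of Python. This is only a sketch built on the pre-launch figures quoted in this article (eight stacks, eight 2GB dies per stack, 1,024 bits per stack, 3.68 TB/s). The per-pin data rate it prints is back-solved from those figures and is not a published specification.

```python
# Figures as quoted in the article; treat them as unconfirmed leak data.
STACKS = 8
DIES_PER_STACK = 8
GB_PER_DIE = 2
BITS_PER_STACK = 1024
QUOTED_BANDWIDTH_TB_S = 3.68

# Total capacity: stacks x dies per stack x capacity per die.
capacity_gb = STACKS * DIES_PER_STACK * GB_PER_DIE

# Total bus width: one 1,024-bit interface per stack.
bus_width_bits = STACKS * BITS_PER_STACK

# bandwidth (bytes/s) = bus width (bits) x per-pin rate (bits/s) / 8,
# so the per-pin rate implied by the quoted bandwidth is:
implied_pin_rate_gbps = QUOTED_BANDWIDTH_TB_S * 1e12 * 8 / bus_width_bits / 1e9

print(f"Capacity:  {capacity_gb} GB")
print(f"Bus width: {bus_width_bits} bits")
print(f"Implied per-pin rate: {implied_pin_rate_gbps:.2f} Gbps")
```

The bus width works out to 8,192 bits, and the quoted bandwidth implies a per-pin rate of roughly 3.6 Gbps.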

At the heart of the GPU, there will be two 55 CU chiplets, for a total of 110 CUs, with an impressive boost clock of 1.7GHz. Since Aldebaran can execute double-precision (FP64) instructions at native speed, this will result in a double-precision throughput of 47.9 TFLOPs, an insane four times that of its predecessor, the MI100.
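Peak throughput is just compute units x FLOPs per CU per clock x clock speed, so the quoted numbers can be sanity-checked. The sketch below uses only the figures in this article (110 CUs, 1.7GHz, 47.9 TFLOPs). The per-CU rate of 256 FP64 FLOPs per clock is an assumption obtained by back-solving and rounding, not a figure from AMD.

```python
# Figures as quoted in the article (pre-launch, unconfirmed).
CUS = 110
BOOST_HZ = 1.7e9
QUOTED_FP64_TFLOPS = 47.9

# Back-solve the FP64 FLOPs each CU must deliver per clock to hit the quoted peak.
implied_flops_per_cu_per_clock = QUOTED_FP64_TFLOPS * 1e12 / (CUS * BOOST_HZ)

# Assumption: round the implied rate to the nearest power of two.
ASSUMED_FLOPS_PER_CU_PER_CLOCK = 256

# Recompute the peak from the assumed per-CU rate.
peak_fp64_tflops = CUS * ASSUMED_FLOPS_PER_CU_PER_CLOCK * BOOST_HZ / 1e12

print(f"Implied FLOPs/CU/clock: {implied_flops_per_cu_per_clock:.1f}")
print(f"Recomputed peak FP64:   {peak_fp64_tflops:.1f} TFLOPs")
```

With the rounded per-CU rate, the recomputed peak lands within rounding distance of the quoted 47.9 TFLOPs, so the leaked CU count, clock, and throughput are at least mutually consistent.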

Even NVIDIA’s Ampere-based A100 Tensor Core accelerator is capable of “only” 19.5 TFLOPs of FP64 compute. In terms of mixed-precision compute, we’re looking at 383 TFLOPs of FP16 and BFLOAT16. In comparison, the MI100 topped out at “just” 184 and 92 TFLOPs in the two data types, respectively.
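To put those comparisons in relative terms, here is a quick sketch that divides the quoted peak figures. It uses only the numbers from this article. The `speedup` helper is a name made up for illustration, and peak TFLOPs ratios say nothing about real-world application performance.

```python
def speedup(new_tflops: float, old_tflops: float) -> float:
    """Ratio of two peak-throughput figures (hypothetical helper)."""
    return new_tflops / old_tflops

# Peak figures as quoted in the article, in TFLOPs.
MI250X_FP64 = 47.9
A100_FP64 = 19.5
MI250X_FP16_BF16 = 383
MI100_FP16 = 184
MI100_BF16 = 92

comparisons = {
    "FP64 vs A100": speedup(MI250X_FP64, A100_FP64),
    "FP16 vs MI100": speedup(MI250X_FP16_BF16, MI100_FP16),
    "BF16 vs MI100": speedup(MI250X_FP16_BF16, MI100_BF16),
}

for label, ratio in comparisons.items():
    print(f"{label}: {ratio:.2f}x")
```

On paper, that is roughly 2.5x the A100 in FP64, a little over 2x the MI100 in FP16, and a little over 4x the MI100 in BFLOAT16.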

The MI250X will have a TDP of 500W, which is a bit on the high side but is likely a result of the HBM memory. The MI250 should come with a lower boost clock and possibly less memory as well. A scalpel to the GPU core is unlikely, but I wouldn’t rule it out.

The AMD Radeon Instinct MI200 GPU will, over the next year, begin to power three massive systems on three continents: the United States’ exascale Frontier system; the European Union’s pre-exascale LUMI system; and Australia’s petascale Setonix system.

Areej

