According to multiple sources, AMD is working on 3D-stacked server processors, codenamed Milan-X, with high bandwidth and impressive I/O specifications. While it’s unclear whether the processors will use stacked compute dies or stacked memory (HBM2E?), the target is likely a custom Epyc processor tailored to specific kinds of workloads, such as HPC or other high-bandwidth applications. AMD is likely working with TSMC, leveraging its SoIC 3D packaging technology (or a derivative of it) to design this SKU.
AMD CEO Dr. Lisa Su confirmed this at the Annual JPMorgan Tech Media Conference. Speaking at the event, she said that AMD was early to pair 2.5D packaging and HBM with its GPUs and to use a chiplet architecture on its Epyc CPUs, and that 3D chip stacking is on the roadmap:
“When you think about sort of all that’s said about Moore’s Law slowing it means that you’re getting performance gains by going to smaller geometries, but not necessarily the same gains that you got a few years ago. So we were very early in sort of the idea of using 2.5D packaging with high-bandwidth memory together with our GPUs as well as using a chiplet architecture to really get the incredible performance that we’re seeing with each generation of EPYC. And you’ll see us continue to innovate on that road map. So 3D chip stacking is definitely on the road map. We see it as another tool in the tool chest as you think about how do you put these different pieces together. And I think what you’ll also see is you might see different technologies used along the price curve. So you can imagine the highest-performance technologies can afford different elements. And then as you get into more cost-sensitive, you might not be able to use all that complexity. But think about it as AMD will push the envelope on 2.5D and 3D packaging as we go forward because it’s a key element to unlock that next level of performance. And again, we’ll talk a little bit more about that as we go through the next number of months as we roll out the next phase of our road maps.”
This new 3D-stacked processor is expected to launch by the end of this year for the server market. Although this is not confirmed, we should see it in action alongside the upcoming Radeon MI200 GPUs in the Frontier supercomputer and in some other custom solutions. It should include HBM2 memory, with one CPU connected to four GPUs in each node.
The Frontier supercomputer will be based on a unique design in which each Epyc Trento CPU is paired with four MI200 accelerators over the Infinity Fabric (IF) 3.0 interconnect, with each GPU directly connected to the CPU and to every other GPU. This mesh design is what really makes the Trento-MI200 combo unique: each chip can access the data stored in the others' memory coherently, which eliminates the need for the program to manage memory copies explicitly.
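To make the node topology described above concrete, here is a minimal Python sketch that models one node as a fully connected graph of one CPU and four GPUs. The function and chip names (`build_node_topology`, `is_directly_connected`, `cpu`, `gpu0`, and so on) are hypothetical and chosen only for illustration. The link count is a property of the graph as described in the text, not a confirmed figure for the number of physical Infinity Fabric links in the hardware.

```python
from itertools import combinations


def build_node_topology(num_gpus=4):
    """Model one node: a CPU plus num_gpus GPUs, every pair directly linked.

    This is an illustrative graph model only; it assumes one logical link
    per pair of chips, which the source does not confirm.
    """
    chips = ["cpu"] + [f"gpu{i}" for i in range(num_gpus)]
    # A frozenset makes each link undirected: {a, b} == {b, a}.
    links = {frozenset(pair) for pair in combinations(chips, 2)}
    return chips, links


def is_directly_connected(links, a, b):
    """Return True if chips a and b share a direct link."""
    return frozenset((a, b)) in links


chips, links = build_node_topology()
cpu_links = [link for link in links if "cpu" in link]
gpu_links = [link for link in links if "cpu" not in link]

print(len(chips), "chips,", len(links), "direct links")
print(len(cpu_links), "CPU-GPU links,", len(gpu_links), "GPU-GPU links")
```

In this model, any chip reaches any other in a single hop, which is the property the mesh design relies on for coherent access to the other chips' memory.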
Although Trento is being designed for Cray, it will also be available to other OEMs/ODMs. The main advantages of this platform are scalability, bandwidth, latency, and of course, relatively simpler programmability, thanks to the Infinity Fabric 3.0 interconnect and a unified memory pool across the CPU and GPUs.