At GTC 2021, NVIDIA announced its first CPU, codenamed Grace after the computer scientist Grace Hopper. It is aimed at future data center systems, where it will sit alongside the successors to the A100 GPU. Like the company’s past processor designs for the mobile market, the Grace data center CPU is Arm-based, built on Arm’s next-gen Neoverse cores. Unlike traditional CPU designs, Grace will (mostly) come as an SoC paired with an NVIDIA GPU accelerator and a memory subsystem built on LPDDR5x. According to the company, the CPU will deliver a score of more than 300 in SPECrate2017_int_base.
The Grace CPU has been built from the ground up for AI-based data center workloads such as neural network training and inference. NVIDIA claims a performance increase of as much as 10x compared with systems that pair traditional x86 CPUs with existing NVIDIA A100 GPUs.
One of the primary advantages of Grace is a closely knit network of CPUs and GPUs in a 1:1 mesh-like configuration. This allows for a memory-to-GPU bandwidth of as much as 2,000 GB/s, vastly superior to what a traditional DDR4-based setup hanging off a single x86 CPU can offer. Furthermore, the use of NVLink 4.0 as the bridge between Grace CPUs and next-gen NVIDIA GPUs allows for a CPU-to-GPU bandwidth of up to 900 GB/s, much higher than anything else on the market.
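To put the quoted figures in perspective, here is a rough Python sketch that estimates how long an ideal transfer would take at each bandwidth. The 500 GB payload, the `LINKS` table and the helper names are made up for illustration, and the 64 GB/s figure for PCIe 4.0 x16 (both directions combined) is our own point of comparison rather than a number from the figures above. Real transfers will be slower, since the sketch ignores latency and protocol overhead.

```python
# Back-of-the-envelope transfer times at the bandwidths quoted above.
# The payload size and the PCIe figure are illustrative assumptions,
# not numbers taken from the Grace announcement.

GB = 1e9  # decimal gigabyte, matching the GB/s units used in the article

# Peak bandwidths in GB/s
LINKS = {
    "Grace memory-to-GPU (quoted)": 2000.0,
    "Grace CPU-to-GPU over NVLink (quoted)": 900.0,
    "PCIe 4.0 x16, both directions (assumed)": 64.0,
}


def transfer_seconds(payload_bytes: float, bandwidth_gb_per_s: float) -> float:
    """Ideal time to move payload_bytes at a peak bandwidth.

    Ignores latency, protocol overhead and contention.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_bytes / (bandwidth_gb_per_s * GB)


def compare(payload_gb: float) -> dict:
    """Return the ideal transfer time in seconds for every link in LINKS."""
    return {
        name: transfer_seconds(payload_gb * GB, bandwidth)
        for name, bandwidth in LINKS.items()
    }


if __name__ == "__main__":
    # Hypothetical 500 GB working set, e.g. a large model plus its training data
    for name, seconds in compare(500.0).items():
        print(f"{name}: {seconds:.2f} s")
```

The point of the exercise is the ratio rather than the absolute times: at these peak rates, the quoted 900 GB/s link moves the same payload roughly 14 times faster than the assumed PCIe 4.0 x16 connection.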
Grace is optimized for LPDDR5x memory, which offers twice the bandwidth of DDR4 and 10x better energy efficiency. It also provides unified cache coherence with a single memory address space, combining system and HBM memory to simplify programmability. This is somewhat similar to AMD’s 3rd Gen Infinity architecture, although there are some key differences between the two.
NVIDIA is expected to launch the Arm-based Grace CPU SoCs sometime in 2023. The company’s Arm acquisition should be complete by then, and its next-gen GPU architecture should also have been publicly unveiled.