Meet the DPU (Data Processing Unit): A New Kind of Processor for Data Centers

Until now, we’ve had two kinds of processors: GPUs (graphics processing units) and CPUs (central processing units). GPUs handle highly parallelized, bandwidth-hungry workloads and accelerate high-precision compute applications, while CPUs excel at sequential, latency-sensitive tasks. The DPU, or data processing unit, is a third kind of processor that NVIDIA is betting on. By the company’s definition, a DPU is:

A high-performance network interface capable of parsing, processing, and efficiently transferring data at line rate, or the speed of the rest of the network, to GPUs and CPUs

As you might suspect, a DPU is primarily a network interface. It moves data across the network fast enough to keep pace with GPUs and CPUs, reducing the bottlenecks that inter- or intra-network data transfers would otherwise impose on the two.
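To get a feel for what "line rate" demands, here is a rough back-of-the-envelope sketch. The 100 Gb/s link speed and the frame sizes are illustrative assumptions, not figures from NVIDIA; the 20 bytes of per-frame wire overhead (preamble, start delimiter, and inter-frame gap) are standard for Ethernet.

```python
# Per-frame overhead on an Ethernet wire: 7-byte preamble + 1-byte start
# frame delimiter + 12-byte inter-frame gap = 20 bytes.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_gbps: float, frame_bytes: int) -> float:
    """Maximum frames per second a link can carry at the given frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame


def ns_per_packet(link_gbps: float, frame_bytes: int) -> float:
    """Time budget, in nanoseconds, to handle one frame at line rate."""
    return 1e9 / packets_per_second(link_gbps, frame_bytes)


if __name__ == "__main__":
    # Assumed example: a 100 Gb/s port with minimum-size and full-size frames.
    for size in (64, 1518):
        pps = packets_per_second(100, size)
        print(f"{size:>5}-byte frames: {pps / 1e6:7.1f} Mpps, "
              f"{ns_per_packet(100, size):6.2f} ns per packet")
```

With minimum-size 64-byte frames, the assumed 100 Gb/s link delivers roughly 148.8 million packets per second, leaving under 7 nanoseconds per packet. That is the kind of budget that makes dedicated hardware attractive over handling every packet on a general-purpose CPU core.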

A DPU is a programmable SoC (system on a chip) that comprises an ARM-based multicore CPU tightly coupled to the other components of the chip. A DPU can be used as a stand-alone embedded processor, but it’s usually incorporated into a SmartNIC, a network interface controller used in servers. In other words, this is NVIDIA marketing Mellanox’s SmartNICs. The acquisition cost Team Green a fortune, and it’s only fair that Mellanox is treated as a core division of the company.

In NVIDIA’s own terms, there are 10 capabilities that the network data path acceleration engines (or DPUs) must deliver:

  • Data packet parsing, matching, and manipulation to implement an open virtual switch (OVS)
  • RDMA data transport acceleration for Zero Touch RoCE
  • GPU-Direct accelerators to bypass the CPU and feed networked data directly to GPUs (both from storage and from other GPUs)
  • TCP acceleration including RSS, LRO, checksum, etc.
  • Network virtualization for VXLAN and Geneve overlays and VTEP offload
  • Traffic shaping “packet pacing” accelerator to enable multi-media streaming, content distribution networks, and the new 4K/8K Video over IP (RiverMax for ST 2110)
  • Precision timing accelerators for telco Cloud RAN such as 5T for 5G capabilities
  • Crypto acceleration for IPsec and TLS performed inline so all other accelerations are still operational
  • Virtualization support for SR-IOV, VirtIO and para-virtualization
  • Secure Isolation: root of trust, secure boot, secure firmware upgrades, and authenticated containers and application life cycle management
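To make one item on the list concrete, the sketch below shows the kind of work that "checksum" offload takes off the host CPU. It is a plain software version of the standard Internet checksum (RFC 1071) used by IPv4, TCP, and UDP, written purely for illustration; it is not NVIDIA's implementation, and the function name and sample header are assumptions chosen for the example.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum: the one's-complement of the
    one's-complement sum of the data taken as 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


if __name__ == "__main__":
    # A sample 20-byte IPv4 header with its checksum field (bytes 10-11) zeroed.
    header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(internet_checksum(header)))  # 0xb861
```

A host without offload runs a loop like this over every packet it sends and receives. A NIC or DPU with checksum offload computes and verifies the same value in hardware as the packet passes through, so the CPU never touches those bytes for that purpose.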


Computer hardware enthusiast, PC gamer, and almost an engineer. Former co-founder of Techquila (2017-2019), a fairly successful tech outlet. Been working on Hardware Times since 2019, an outlet dedicated to computer hardware and its applications.
