As computer hardware handles ever-larger AI and HPC datasets, application loading times place heavier strains on performance. CPUs have historically been responsible for I/O: moving data from storage to GPUs. However, fast GPUs are held back by slow I/O. As computation shifts from CPUs to GPUs, I/O becomes more of a hindrance to overall performance.
How Does GPUDirect Storage Improve Performance?
GPUDirect Storage is a technology that creates a direct data path between local or remote storage and GPU memory. It avoids the extra copy through a bounce buffer in CPU system memory, allowing storage to transfer data directly to or from the GPU without taxing the CPU.
The storage location doesn't matter to GPUDirect Storage: it works whether the data sits in a rack, an enclosure, or a storage area network (SAN). Direct Memory Access (DMA) uses a copy engine to transfer large blocks of data over PCIe, freeing other hardware for useful work. DMA engines reside in GPUs, NVMe drives, storage controllers, and other storage-related components.
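The difference between the two paths can be sketched as a toy model. This is illustrative only: real GPUDirect Storage transfers go through NVIDIA's cuFile API and hardware DMA engines, not Python copies, and the function names here are hypothetical.

```python
# Toy model of the two data paths. In the traditional path the CPU stages
# data through a bounce buffer in system memory; with GPUDirect Storage a
# DMA engine moves the data straight into GPU memory.

def bounce_buffer_path(storage: bytes) -> tuple:
    """Traditional path: storage -> CPU bounce buffer -> GPU memory."""
    copies = 0
    host_bounce_buffer = bytes(storage)      # copy 1: storage into system memory
    copies += 1
    gpu_memory = bytes(host_bounce_buffer)   # copy 2: system memory into GPU memory
    copies += 1
    return gpu_memory, copies

def direct_path(storage: bytes) -> tuple:
    """GPUDirect Storage path: a single DMA transfer, no CPU staging."""
    copies = 0
    gpu_memory = bytes(storage)              # DMA moves data directly to the GPU
    copies += 1
    return gpu_memory, copies

data = b"training batch"
via_cpu, cpu_copies = bounce_buffer_path(data)
direct, dma_copies = direct_path(data)
assert via_cpu == direct == data
print(cpu_copies, dma_copies)  # 2 1
```

The point of the sketch is the copy count: the direct path eliminates the CPU-side staging copy entirely, which is where the latency and CPU-utilization savings come from.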
How to Relieve I/O Bottlenecks
As GPU-accelerated applications process increasingly large datasets, relieving the congestion between storage and GPU memory becomes more important. For instance, training a neural network requires access to multiple sets of files several times a day, so efficient data transfers to the GPU significantly affect how long your AI model takes to train.
Deep learning training also involves checkpointing: saving trained weights to disk at regular intervals. Because checkpointing sits in the critical I/O path, reducing its overhead also enables faster model recovery after a failure.
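A minimal sketch of the checkpoint pattern described above, using Python's `pickle` as a stand-in for a framework's checkpoint writer (the file-name scheme and weight layout are hypothetical, not a real framework's API):

```python
import os
import pickle
import tempfile

def save_checkpoint(weights: dict, step: int, directory: str) -> str:
    """Write trained weights to disk. This write sits on the critical I/O
    path, so faster storage transfers directly shorten training stalls."""
    path = os.path.join(directory, f"checkpoint_{step}.pkl")
    with open(path, "wb") as f:
        pickle.dump(weights, f)
    return path

def load_checkpoint(path: str) -> dict:
    """Restore the most recently saved weights, e.g. after a failure."""
    with open(path, "rb") as f:
        return pickle.load(f)

checkpoint_dir = tempfile.mkdtemp()
weights = {"layer1": [0.1, 0.2], "layer2": [0.3]}
path = save_checkpoint(weights, step=1000, directory=checkpoint_dir)
restored = load_checkpoint(path)
assert restored == weights
```

In a real training loop this save happens every N steps, so any reduction in per-checkpoint I/O overhead is paid back many times over a long run.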
Scaled Storage and Bandwidth Possibilities for GPUs
A common theme of AI and data analytics is deriving insights from massive datasets. Permitting DMA operations directly from drives enables memory access with lower latency, higher bandwidth, and effectively unlimited storage capacity.
The NVIDIA DGX-2 has two CPUs, each containing two instances of a PCIe subtree. These multiple paths from storage make it a suitable candidate for prototyping and testing GPUDirect Storage technology. It has one PCIe slot for every two GPUs, and each slot can hold a NIC at 10.5 GB/s or a RAID card at 14 GB/s.
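Combining the per-slot figures above with the DGX-2's 16 GPUs gives a back-of-the-envelope aggregate bandwidth estimate (a rough upper bound, assuming every slot is populated and fully utilized):

```python
# Aggregate PCIe slot bandwidth on a DGX-2-class system,
# using the per-slot figures quoted above.
gpus = 16
slots = gpus // 2        # one PCIe slot per two GPUs -> 8 slots
nic_gb_s = 10.5          # GB/s per NIC
raid_gb_s = 14.0         # GB/s per RAID card

nic_total = slots * nic_gb_s    # all slots holding NICs
raid_total = slots * raid_gb_s  # all slots holding RAID cards
print(nic_total, raid_total)    # 84.0 112.0
```

Even the NIC-only configuration yields tens of GB/s of aggregate storage bandwidth, which is the scale at which bypassing the CPU bounce buffer starts to matter.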
Advantages of GPUDirect Storage
A vital benefit of GPUDirect Storage is fast data access across multiple sources. Using system memory and internal NVMe does not rule out RAID storage or NVMe-oF. Bi-directional bandwidth enables sophisticated data choreography, pulling data from cached local disks or storage area networks as needed.
These transfers also facilitate collaboration with CPUs via data structures in system memory. GPUDirect Storage provides value in various ways, including:
- Two to eight times higher bandwidth with direct data transfers between GPU and storage.
- Precise, low latency data transfers that avoid bounce buffers.
- Stable data transfers with increasing GPU concurrency.
- Less interference with the GPU's compute load when DMA engines near the storage handle transfers.

At larger transfer sizes, GPUDirect Storage delivers a higher ratio of bandwidth to fractional CPU utilization. In addition to being the system's most capable compute engine, the GPU also becomes the hardware with the most I/O bandwidth.
All of the above benefits are attainable regardless of where the data is stored. GPUDirect Storage becomes a force multiplier when your data processing systems switch to GPU execution. This shift is especially beneficial when system memory can no longer keep pace with growing dataset sizes.
Emerging technologies such as AI and machine learning require sophisticated computer hardware and software to achieve their full potential. GPUDirect Storage is an innovative option that improves overall performance by implementing efficient data transfer and management solutions. In addition to improving data analytics, this technology is bound to offer additional benefits such as convenience, durability, and value for money.