Thursday, November 12th
1:45-3:15
Session B-11: Storage for Model Training and Execution (AI/ML Track)
Organizer: Nisha Talagala, CEO, Pyxeda AI

Paper Title: Accelerating the Data Path to the GPU for AI and Beyond

Paper Abstract: As workflows shift away from the CPU in GPU-centric systems, the data path from storage to GPUs increasingly becomes the bottleneck. NVIDIA and its partners are relieving that bottleneck with a new technology called GPUDirect Storage, which includes a new set of “cuFile” interfaces. When partners are enabled with GPUDirect Storage, the Direct Memory Access (DMA) engine in a NIC or local storage device can move data directly to and from GPU memory, rather than staging it through a bounce buffer in CPU system memory. This can improve bandwidth, reduce latency, cut CPU-side memory management overhead, and reduce interference with CPU utilization. GPUDirect Storage was first revealed at the 2019 Flash Memory Summit. In this talk, we’ll show what has happened since then. We’ll illustrate the benefits of GPUDirect Storage with recent results from demos and proof points in AI, data analytics, and visualization. We’ll also describe technical enhancements, including a compatibility mode that allows the same APIs to be used even when not all software components and support are in place.
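For readers new to the cuFile interfaces, the sketch below shows the basic call sequence for reading a file directly into GPU memory. It is a minimal illustration, not NVIDIA's reference code: the file path and transfer size are placeholders, most error handling is omitted, and it assumes a Linux host with the CUDA runtime and the cuFile library (libcufile) installed. In compatibility mode, the same calls work even without end-to-end GPUDirect Storage support, with the library falling back to a CPU bounce buffer internally.

/* Minimal sketch: read a file straight into GPU memory with cuFile.
 * Assumes Linux + CUDA + libcufile; path and size are placeholders. */
#define _GNU_SOURCE                     /* for O_DIRECT */
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <cuda_runtime.h>
#include <cufile.h>

int main(void) {
    const size_t size = 1 << 20;        /* 1 MiB, 4 KiB-aligned for O_DIRECT */
    int fd = open("/data/sample.bin", O_RDONLY | O_DIRECT);  /* placeholder path */
    if (fd < 0) { perror("open"); return 1; }

    cuFileDriverOpen();                 /* initialize the cuFile driver */

    CUfileDescr_t descr;
    memset(&descr, 0, sizeof(descr));
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;

    CUfileHandle_t fh;
    cuFileHandleRegister(&fh, &descr);  /* import the open file into cuFile */

    void *dev_buf = NULL;
    cudaMalloc(&dev_buf, size);         /* destination buffer lives in GPU memory */
    cuFileBufRegister(dev_buf, size, 0);/* register the GPU buffer for DMA */

    /* With full GPUDirect Storage support this DMAs from storage directly
     * into GPU memory, bypassing the CPU bounce buffer; in compatibility
     * mode the same call falls back transparently. */
    ssize_t n = cuFileRead(fh, dev_buf, size, 0, 0);
    printf("read %zd bytes into GPU memory\n", n);

    cuFileBufDeregister(dev_buf);
    cudaFree(dev_buf);
    cuFileHandleDeregister(fh);
    close(fd);
    cuFileDriverClose();
    return 0;
}

A build line along the lines of gcc gds_read.c -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcufile -lcudart (paths vary by installation) will link against the cuFile and CUDA runtime libraries.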

Paper Author: Kiran Modukuri, Principal Software Engineer, NVIDIA

Author Bio: Kiran Modukuri is a Principal Software Engineer at NVIDIA, where he works on accelerating I/O pipelines. He is the architect of the GPUDirect Storage product. Before joining NVIDIA, he was a software engineer at NetApp. He earned a Master’s degree in computer science from the University of Arizona and has over 15 years of experience in the technology industry.