Authors: Ed Karrels (University of Illinois), Lei Huang (Texas Advanced Computing Center (TACC)), Yuhong Kan and Ishank Arora (University of Texas), Yinzhi Wang (Texas Advanced Computing Center (TACC)), Daniel S. Katz and William Gropp (University of Illinois), and Zhao Zhang (Texas Advanced Computing Center (TACC))
Abstract: A burst buffer is commonly deployed on large-scale supercomputers to bridge the performance gap between the shared file system and the I/O needs of modern supercomputing applications. Existing I/O sharing methods require resource isolation, offline profiling, or repeated execution, each of which significantly limits the utilization and applicability of these systems. Here we present ThemisIO, a policy-driven I/O sharing framework for a remote-shared burst buffer. ThemisIO can accurately and efficiently allocate I/O cycles among applications based purely on real-time I/O behavior, without requiring user-supplied information or offline-profiled application characteristics. By exploiting a statistical token-based strategy, ThemisIO can precisely balance I/O cycles between applications via time slicing to enforce processing isolation, enabling a variety of fair sharing policies. Our experiments show that ThemisIO sustains 13.5–13.7% higher I/O throughput and 19.5–40.4% lower performance variation than existing algorithms. For applications, ThemisIO significantly reduces or nearly eliminates the slowdown caused by I/O interference.
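The abstract does not spell out how the statistical token-based strategy works, so the following Python sketch is only a rough illustration under stated assumptions, not ThemisIO's implementation. It assumes a lottery-style draw: in each time slice, the server picks one application at random with probability proportional to its policy share, considering only applications that currently have queued requests. The names `pick_next_app`, `simulate`, `shares`, and `pending` are hypothetical and chosen for this example.

```python
import random
from collections import Counter


def pick_next_app(shares, pending, rng):
    """Pick the application to serve in the next time slice.

    shares:  dict mapping app name -> policy weight (hypothetical input)
    pending: dict mapping app name -> number of queued I/O requests
    Returns an app name, or None when no app has queued requests.
    """
    # Assumption: only apps with queued requests compete for the slice,
    # so an idle app never holds I/O cycles it cannot use.
    active = [app for app in shares if pending.get(app, 0) > 0]
    if not active:
        return None
    weights = [shares[app] for app in active]
    # One statistical "token" draw: selection probability is proportional
    # to the app's share among the currently active apps.
    return rng.choices(active, weights=weights, k=1)[0]


def simulate(shares, pending, slices, seed=0):
    """Serve up to `slices` requests and count how many each app received."""
    rng = random.Random(seed)
    pending = dict(pending)  # work on a copy; leave the caller's dict intact
    served = Counter()
    for _ in range(slices):
        app = pick_next_app(shares, pending, rng)
        if app is None:
            break  # every queue is empty
        pending[app] -= 1
        served[app] += 1
    return served


if __name__ == "__main__":
    # Two apps with equal shares and deep queues should split slices about evenly.
    print(simulate({"A": 1, "B": 1}, {"A": 10**6, "B": 10**6}, slices=10_000))
```

Because the draw is random, the split between applications is only approximately proportional over a short window and converges to the policy shares as the number of slices grows. Changing the values in `shares` is how this sketch would express a different sharing policy.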