Toward Collaborative Continuous Benchmarking for HPC

SC23 Proceedings

Workshop: First International Workshop on HPC Testing and Evaluation of Systems, Tools, and Software (HPCTESTS 2023)

Authors: Olga Pearce (Lawrence Livermore National Laboratory, Texas A&M University); Alec Scott, Gregory Becker, Riyaz Haque, Nathan Hanford, and Stephanie Brink (Lawrence Livermore National Laboratory); Doug Jacobsen (Google LLC); Heidi Poxon (Amazon); Jens Domke (RIKEN); and Todd Gamblin (Lawrence Livermore National Laboratory)

Abstract: Benchmarking is integral to procurement of HPC systems, communicating HPC center workloads to HPC vendors, and verifying performance of the delivered HPC systems. Currently, HPC benchmarking is manual and challenging at every step, posing a high barrier to entry, and hampering reproducibility of the benchmarks across different HPC systems. We propose collaborative continuous benchmarking to enable functional reproducibility, automation, and community collaboration in HPC benchmarking. We define the minimal requirements for collaborative continuous benchmarking and develop a common language to streamline the interactions between HPC centers, vendors, and researchers. We demonstrate the initial implementation of collaborative continuous benchmarking, and introduce an open source continuous benchmarking repository, Benchpark, for community collaboration. We believe collaborative continuous benchmarking will help overcome the human bottleneck in HPC benchmarking, enabling better evaluation of our systems and enabling a more productive collaboration within the HPC community.