SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

Adaptive Stopping Rule for Performance Measurements


Workshop: PMBS23: The 14th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems

Authors: Viyom Mittal, Pedro Bruel, Dejan Milojicic, and Eitan Frachtenberg (Hewlett Packard Enterprise (HPE))


Abstract: Performance variability in complex computer systems is a major challenge for accurate benchmarking and performance characterization, especially for tightly coupled, large-scale high-performance computing systems. Point summaries of performance may be both uninformative, if they do not capture the full richness of the system's behavior, and inaccurate, if they are derived from an inadequate sample of measurements. Determining the correct sample size requires balancing tradeoffs of computation, methodology, and statistical power.

We treat the performance distribution as the primary target of the performance evaluation, from which all other metrics can be derived. We propose and evaluate a meta-heuristic that dynamically characterizes the performance distribution, determining when enough samples have been collected to approximate the true distribution. Compared to fixed stopping criteria, this adaptive method can be more efficient in resource use and more accurate. Importantly, it requires no advance assumptions about the system under test or its performance characteristics.
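The abstract does not spell out the meta-heuristic itself, so the following is only a minimal sketch of what a distribution-based adaptive stopping rule could look like, not the authors' method. It assumes one possible criterion: keep measuring in batches until the empirical distributions of the first and second halves of the collected samples agree, as judged by the Kolmogorov-Smirnov distance, for several consecutive checks. The names `ks_distance`, `adaptive_sample`, `measure`, `batch`, `tol`, `patience`, and `max_samples`, along with their default values, are hypothetical and chosen for illustration.

```python
import bisect
import random


def ks_distance(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    # The largest gap between two step functions occurs at a sample point.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def adaptive_sample(measure, batch=100, tol=0.1, patience=2, max_samples=20_000):
    """Collect measurements until the sampled distribution looks stable.

    Assumed criterion (illustrative only): the earlier half and the later
    half of the samples must be within `tol` in KS distance for `patience`
    consecutive checks. `max_samples` is a soft cap on the total cost.
    """
    samples = []
    stable = 0
    while len(samples) < max_samples:
        samples.extend(measure() for _ in range(batch))
        half = len(samples) // 2
        if ks_distance(samples[:half], samples[half:]) < tol:
            stable += 1
            if stable >= patience:
                break
        else:
            stable = 0  # distribution still shifting, so start over
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a real benchmark run: a noisy runtime in seconds.
    runs = adaptive_sample(lambda: rng.gauss(10.0, 1.0))
    print(len(runs), "measurements collected")
```

Comparing the two halves in collection order, rather than a fixed count, means a drifting or multimodal system keeps the rule from firing early, while a stable system stops after relatively few batches. Any summary metric (mean, median, tail percentiles) can then be computed from the returned samples.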




