SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Research Posters Archive

Transfer Learning Workflow for High-Quality I/O Bandwidth Prediction with Limited Data

Authors: Dmytro Povaliaiev (RWTH Aachen University); Radita Liem (RWTH Aachen University, IT Center); Julian Kunkel (University of Göttingen, GWDG, Germany); Jay Lofstead (Sandia National Laboratories); and Philip Carns (Argonne National Laboratory (ANL))

Abstract: Predicting I/O performance is challenging because many intertwined variables inside a cluster affect it. This complexity makes I/O performance prediction a strong candidate for machine learning. However, a high-quality prediction requires a large amount of equivalent-quality data, and collecting that data is a major challenge for most data centers.

In this project, we explore transfer learning for I/O performance prediction, using the publicly available I/O performance data in Darshan logs from NCSA's Blue Waters supercomputer. We devise a workflow that trains a neural network model as a base for predicting the POSIX I/O bandwidth of other clusters (CLAIX18 and Theta). With less than 1% of the data needed to build the base model, our experiments show that the transfer learning workflow can predict the I/O bandwidth of another system with a mean absolute error better than or equivalent to the state of the art.
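
The general idea of such a workflow can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation: the synthetic data, the two input features, the network size, the learning rates, and the choice to freeze the hidden layer and fine-tune only the output layer are all assumptions made for illustration. A small network is first trained on plentiful "source system" data and then adapted with a handful of "target system" samples (under 1% of the source set).

```python
import math
import random

random.seed(0)
H = 4  # hidden units (illustrative size, not from the poster)

def make_data(n, offset):
    # Synthetic stand-in for (job features -> bandwidth); NOT real Darshan data.
    rows = []
    for _ in range(n):
        x = [random.random(), random.random()]
        rows.append((x, 1.5 * x[0] + 0.5 * math.sin(3.0 * x[1]) + offset))
    return rows

def hidden(model, x):
    return [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
            for ws, b in zip(model["W1"], model["b1"])]

def predict(model, x):
    h = hidden(model, x)
    return sum(w * hi for w, hi in zip(model["w2"], h)) + model["b2"]

def mse(model, rows):
    return sum((predict(model, x) - y) ** 2 for x, y in rows) / len(rows)

def mae(model, rows):
    return sum(abs(predict(model, x) - y) for x, y in rows) / len(rows)

def train_base(rows, epochs=30, lr=0.05):
    # Base model: one tanh hidden layer, trained with per-sample SGD.
    model = {
        "W1": [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(H)],
        "b1": [0.0] * H,
        "w2": [random.uniform(-0.5, 0.5) for _ in range(H)],
        "b2": 0.0,
    }
    for _ in range(epochs):
        for x, y in rows:
            h = hidden(model, x)
            err = sum(w * hi for w, hi in zip(model["w2"], h)) + model["b2"] - y
            for j in range(H):
                # Backpropagate through tanh using the pre-update output weight.
                grad_pre = err * model["w2"][j] * (1.0 - h[j] ** 2)
                model["w2"][j] -= lr * err * h[j]
                model["b1"][j] -= lr * grad_pre
                for i in range(2):
                    model["W1"][j][i] -= lr * grad_pre * x[i]
            model["b2"] -= lr * err
    return model

def fine_tune(base, rows, steps=200, lr=0.05):
    # Assumption: hidden layer (W1, b1) is frozen; only the output layer is updated.
    model = {"W1": base["W1"], "b1": base["b1"],
             "w2": list(base["w2"]), "b2": base["b2"]}
    feats = [(hidden(model, x), y) for x, y in rows]
    for _ in range(steps):
        gw, gb = [0.0] * H, 0.0
        for h, y in feats:
            err = sum(w * hi for w, hi in zip(model["w2"], h)) + model["b2"] - y
            for j in range(H):
                gw[j] += err * h[j] / len(feats)
            gb += err / len(feats)
        # Full-batch gradient step on the output layer only.
        model["w2"] = [w - lr * g for w, g in zip(model["w2"], gw)]
        model["b2"] -= lr * gb
    return model

source = make_data(2000, offset=0.0)  # plentiful "source system" samples
target = make_data(16, offset=1.0)    # under 1% as many "target system" samples

base = train_base(source)
tuned = fine_tune(base, target)

base_mse, tuned_mse = mse(base, target), mse(tuned, target)
base_mae, tuned_mae = mae(base, target), mae(tuned, target)
print(f"target MAE before fine-tuning: {base_mae:.3f}")
print(f"target MAE after fine-tuning:  {tuned_mae:.3f}")
```

In this sketch the errors are measured on the same few target samples used for fine-tuning, which keeps the example short; a realistic evaluation would report the mean absolute error on held-out target-system data.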

Best Poster Finalist (BP): no
