Workshop: HUST-23: 10th International Workshop on HPC User Support Tools
Authors: Chun-Yaung Lu and Kent Milfeld (Texas Advanced Computing Center (TACC))
Abstract: The paper presents improvements in Remora (REsource Monitoring for Remote Applications), a user-oriented, lightweight system monitoring tool designed for modern HPC systems. Assessing application performance can be complicated, and gathering metrics from various components may overwhelm end users. Some HPC users could therefore improve the performance of their applications if easy-to-use monitoring tools aimed at non-HPC experts were available. Remora addresses this by providing simple tools for quick diagnostic assessments of an application’s resource usage and by offering flexible, adaptable workflow support. The new release, REMORA v2, provides performance updates and new features. Other improvements include RemoraPy, a Python wrapper, and RP-Stats, a JupyterLab-based GUI, which enhance data collection, visualization, and analysis capabilities.