SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Birds of a Feather

Enabling I/O and Computation Malleability in High-Performance Computing


Authors: Jesus Carretero (University Carlos III of Madrid, Spain), Estela Suarez (Jülich Supercomputing Centre), Martin Schulz (Technical University Munich), Martin Schreiber (Université Grenoble Alpes), Sergio Iserte (Barcelona Supercomputing Center (BSC)), Tapasya Patki (Lawrence Livermore National Laboratory), Henry Tufo (University of Colorado), Hans-Christian Hoppe (Forschungszentrum Juelich)

Abstract: Traditional interest in increasing the parallelism for individual jobs in HPC systems is conditioned by the diversity and dynamics of their resource demands at runtime. Malleability techniques can help to dynamically adapt resource usage to achieve maximum efficiency. Malleable HPC systems, however, present a series of fundamental research challenges in the fields of resource management, scheduling, malleability control, flexibilization of application structures, and data movement. All aforementioned issues will be discussed in the proposed Birds of a Feather session, which aims at building a community of developers and users around the topic of malleability in high-performance computing, networking, and storage.

Long Description: Traditional interest in increasing parallelism for individual jobs in HPC systems has been conditioned by the variety and dynamic nature of the resource demands of jobs, both applications and workflows, at runtime. Malleability techniques can help to adapt resource usage dynamically to achieve maximum efficiency by adjusting the computation and storage needs of applications, on the one hand, and the allocation of hardware resources to them, on the other, when applications enter execution phases that require fewer, more, or different computational or storage resources than those currently allocated. Malleable HPC systems, however, face a series of fundamental research challenges in areas such as resource management, scheduling, malleability control, flexibilization of application structures, and data movement. All of these issues will be addressed in the proposed Birds of a Feather session, which aims at building a community of developers and users around the topic of malleability in high-performance computing, networking, and storage.

Goal: The goal of this BoF session is to discuss malleability techniques and their impact on applications and systems. We will use the BoF to solicit input from interested parties to drive the development of future academic and commercial solutions aimed at supporting malleability in computing and I/O, with the final objective of including them in standards, such as MPI or PMIx.

Topics: Malleable systems face a series of fundamental research challenges, including: Who initiates changes? How are changes communicated to applications? How can the optimal usage of the available resources be determined? How can applications cope with dynamically changing resources? What should malleable programming models and abstractions look like? How should scalable resource management frameworks for malleable systems be designed? Which resources may benefit from malleability, and which (if any) should still be managed statically? To advance solutions to these challenges, the BoF session will focus on the following topics of discussion: system architecture considerations to enable efficient implementation of malleability; runtimes, parallel programming models and techniques, and libraries supporting malleability; malleable scheduling and load distribution considering multicriteria aspects, such as computing, I/O, fault tolerance, and energy efficiency; potential usage of AI techniques to steer malleability in systems and applications; and support for malleable applications in performance, debugging, and correctness tools.

We will publicize the event through social networks, email, and the web to attract attendees. We will also use networks related to supercomputing centers around the world to attract experts and users with an interest in this topic.


Website: https://admire-eurohpc.eu/sc23-bof-enabling-i-o-and-computation-malleability-in-high-performance-computing/




