SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

Dask-Extended External Tasks for HPC/ML In Transit Workflows


Workshop: 6th International Workshop on Emerging Parallel Distributed Runtime Systems and Middleware

Authors: Amal Gueroudji (Argonne National Laboratory (ANL)), Julien Bigot (Atomic Energy and Alternative Energies Commission (CEA)), Bruno Raffin (INRIA), and Robert Ross (Argonne National Laboratory (ANL))


Abstract: In situ workflows are essential to fully leverage exascale architectures. They can, however, be complex to build, as simulation and data analytics come from two different software ecosystems, each with its own paradigms. We extend deisa by introducing the concept of external tasks, which supports describing analytics graphs that span multiple timesteps ahead of time while improving scalability. This new approach leads to straightforward support for contracts that limit the data transferred to what is actually analyzed in a given execution. We implement this approach using Dask and MPI and evaluate it using an in-transit workflow that uses an unsupervised ML model. We compare our work to plain Dask and to the previous version of deisa. Our work performs better than deisa, by up to ×7 for the simulation and ×3 for the analytics, and is ×18 less costly than plain Dask, all with similar development effort.




