SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

Accelerating Data-Intensive Seismic Research Through Parallel Workflow Optimization and Federated Cyberinfrastructure


Workshop: The 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23) - Part 1 of 2

Authors: Marcus Adair, Ivan Rodero, and Manish Parashar (University of Utah; University of Utah, Scientific Computing and Imaging Institute (SCI)) and Diego Melgar (University of Oregon)


Abstract: Earthquake early warning systems use synthetic data from simulation frameworks like MudPy to train models for predicting the magnitudes of large earthquakes. MudPy, although powerful, has limitations: a lengthy simulation time to generate the required data, lack of user-friendliness, and no platform for discovering and sharing its data. We introduce FakeQuakes DAGMan Workflow (FDW), which utilizes Open Science Grid (OSG) for parallel computations to accelerate and streamline MudPy simulations. FDW significantly reduces runtime and increases throughput compared to a single-machine setup. Using FDW, we also explore partitioned parallel HTCondor DAGMan workflows to enhance OSG efficiency. Additionally, we investigate leveraging cyberinfrastructure, such as Virtual Data Collaboratory (VDC), for enhancing MudPy and OSG. Specifically, we simulate using Cloud bursting policies to enforce FDW job-offloading to VDC during OSG peak demand, addressing shared resource issues and user goals; we also discuss VDC’s value in facilitating a platform for broad access to MudPy products.





Back to The 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23) - Part 1 of 2 Archive Listing



Back to Full Workshop Archive Listing