SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Research Posters Archive

Optimizing Workflow Performance by Elucidating Semantic Data Flow

Authors: Meng Tang (Illinois Institute of Technology), Nathan R. Tallent (Pacific Northwest National Laboratory (PNNL)), and Anthony Kougkas and Xian-He Sun (Illinois Institute of Technology)

Abstract: Distributed scientific workflows are becoming data-intensive, and data movement through storage systems often causes bottlenecks. It is therefore critical to understand data flow. Many scientific datasets incorporate domain semantics through formats such as HDF and NetCDF, which enhance the interpretability and context of the data for analysis. We shed new light on workflow bottlenecks by examining how semantic datasets flow through storage. Our approach combines careful runtime measurement, recovery of the mapping from domain semantics to low-level I/O operations, and effective visualization and analysis of semantic flows.

Best Poster Finalist (BP): no

