Authors: Meng Tang (Illinois Institute of Technology), Nathan R. Tallent (Pacific Northwest National Laboratory (PNNL)), and Anthony Kougkas and Xian-He Sun (Illinois Institute of Technology)
Abstract: Distributed scientific workflows are becoming data-intensive, and data movement through storage systems often causes bottlenecks. It is therefore critical to understand data flow. Many scientific datasets incorporate domain semantics through formats like HDF and NetCDF, enhancing the interpretability and context of the data for analysis. We offer new insights into workflow bottlenecks by examining how semantic datasets flow through storage. We provide this perspective through careful runtime measurement, recovery of the mapping from domain semantics to low-level I/O operations, and effective visualization and analysis of semantic flows.
Best Poster Finalist (BP): no
Poster: PDF
Poster summary: PDF