SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

ACM Student Research Competition Poster Archive

Sensitivity of Black-Box Statistical Prediction of Lossy Compression Ratios for 3D Scientific Data

Student: Alexandra Poulos (Clemson University)
Supervisor: Jon Calhoun (Clemson University)

Abstract: Compression ratio estimation is an important optimization for I/O workflows that process terabytes of data. Applications such as compression auto-tuning and lossy compressor selection require estimation that is both high-throughput and accurate. Prior work that relies on sampling is fast but inaccurate, while approaches that do not sample are highly accurate but slow. Through sensitivity analysis, we show that sampling a small number of moderately sized data blocks maintains fast data transfer and yields better estimation accuracy than existing sampling approaches, and we use this result to construct a new sampling method that is both fast and accurate. Relative to non-sampling techniques, our method degrades estimation accuracy by less than 10% while maintaining the high throughput of the less accurate sampling methods.
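
The block-sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method or statistical predictor: it assumes a synthetic 3D field, uses lossless zlib from the Python standard library as a stand-in for a lossy compressor, and all function names (make_field, pack_block, sampled_ratio, full_ratio) and parameter values are hypothetical. It compresses a few randomly placed cubic blocks and uses their pooled ratio as the estimate, which can then be compared with the ratio obtained by compressing the whole field.

```python
import math
import random
import struct
import zlib


def make_field(n):
    """Build a smooth synthetic n*n*n field as a flat, row-major list (assumed test data)."""
    return [
        math.sin(0.2 * x) * math.cos(0.15 * y) + 0.01 * z
        for x in range(n) for y in range(n) for z in range(n)
    ]


def pack_block(field, n, origin, b):
    """Serialize the b*b*b sub-block starting at `origin` as little-endian float32 bytes."""
    x0, y0, z0 = origin
    values = []
    for x in range(x0, x0 + b):
        for y in range(y0, y0 + b):
            base = (x * n + y) * n + z0  # flat index of (x, y, z0)
            values.extend(field[base:base + b])
    return struct.pack(f"<{len(values)}f", *values)


def sampled_ratio(field, n, b, num_blocks, seed=0):
    """Estimate the compression ratio from `num_blocks` random blocks of edge length `b`."""
    rng = random.Random(seed)
    raw = packed = 0
    for _ in range(num_blocks):
        origin = tuple(rng.randint(0, n - b) for _ in range(3))
        block = pack_block(field, n, origin, b)
        raw += len(block)
        # zlib is a lossless stand-in here; a lossy compressor would go in its place.
        packed += len(zlib.compress(block))
    return raw / packed


def full_ratio(field, n):
    """Reference ratio from compressing the entire field (the slow, non-sampling path)."""
    data = pack_block(field, n, (0, 0, 0), n)
    return len(data) / len(zlib.compress(data))


if __name__ == "__main__":
    n = 32
    field = make_field(n)
    estimate = sampled_ratio(field, n, b=8, num_blocks=4)
    reference = full_ratio(field, n)
    print(f"sampled estimate: {estimate:.3f}, full-data ratio: {reference:.3f}")
```

Varying b and num_blocks in this sketch gives a rough feel for the trade-off the abstract describes: fewer or smaller blocks mean less data to read and compress, at the cost of a noisier estimate.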

ACM-SRC Semi-Finalist: no
