Workshop: EduHPC-23: Workshop on Education for High Performance Computing
Authors: Jack Marquez and Befikir Bogale (University of Tennessee), Aashish Pandey (University of North Texas), Nigel Tan and Lauren Whitnah (University of Tennessee), Sanjukta Bhowmick (University of North Texas), and Michela Taufer (University of Tennessee)
Abstract: Optimizing data workflows for performance can introduce non-deterministic communication. Non-determinism can seriously affect software correctness and compromise reproducibility in scientific discovery. We design and implement tutorial modules to demonstrate the impact of non-determinism in data science workflows. We use ANACIN-X, a framework of test cases and tools for analytics and visualization. By completing our modules, students, researchers, and data science professionals will understand non-determinism, how it affects their applications, how to quantify it, and how to identify its root sources.