SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Technical Papers Archive

Parallel Top-K Algorithms on GPU: A Comprehensive Study and New Methods


Authors: Jingrong Zhang, Akira Naruse, Xipeng Li, and Yong Wang (NVIDIA Corporation)

Abstract: The top-K problem is an essential part of many important applications in scientific computing, information retrieval, etc. As data volume grows rapidly, high-performance parallel top-K algorithms become critical. We propose two parallel top-K algorithms, AIR top-K (Adaptive and Iteration-fused Radix top-K) and GridSelect, for GPU. AIR top-K employs an iteration-fused design to minimize CPU-GPU communication and device data access. Its adaptive strategy eliminates unnecessary device memory traffic automatically under various data distributions. GridSelect can process data on-the-fly. It adopts a shared queue and parallel two-step insertion to decrease the frequency of costly operations. We comprehensively compare 8 open-source GPU implementations and our methods for a wide range of problem sizes and data distributions. For batch sizes 1 and 100, respectively, AIR top-K shows 1.98-21.48X and 8.01-574.78X speedup over previous radix top-K algorithm, and 1.44-7.34X and 1.38-31.91X speedup over state-of-the-art methods. GridSelect shows up to 882.29X speedup over its baseline.




Back to Technical Papers Archive Listing