Authors: Junmin Xiao (Institute of Computing Technology, Chinese Academy of Sciences); Chaoyang Shui (Institute of Computing Technology, Chinese Academy of Sciences); and Di Cai, Kangyu Wang, Yunfei Pang, Mingyi Li, Hui Ma, and Guangming Tan (Institute of Computing Technology, Chinese Academy of Sciences)
Abstract: Global ocean data assimilation is a crucial technique for estimating the actual oceanic state by combining numerical model outputs with observation data, and it is widely used in climate research. Because observation data are unevenly distributed across the global ocean, the parallel efficiency of recent methods suffers from workload imbalance. When a massive number of GPUs is applied to global ocean data assimilation, the workload imbalance becomes more severe, resulting in poor scalability. In this work, we propose a novel adaptive workload-balance scheduling strategy for assimilation, which estimates the total workload prior to execution and ensures a balanced workload assignment. Further, we design a parallel dynamic programming approach to accelerate the scheduling decision, and develop a factored dataflow to exploit the parallel potential of GPUs. Evaluation demonstrates that our algorithm outperforms the state-of-the-art method by up to 9.1x. This work is the first to scale global ocean data assimilation to 4,000 GPUs.