SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Technical Papers Archive

A Quantitative Approach for Adopting Disaggregated Memory in HPC Systems


Authors: Jacob Wahlgren and Gabin Schieffer (KTH Royal Institute of Technology), Maya Gokhale (Lawrence Livermore National Laboratory), and Ivy B. Peng (KTH Royal Institute of Technology)

Abstract: Memory disaggregation has recently been adopted in major data centers to improve resource utilization, driven by cost and sustainability. Meanwhile, studies of large-scale HPC facilities have also highlighted memory under-utilization. A promising and non-disruptive option for memory disaggregation is rack-scale memory pooling, where node-local memory is supplemented by shared memory pools. This work outlines the prospects and requirements for adoption and clarifies several misconceptions. We propose a quantitative method that dissects application requirements on the memory system at three levels, moving from general requirements, to multi-tier memory, and then to memory pooling. We also provide tools to facilitate the quantitative approach. We evaluate a set of representative HPC workloads on an emulated platform. Our results show that the impact of interference in memory pooling varies across applications, depending on access ratio and arithmetic intensity. Finally, we apply our method in two case studies to show benefits at both the application and system level.



