SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Technical Papers Archive

Calculon: a Methodology and Tool for High-Level Codesign of Systems and Large Language Models


Authors: Mikhail Isaev (Georgia Institute of Technology), Nic McDonald and Larry Dennison (NVIDIA Corporation), and Richard Vuduc (Georgia Institute of Technology)

Abstract: This paper presents a parameterized analytical performance model of transformer-based Large Language Models (LLMs) for guiding high-level algorithm-architecture codesign studies. This model derives from an extensive survey of performance optimizations that have been proposed for the training and inference of LLMs; the model's parameters capture application characteristics, the hardware system, and the space of implementation strategies. With such a model, we can systematically explore a joint space of hardware and software configurations to identify optimal system designs under given constraints, like the total amount of system memory. We implemented this model and methodology in a Python-based open-source tool called Calculon. Using it, we identified novel system designs that look significantly different from current inference and training systems, showing quantitatively the estimated potential to achieve higher efficiency, lower cost, and better scalability.




Back to Technical Papers Archive Listing