HPC-GPT: Integrating Large Language Model for High-Performance Computing

SC23 Proceedings

Workshop: Workshop on Machine Learning with Graphs in High Performance Computing Environments

Authors: Xianzhong Ding (University of California, Merced); Le Chen (Iowa State University); Murali Emani (Argonne National Laboratory (ANL)); Chunhua Liao, Pei-Hung Lin, and Tristan Vanderbruggen (Lawrence Livermore National Laboratory); Zhen Xie (Argonne National Laboratory); and Alberto Cerpa and Wan Du (University of California, Merced)

Abstract: Large Language Models (LLMs), including the LLaMA model, have proven effective across a range of general-domain natural language processing (NLP) tasks. However, their performance on high-performance computing (HPC) domain tasks has been less than optimal because of the specialized expertise required to interpret the model’s responses. In response to this challenge, we propose HPC-GPT, a novel LLaMA-based model that has undergone supervised fine-tuning on generated QA (Question-Answer) instances for the HPC domain. To evaluate its effectiveness, we concentrate on two HPC tasks: managing AI models and datasets for HPC, and data race detection. With HPC-GPT, we demonstrate performance comparable to existing methods on both tasks, showing its strength in HPC-related scenarios. Our experiments on open-source benchmarks yield extensive results, underscoring HPC-GPT’s potential to bridge the performance gap between LLMs and HPC-specific tasks.
