Reducing Memory Requirements for the IPU Using Butterfly Factorizations

SC23 Proceedings

Workshop: PMBS23: The 14th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems

Authors: SeyedKazem Shekofteh, Christian Alles, and Holger Fröning (Heidelberg University, Institute of Computer Engineering (ZITI))

Abstract: High Performance Computing (HPC) has benefited from many improvements over the last decades, especially in hardware platforms that provide more processing power while keeping power consumption at a reasonable level. The Intelligence Processing Unit (IPU) is a new type of massively parallel processor, designed to speed up parallel computations with a large number of processing cores and on-chip memory components connected by high-speed fabrics. IPUs mainly target machine learning applications; however, due to the architectural differences between GPUs and IPUs, especially the significantly smaller memory capacity of an IPU, methods for reducing model size by sparsification have to be considered. Butterfly factorizations are well-known replacements for fully-connected and convolutional layers. We examine how butterfly structures can be implemented on an IPU and study their behavior and performance compared to a GPU.
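
To illustrate why butterfly factorizations reduce memory requirements, the following is a minimal NumPy sketch, not the authors' IPU or GPU implementation. It assumes the standard construction in which an n x n matrix (n a power of two) is replaced by a product of log2(n) sparse factors, each holding 2n nonzero entries. The function names `butterfly_factors`, `butterfly_param_count`, and `apply_butterfly` are hypothetical and chosen only for this example.

```python
import numpy as np


def butterfly_factors(n, seed=0):
    """Return log2(n) random butterfly factors as dense n x n arrays.

    Assumption: n is a power of two. Each factor is block diagonal with
    blocks of size 2*half, and each block couples entry j with entry
    j + half through a 2x2 set of weights, giving 2n nonzeros per factor.
    """
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two and at least 2")
    rng = np.random.default_rng(seed)
    factors = []
    half = 1
    while half < n:
        factor = np.zeros((n, n))
        for start in range(0, n, 2 * half):
            for j in range(half):
                top, bottom = start + j, start + j + half
                w = rng.standard_normal(4)
                factor[top, top] = w[0]
                factor[top, bottom] = w[1]
                factor[bottom, top] = w[2]
                factor[bottom, bottom] = w[3]
        factors.append(factor)
        half *= 2
    return factors


def butterfly_param_count(n):
    """Number of stored weights: 2n per factor, log2(n) factors."""
    return 2 * n * (n.bit_length() - 1)


def apply_butterfly(factors, x):
    """Apply the factors one after another instead of one dense matrix."""
    y = x
    for factor in factors:
        y = factor @ y
    return y


if __name__ == "__main__":
    n = 16
    factors = butterfly_factors(n)
    x = np.ones(n)
    y = apply_butterfly(factors, x)
    print("dense parameters:    ", n * n)
    print("butterfly parameters:", butterfly_param_count(n))
    print("output shape:        ", y.shape)
```

The factors are stored densely here only for readability; a practical implementation would keep just the 2n weights per factor, so the parameter count grows as 2n log2(n) rather than n^2.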
