SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

ACM Student Research Competition Poster Archive

A Formal Specification of Tensor Cores via Satisfiability Modulo Theories


Student: Benjamin Valpey (University of Rochester)
Supervisor: Sreepathi Pai (University of Rochester)

Abstract: In this work, we explore how to replicate the behavior of undocumented hardware units -- in this case, NVIDIA's Tensor Cores -- and reason about them.

While prior work has employed manual testing to identify hardware behavior, we show that SMT can be used to generate inputs that can discriminate between different hardware implementation choices. We argue that SMTLIB, the language specification for SMT solvers, is well suited for exposing hardware implementations.

Using our method, we create a formal specification of the tensor cores on NVIDIA's Volta architecture. We confirm many of the findings of previous studies on tensor cores, but also identify two discrepancies: we find that the hardware does not use IEEE-754 round-to-zero for accumulation and that the 5-term accumulator requires 3 extra bits for carry out since it does not normalize intermediate sums.

The work will be presented in person using the poster as a visual aid.


ACM-SRC Semi-Finalist: no

Poster: PDF
Poster Summary: PDF


Back to Poster Archive Listing