SC23 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Toward Foundation Models for Materials Science: The Open MatSci ML Toolkit


Workshop: Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S)

Authors: Kin Long Kelvin Lee (Intel Corporation), Carmelo Gonzales (Intel Labs), Matthew Spellings (Vector Institute), Mikhail Galkin and Santiago Miret (Intel Labs), and Nalini Kumar (Intel Corporation)


Abstract: Artificial intelligence and machine learning have shown great promise in accelerating the discovery of novel materials. As researchers and domain scientists seek to unify and consolidate chemical knowledge, the case for models with the potential to generalize across different tasks within materials science – so-called "foundation models" – grows along with those ambitions. This manuscript reviews our recent progress in developing the Open MatSci ML Toolkit and details experiments that lay the groundwork for foundation model research and development with our framework. Our key results show that, for simple applications, pre-training appears to yield worse modeling performance than training models from random initialization. However, in more complex settings, such as when a model must learn across multiple datasets and types of targets simultaneously, the inductive bias from pre-training yields significantly better performance. We hope this insight will inform subsequent efforts to create foundation models for materials science applications.




