Authors: Antonino Tumeo (Pacific Northwest National Laboratory (PNNL)), José Moreira (IBM Corporation), Timothy Mattson (Intel Corporation), John Feo (Pacific Northwest National Laboratory (PNNL)), Vito Giovanni Castellana (Pacific Northwest National Laboratory (PNNL)), Benjamin Brock (Intel Labs), Timothy Davis (Texas A&M University), William Harrod (Intelligence Advanced Research Projects Activity (IARPA))
Abstract: Government agencies, industry, and academia are demanding a new generation of tools to efficiently solve large-scale analytics problems in a variety of business, scientific, and national security applications. This BoF gathers the community developing high-performance frameworks and workflows for large-scale graph analytics to survey current approaches, identify new challenges and opportunities, and discuss interoperability of emerging infrastructures. A central goal is developing requirements and recommendations for future tools. As in previous editions, this BoF will compare and contrast conventional implementations and algebraic approaches, inviting the GraphBLAS community to discuss its state and evolution.
Long Description: Activity in graph analytics is growing rapidly in government, industry, and academia. Large-scale graph problems require ever-growing compute power and impose significant requirements on modern supercomputing architectures. The development of graph toolkits and libraries, their interoperability, and their composability with other analytic platforms are critical to many scientific, data, and security domains. This BOF, held previously at SC17, ’18, ’19, ’21, and ’22, has consistently attracted over 100 attendees (attendance was lower in ’21 as we supported a hybrid format) and dozens of requests from acclaimed researchers and practitioners to speak. The panel sessions have been lively and intense.
This BOF gathers the community developing high-performance frameworks and workflows for large-scale graph analytics to survey current approaches, identify new challenges and opportunities, and discuss interoperability of emerging infrastructures. A central goal is developing requirements and recommendations for future tools. In particular, we want to address the new and upcoming challenges in large-scale graph analytics applications: support for streaming graphs, the ability to deal with attributed graphs (which couple graphs with dense tables of attributes), the need to integrate graph methods within broader machine learning frameworks, and the need to better support irregular data structures and graph methods in scientific simulation frameworks. Current and future graph toolkits will have to evolve to handle these new requirements and domains.
The BOF will include the latest work of the GraphBLAS community, reporting on the performance and capabilities of extant implementations, the pros and cons of algebraic approaches, and the status of the new GraphBLAS C++ API. The GraphBLAS user community will present key design patterns and requested features.
Our lineup of speakers will touch on key themes such as applications, use cases, programming models, application programming interfaces and libraries, data structures and algorithms, and integration of tools, including common data structures, data storage, and data frames. The discussions and panels will delve into the dynamic runtime technologies needed to make graph toolkits and/or sparse linear algebra approaches execute efficiently. While remaining vendor agnostic, we also expect to touch on architectural requirements and architectural support for such runtime technologies and workloads.
For the 2023 BOF, we expect to continue our discussions on emerging abstractions and frameworks, representative benchmarks, and the role of hardware specialization for irregular workloads, focusing on the interplay with scientific discovery and data science workflows that might execute anywhere from edge devices (e.g., experimental scientific instruments) to centralized high-performance computing systems or the cloud.
Defining requirements and recommendations for abstractions and representative benchmarks will naturally lead the community to identify how its expertise in co-designing solutions for combinatorial and randomized graph algorithms could be translated into methodologies for supporting the emerging class of graph representation learning algorithms.
Website: https://hpc.pnl.gov/BOF