SC Technical Program Archives

Clushible: Tidal Wave-Like Configuration with Ansible

HPC Systems Professionals Workshop (HPCSYSPROS23)

HPCSYSPROS 23 – Opening Remarks

Embracing Batch on Kubernetes

Self-Service Monitoring of HPC and OpenStack Jobs for Users

ICE 2.0: Restructuring and Growing an Instructional HPC Cluster

HPCSYSPROS 23 – Morning Break

MareNostrum 5: Site Report from BSC

Democratizing Remote HPC Storage Access

What a GReaT Scheduling Opportunity

Overcoming Active Directory Woes with Plain Text Caches and Replacing Passwords

Heterogeneous Syslog Analysis: There Is Hope

Report on Adaptable Open-Source Disaster Recovery Solution for Multi-Petabyte Storage Systems

HPCSYSPROS 23 – Closing Remarks

Invited Talk: Scaling Computing for Concurrent Data Structures Using Near-Memory Processing Architectures

RSDHA: Redefining Scalability for Diversely Heterogeneous Architectures

Value-Based Resource Management at SoC Scale

NVMe-Backed GNN Training on GPU Leveraging a Paged UVM Memory System

RSDHA – Morning Break

FFTX-IRIS: Toward Performance Portability and Heterogeneity for SPIRAL Generated Code

CHARM-SYCL: New Unified Programming Environment for Multiple Accelerator Types

Vertical Scaling of Variational Multiscale Modeling for Fluid Dynamics: Successes, Challenges, and Opportunities

Accelerator Integration in a Tile-Based SoC: Lessons Learned with a Hardware Floating Point Compression Engine

Evaluating Primitives in Deep Neural Network Libraries: A Case Study with the Softmax Functions

RSDHA – Panel Discussion

Third International Symposium on Quantitative Codesign of Supercomputers

Welcome and Workshop Logistics

Co-Design at System and Component Level: Examples from the DEEP and EPI Projects

Toward the Development of a Comprehensive Digital Twin of an Exascale Supercomputer

SQCS – Morning Break

Enabling Codesign in the Software Tools Ecosystem Project (STEP)

SQCS'23 – Panel

SQCS'23 – Moderated Discussion

SQCS'23 – Closing Remarks

Research Software Engineers in HPC (RSE-HPC-2023)

RSE-HPC-2023 – Welcome and Overview

RSE-HPC-2023 – Featured Talk: UNIVERSE-HPC – Toward a Sustainable RSE Training Ecosystem

Elevating the Undergraduate Internship: Five Strategies for Putting the “R” in RSE

International RSE Collaboration with the Institute of Computing for Climate Science and the Virtual Earth System Research Institute

Years as a Trustee of SocRSE UK: A Retrospective

Life as an RSE at the University of Birmingham, UK

RSE-HPC-2023 – Morning Break

RSE-HPC-2023 – Panel: RSE Training and Mentoring

Starting at the Bottom, Now We’re Here: Building an African RSE Community

Catalyzing Research Software Engineering (RSE) Adoption in Underrepresented Regions: Harnessing the Power of Bioinformatics Communities

RSE-HPC-2023 – Breakout Discussions

RSE-HPC-2023 – Report Back from Breakouts

RSE-HPC-2023 – Wrapup

LLVM-HPC2023: The Ninth Workshop on the LLVM Compiler Infrastructure in HPC

OpenMP Kernel Language Extensions for Performance Portable GPU Codes

DPU Offloading Programming with the OpenMP API

LLVM-HPC2023 – Morning Break

Fortran Performance Optimisation and Auto-Parallelization by Leveraging MLIR-Based Domain Specific Abstractions in Flang

Precision and Performance Analysis of C Standard Math Library Functions on GPUs

Lightning Talk – Cppless: Productive and Performant Serverless Programming in C++

Lightning Talk – Automating Loop Optimization with Code Samples and AST Matching

Lightning Talk – Just-in-Time Autotuning

Lightning Talk – META: A Toolkit for Template Metaprogramming Performance Analysis

Panel Discussion

Sustainability in HPC: Vision and Opportunities

Sustainable Supercomputing

Evaluating Total Environmental Impact for a Computing Infrastructure

Comparing Power Signatures of HPC Workloads: Machine Learning vs Simulation

Accurate Measurement of Application-Level Energy Consumption for Energy-Aware Large-Scale Simulations

Sustainable Supercomputing – Morning Break

Keys to Sustainable Leadership Supercomputing for 2025+: Location, Power, and Flexibility

Energy Efficiency of Quantum Statevector Simulation at Scale

Reducing HPC Energy Footprint for Large Scale GPU Accelerated Workloads

Emissions and Energy Efficiency on Large-Scale High Performance Computing Facilities: ARCHER2 UK National Supercomputing Service Case Study

ReAPER: Region Aware Power and Energy Regulator

Wrap-Up Discussion

HUST-23: 10th International Workshop on HPC User Support Tools

HUST-23 Introduction

REMORA Resource Monitor: Usability, Performance, and User Interface Improvements

NPAT – A Power Analysis Tool at NERSC

Centralized Provisioning of Large Language Models for a Research Community

HUST-23 – Morning Break

ZeroSum: User Space Monitoring of Resource Utilization and Contention on Heterogeneous HPC Systems

A Fast and Responsive Web-Based Framework for Visualizing HPC Application Usage

CaRV – Accelerating Program Optimization through Capture, Replay, Validate

Introducing Open OnDemand to Supercomputer Fugaku

MSR-genie: Navigating Model Specific Registers across Processor Generations

PEAK: A Light-Weight Profiler for HPC Systems

BaRRT: Buildtime and Runtime Reproducibility Tool for Software Development and Testing

PTI-GPU: Kernel Profiling and Assessment on Intel GPUs

HUST-23 – Conclusion

1st Workshop on Enabling Predictive Science with Optimization and Uncertainty Quantification in HPC

Efficient Probabilistic Tuning of Ensemble Forecasting Method

Keynote Speaker

Uncertainty Quantification of Reduced-Precision Time Series in Turbulent Channel Flow

EPSOUQ-HPC – Morning Break

Optimized Uncertainty Estimation for Vision Transformers: Enhancing Adversarial Robustness and Performance Using Selective Classification

Localization of Gamma-Ray Bursts in a Balloon-Borne Telescope

Automatic Search Guided Code Optimization Framework for Mixed-Precision Scientific Applications

Uncertainty Quantification of Metal Additive Manufacturing Processing Conditions Through the Use of Exascale Computing

INDIS Esteemed Guest Talk: Professor Eylem Ekici (Ohio State University)

10th Annual International Workshop on Innovating the Network for Data Intensive Science (INDIS) Final

INDIS and SCinet Introduction

Morning Break

INDIS Paper 1: Enhancing perfSONAR Measurement Capabilities Using P4 Programmable Data Planes

INDIS Paper 2: Experimental Study of TCP Throughput Profiles and Dynamics Over Dedicated Connections

INDIS Paper 3: Elephants Sharing the Highway – Studying TCP Fairness in Large Transfers Over High Throughput Links

INDIS Paper 4: Evaluation of SCION for User-Driven Path Control – A Usability Study

INDIS Paper 5: Throughput Optimization with a NUMA-Aware Runtime System for Efficient Scientific Data Streaming

Lightning Talks

13th International Workshop on Runtime and Operating Systems for Supercomputers (ROSS)

ROSS – Welcome and Introduction

ROSS – Opening Panel: Is Accelerator Firmware the New HPC OS? Opportunities and Challenges for the OS/R Research Community

ROSS – Morning Break

RDARuntime: An OS for AI Accelerators

GPU Acceleration in Unikernels Using Cricket GPU Virtualization

CARAT KOP: Toward Protecting the Core HPC Kernel from Linux Kernel Modules

Fine-Grained Accelerator Partitioning for Machine Learning and Scientific Computing in Function as a Service Platform

Analysis and Characterization of Performance Variability for OpenMP Runtime

XLOOP 2023: The 5th Annual Workshop on Extreme-Scale Experiment-in-the-Loop Computing

XLOOP – Introduction

Demonstrating Cross-Facility Data Processing at Scale with Laue Microdiffraction

Linking the Dynamic PicoProbe Analytical Electron-Optical Beam Line / Microscope to Supercomputers

Speeding Up Charge Exchange Recombination Spectroscopy Analysis in Support of NERSC/DIII-D Realtime Workflow

XLOOP – Morning Break

DLSIA: Deep Learning for Scientific Image Analysis

Exploring Benchmarks for Self-Driving Labs Using Color Matching

Empowering Scientific Discovery through Computing at the Advanced Photon Source

Cross-Facility Orchestration of Electrochemistry Experiments and Computations

Streaming Data from Experimental Facilities to Supercomputers for Real-Time Data Processing

Workflows Are the New Applications – So What?

DevOps Approaches for Interconnected Science Ecosystems

XLOOP – Awards Ceremony

Invited Talk: Using XDMoD for HPC Performance and Quality-of-Service Analysis

5th Workshop on Programming and Performance Visualization Tools (ProTools 2023)

Enabling Agile Analysis of I/O Performance Data with PyDarshan

ProTools 2023 – Morning Break

An Event Model for Trace-Based Performance Analysis of MPI Partitioned Point-to-Point Communication

FROOM: A Framework of Operators for OTF2 Modification

GPUscout: Locating Data Movement-Related Bottlenecks on GPUs

Filtering and Ranking of Code Regions for Parallelization via Hotspot Detection and OpenMP Overhead Analysis

Extra-Deep: Automated Empirical Performance Modeling for Distributed Deep Learning

The 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23)

The 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23) – Part 1 of 2

Welcome – Part I

Workflow Building Blocks: The Success Story of Environmental Modeling, HPC, and AI for Predicting Farmed Seafood Bacteria Contamination

End-to-End Workflows for Climate Science: Integrating HPC Simulations, Big Data Processing, and Machine Learning

WORKS23 – Afternoon Break

Accelerating Data-Intensive Seismic Research Through Parallel Workflow Optimization and Federated Cyberinfrastructure

A Systematic Mapping Study of Italian Research on Workflows

Transcriptomics Atlas Pipeline: Cloud vs HPC

Patterns and Anti-Patterns in Migrating from Legacy Workflows to Workflow Management Systems

Scale Composite BaaS Services with AFCL Workflows

Laminar: A New Serverless Stream-Based Framework with Semantic Code Search and Code Completion

Optimization Toward Efficiency and Stateful of dispel4py

Wrap Up – Part I

Invited Talk: Thoughts on Security for CXL-3.x-GFAM Clusters with Embedded Computing

2nd International Workshop on Cyber Security in High Performance Computing (S-HPC 2023)

Welcome

Distinguished Speaker

S-HPC 2023 – Afternoon Break

Invited Talk: Information Security Controls Prioritization – SABSA for HPC

Analyzing the Performance Impact of HPC Workloads with Gramine+SGX on 3rd Generation Xeon Scalable Processors

RMF for HPC and RDT&E

DeepSpeed4Science: Enabling Future Large-Scale Scientific Discovery through Sophisticated AI System Technologies

6th International Workshop on Emerging Parallel Distributed Runtime Systems and Middleware

Welcome

IPDRM’2023 – Afternoon Break

HPC Software Scaling for ML Using CXL 3.0 GFAM

Dask-Extended External Tasks for HPC/ML In Transit Workflows

Enabling Large Dynamic Neural Network Training with Learning-Based Runtime Memory Management

MPI-xCCL: A Portable MPI Library over Collective Communication Libraries for Various Accelerators

A gem5 Implementation of the Sequential Codelet Model: Reducing Overhead and Expanding the Software Memory Interface

Overcoming the Challenges to Democratizing Precision Medicine: HPC Infrastructure, Health Equity Training Sets, Training a Diverse Workforce, and Mitigating Fears

Ninth Computational Approaches for Cancer Workshop (CAFCW23)

AI/ML-Derived Whole-Genome Predictor Prospectively and Clinically Predicts Survival and Response to Treatment in Brain Cancer

CAFCW Announcements

CAFCW23 – Afternoon Break

Panel: Diversity, Equity, and Inclusion – from Data to Workforce

Deep Semi-Supervised Transfer Learning for Fully Automated Whole-Body Tumor Quantification and Prognosis of Cancer on PET/CT

Optimized Patient-Specific Catheter Placement for Convection-Enhanced Nanoparticle Delivery in Recurrent Glioblastoma

Environmental Factors and Lung Cancer: A Predictive Spatial Approach

Constructing a Large-Scale Biomedical Knowledge Graph and Its Applications in Drug Discovery

Scalable Lead Prediction with Transformers Using HPC Resources

Entropy-Based Regularization on Deep Learning Models for Anti-Cancer Drug Response Prediction

The 9th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-9)

Opening Remarks and Welcome

Invited Talk: Challenges and Opportunities for Data Democratization

LibPressio-Predict: Flexible and Fast Infrastructure For Inferring Compression Performance

What Operations Can Be Performed Directly on Compressed Arrays and with What Error?

Analyzing Impact of Data Reduction Techniques on Visualization for AMR Applications Using AMReX Framework

Fast 2D Bicephalous Convolutional Autoencoder for Compressing 3D Time Projection Chamber Data

Streaming Hardware Compressor Generator Framework

Lossy and Lossless Compression for BioFilm Optical Coherence Tomography (OCT)

D-HPC Opening: “LLMs and Democratizing HPC”

The First Workshop on Democratizing High-Performance Computing (D-HPC)

The History and Future of Making HPC Technologies Accessible to the Wider Community

Democratizing HPC by Building a Diverse and Inclusive Workforce

D-HPC – Afternoon Break

Democratizing HPC Access and Use with Knowledge Graphs

Democratizing Science Through Equitable Access to Computing and Data

S4PST: Stewardship of Programming Systems and Tools

D-HPC: Closing Remarks

The 1st International Workshop on the Environmental Sustainability of High-Performance Software

Energy Consumption Comparison of Parallel Linear Systems Solver Algorithms on HPC Infrastructure

Domain-Specific Energy Modeling for Drug Discovery and Magnetohydrodynamics Applications

An End-to-End HPC Framework for Dynamic Power Objectives

Automatic Energy-Efficient Job Scheduling in HPC: A Novel SLURM Plugin Approach

PM100: A Job Power Consumption Dataset of a Large-Scale Production HPC System

Augmenting ML-Based Predictive Modelling with NLP to Forecast a Job's Power Consumption

Closing Remarks and Best Paper

FTXS 2023: Invited Speaker (Paolo Rech, "Quantum Computing Reliability: Problems, Tools, and Potential Solutions")

13th Workshop on Fault-Tolerance for HPC at Extreme Scale (FTXS 2023)

FTXS 2023 – Opening Remarks

FTXS 2023 – Afternoon Break

Optimizing Write Performance for Checkpointing to Parallel File Systems Using LSM-Trees

Recovery from Silent Data Corruption via Spatial Data Prediction

Disk Failure Trends in Alpine Storage System

Using Benford's Law to Identify Unusual Failure Regions

Dynamic Selective Protection of Sparse Iterative Solvers via ML Prediction of Soft Error Impacts

Evaluating the Resiliency of Posits for Scientific Computing

When to Checkpoint at the End of a Fixed-Length Reservation?

FTXS 2023 – Closing Remarks

Toward Standardized, Open Object-Based Computational Storage For Large-Scale Scientific Data Analytics

PDSW23: 8th International Parallel Data Systems Workshop

Welcome and Opening Remarks

Invited Talk

DAOS as HPC Storage: Exploring Interfaces

Toward a Peer-to-Peer Data Distribution Layer for Efficient and Collaborative Resource Optimization of Distributed Dataflow Applications

PDSW – Afternoon Break

Enhancing Metadata Transfer Efficiency: Unlocking the Potential of DAOS in the ADIOS Context

IOMax: Maximizing Out-of-Core I/O Analysis Performance on HPC Systems

Domain-Aware Performant AI-Based Compression

The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC

GrIOt: Graph-Based Modeling of HPC Application I/O Call Stacks for Predictive Prefetch

Advancing Automated I/O Analysis with Multi-Perspective Views

PoliMOR: A Policy Engine "Made-to-Order" for Automated and Scalable Data Management in Lustre

Accelerate Stage-Out in Single Shared Files from Node-Local Burst-Buffers

DAOS Project Update

Compression of Scientific Simulation Data by Stochastic Basis Expansion – Example on Multiple Computer Systems

Keynote: Design of Efficient and Privacy Preserving Machine Learning

Workshop on Software and Hardware Co-Design of Deep Learning Systems in Accelerators (SHDA)

Welcome to SC Workshop SHDA 2023

SHDA – Afternoon Break

Accuracy-Constrained Efficiency Optimization and GPU Profiling of CNN Inference for Detecting Drainage Crossing Locations

Invited Talk: I/O Profiling and Benchmarking for AI Applications

Benchmarking and In-Depth Performance Study of Large Language Models on Habana Gaudi Processors

Invited Talk: When Optimizing Software Produces Optimized Hardware: A Case for Statically-Interpretable Control-Flow Programs

Pareto Optimization of CNN Models via Hardware-Aware Neural Architecture Search for Drainage Crossing Classification on Resource-Limited Devices

Accelerating Hyperparameter Optimization Algorithms with Mixed Precision

Workshop SHDA23 Wrap-Up

Correctness Workshop Opening Remarks

7th International Workshop on Software Correctness for HPC Applications (Correctness '23)

HPC Bugs Fest Introduction

Mapping High-Level Concurrency from OpenMP and MPI to ThreadSanitizer Fibers

Rethinking Data Race Detection in MPI-RMA Programs

Correctness '23 – Afternoon Break

RMARaceBench: A Microbenchmark Suite to Evaluate Race Detection Tools for RMA Programs

Data Race Detection Using Large Language Models

Mixed-Precision S/DGEMM Using the TF32 and TF64 Frameworks on Low-Precision AI Tensor Cores

Toward Correctness Checking of MPI Partitioned Communication in MUST

Adding Microbenchmarks with SIMD Data Race to DataRaceBench

Investigating the Real-World Applicability of MPI Correctness Benchmarks

Improve and Stabilize Classification Results of DataRaceBench

Highlighting PARCOACH Improvements on MBI

AI-Augmented SWARM Based Resilience for Integrated Research Infrastructures

Fourth International Symposium on Checkpointing for Supercomputing (SuperCheck-SC23)

Welcome to SuperCheck-SC23

Lightning Talk: Diaspora – Resilient Event Processing for Irregular, Distributed Scientific Applications

SuperCheck-SC23 – Afternoon Break

Checkpoint/Restart for CUDA Kernels

Implementation-Oblivious Transparent Checkpoint-Restart for MPI

Asynchronous Multi-Level Checkpointing: An Enabler of Reproducibility using Checkpoint History Analytics

Lightning Talk: Update on Checkpointing and Localized Recovery for Nested Fork-Join Programs

Lightning Talk: Toward Efficient Asynchronous Checkpointing for Large-Language Models

Lightning Talk: Inherent Checkpointing Properties of Nested Parallelism

Lightning Talk: Trade-Offs For Developing File Aggregated I/O For Asynchronous Checkpointing

Lightning Talk: Datastates for Debugging – Using Productive Checkpointing for Improved Debugging

A New Sparse GEneral Matrix-Matrix Multiplication Method for Long Vector Architecture by Hierarchical Row Merging

IA^3 2023 – 13th Workshop on Irregular Applications: Architectures & Algorithms

IA^3 – Welcome and Introduction

IA^3 – Invited Talk

IA^3 2023 – Afternoon Break

Towards a Massive-Scale Distributed Neighborhood Graph Construction

A Parallel Algorithm for Updating a Multi-Objective Shortest Path in Large Dynamic Networks

cuAlign: Scalable Network Alignment on GPU Accelerators

TANGO: A GPU-Optimized Traceback Approach for Sequence Alignment Algorithms

Filtering Wasteful Vertex Visits in Breadth-First Search

Accelerating Deep Neural Network Guided MCTS Using Adaptive Parallelism

IA^3 – Concluding Remarks

ISAV23 Invited Keynote – Progress in In-Situ Analysis and Visualization in the Fusion Exascale Code XGC

ISAV23: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization

ISAV23 – Introduction

Information Entropy-Based Camera Focus Point and Zoom Level Adjustment for Smart In-Situ Visualization

Toward a Scalable In Situ Fast Fourier Transform

Enabling In Situ Visualization of Large-Scale Cellular Simulations

ISAV23 – Morning Break

Extensions to the SENSEI In Situ Framework for Heterogeneous Architectures

A General Purpose Interface for Interactive Computational Steering Instrumentation Using Ascent

Design of a Framework for Combined Flexible and Efficient Simulation and In Situ Processing

Trigger Smart Data Saving Applied to CO2 Capture in Metal-Organic Frameworks

Unraveling Diffusion in Fusion Plasma: A Case Study of In Situ Processing and Particle Sorting

Using Umpire In-Situ for Improved Memory Performance

State of In Situ Visualization in Simulations: We are fast. But are we inspiring?

Scaling Computational Fluid Dynamics: In Situ Visualization of NekRS using SENSEI

ISAV23 – Best Paper Award and Closing Remarks

Tenth Workshop on Accelerator Programming Using Directives (WACCPD 2023)

Tenth Workshop on Accelerator Programming and Directives (WACCPD 2023)

Porting and Optimizing Meso-NH to AMD MI250X GPUs

Comparing a Naive and a Tree-Based N-Body Algorithm Using Different Standard SYCL Implementations on Various Hardware

Specialized Kernels for Optimizing GPU Offload in OpenMP

WACCPD 2023 – Morning Break

Invited Talk

Performance-Portable GPU Acceleration of the EFIT Tokamak Plasma Equilibrium Reconstruction Code

Characterizing the Performance of Triangle Counting on Graphcore's IPU Architecture

Memory Transfer Decomposition: Exploring Smart Data Movement through Architecture-Aware Strategies

Analysis of MURaM – A Solar Physics Application, for Scalability, Performance, and Portability

Tenth Workshop on Accelerator Programming and Directives (WACCPD2023) – Closing Remarks

AI Assisted Software Development for HPC (AI4DEV)

AI-Driven Performance Metaprogramming

AI4DEV – Morning Break

MPI-RICAL: Data-Driven MPI Distributed Parallelism Assistance with Transformers

VSCuda: LLM-Based CUDA Extension for Visual Studio Code

LLVM in the Age of LLMs: Machine Learning for IR and Optimization and More

Unlocking the Potential of Large Language Models for High-Performance Computing Code

A Performance-Portable SYCL Implementation of CRK-HACC for Exascale

2023 International Workshop on Performance, Portability, and Productivity in HPC (P3HPC)

P3HPC – Welcome and Introduction

Performance Evaluation of Heterogeneous GPU Programming Frameworks for Hemodynamic Simulations

Performance Portability Evaluation of Blocked Stencil Computations on GPUs

P3HPC – Morning Break

Benchmarking a Portable Lattice Quantum Chromodynamics Kernel Written in Kokkos and MPI

MatRIS: Multilevel Math Library Abstraction for Heterogeneity and Performance Portability Using IRIS Runtime

Porting Batched Iterative Solvers onto Intel GPUs with SYCL

Evaluating the Performance of One-Sided Communication on CPUs and GPUs

Performance Portability of Programming Strategies for Nearest-Neighbor Communication with GPU-Aware MPI

Evaluating the Performance Portability of SYCL across CPUs and GPUs on Bandwidth-Bound Applications

CuPBoP-AMD: Extending CUDA to AMD Platforms

High-Level GPU Code: A Case Study Examining JAX and OpenMP

Many Cores, Many Models: GPU Programming Model vs. Vendor Compatibility Overview

P3HPC – Wrapup

CANOPIE-HPC – Introduction and Welcome

5th International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC)

CANOPIE-HPC

Survey of Adaptive Containerization Architectures for HPC

HPC Container Conformance

Kubeflow-as-a-Service on HPC Clusters – First Experiences

Preemptive Scheduling of Stateful GPU-Intensive HPC Applications in Kubernetes

Enabling Performance for NGC Containers on the Slingshot 11 Interconnect

Lightweight Isolation for HPC Applications

CANOPIE-HPC – Morning Break

Charliecloud’s Layer-Free, Git-Based Container Build Cache

New Root Emulation Mode for Charliecloud Using seccomp

eBPF-Based Performance Fingerprint of Containerized HPC Applications

Understanding Energy Performance of Containers Deployment on HPC-Based Post-Moore Platforms

Perspectives and Experiences Supporting Containers for Research Computing at the Texas Advanced Computing Center

Early Experiences with Charliecloud for HPC

Computing-as-a-Service Infrastructure for Accelerating Digital Engineering

The Story of Spin: Five Years Supporting Science with Container-Based Services at NERSC

CANOPIE-HPC Community Discussion/Open Q&A

The 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23)

The 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23) – Part 2 of 2

Welcome – Part II

FAIRIST of Them All: Meeting Researchers Where They Are With Just-in-Time, FAIR Implementation Advice

A Data Science Pipeline Synchronisation Method for Edge-Fog-Cloud Continuum

WORKS23 – Morning Break

TaskVine: Managing In-Cluster Storage for High-Throughput Data Intensive Workflows

Leveraging Large Language Models to Build and Execute Computational Workflows

Delivering Rules-Based Workflows for Science

Julia as a Unifying End-to-End Workflow Language on the Frontier Exascale System

Scaling on Frontier: Uncertainty Quantification Workflow Applications Using ExaWorks to Enable Full System Utilization

Distributed Data Locality-Aware Job Allocation

Fluxion: A Scalable Graph-Based Resource Model for HPC Scheduling Challenges

The Common Workflow Scheduler Interface: Status Quo and Future Plans

Wrap Up – Part II

A Comparison of Mesh-Free Differentiable Programming and Data-Driven Strategies for Optimal Control under PDE Constraints

Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S)

AI4S – Keynote

AI4S – Morning Break

AI4S – Invited Talk

Toward Foundation Models for Materials Science: The Open MatSci ML Toolkit

Protein Generation via Genome-Scale Language Models with Bio-Physical Scoring

AI4S – Lunch Break

Accelerating Particle and Fluid Simulations with Differentiable and Interpretable Graph Networks for Solving Forward and Inverse Problems

Enabling Performant Thermal Conductivity Modeling with DeePMD and LAMMPS on CPUs

Machine Learning Applied to Single-Molecule Activity Prediction

AI4S – Afternoon Break

Tournament-Based Pretraining to Accelerate Federated Learning

Elastic Deep Learning through Resilient Collective Operations

Toward Rapid Autonomous Electron Microscopy with Active Meta-Learning

Autotuning Apache TVM-Based Scientific Applications Using Bayesian Optimization

Enhancing Heterogeneous Federated Learning with Knowledge Extraction and Multi-Model Fusion

Entropy-Driven Optimal Sub-Sampling of Fluid Dynamics for Developing Machine-Learned Surrogates

Tencoder: Tensor-Product Encoder-Decoder Architecture for Predicting Solutions of PDEs with Variable Boundary Data

PMBS23: The 14th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems

PMBS23 – Welcome

Physical Oscillator Model for Supercomputing

Comparative Evaluation of Bandwidth-Bound Applications on the Intel Xeon CPU MAX Series

PMBS23 – Morning Break

SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids InfiniBand Clusters: A Performance and Energy Case Study

Reducing Memory Requirements for the IPU Using Butterfly Factorizations

Verifying Performance Guidelines for MPI Collectives at Scale

A Performance Model for Estimating the Cost of Scaling to Practical Quantum Advantage

Hardware Specialization: Estimating Monte Carlo Cross-Section Lookup Kernel Performance and Area

PMBS23 – Lunch Break

Power Analysis of NERSC Production Workloads

Adaptive Stopping Rule for Performance Measurements

Latency and Bandwidth Microbenchmarks of US Department of Energy Systems in the June 2023 Top 500 List

PMBS23 – Afternoon Break

Risk-Aware Scheduling Algorithms for Variable Capacity Resources

A Reinforcement Learning-Based Backfilling Strategy for HPC Batch Jobs

Evaluating the Potential of Elastic Jobs in HPC Systems

Modeling Data Locality of Sparse Matrix-Vector Multiplication on the A64FX

WHPC@SC23: 16th International Women in HPC Workshop

WHPC@SC23 – Introduction

WHPC@SC23 – Invited Speaker: When to Jump – Managing Your Career and Maximizing Your Impact

WHPC@SC23 – Morning Break

WHPC@SC23: Surviving and Thriving as an ‘Outsider’ – with the Help of Allies

WHPC@SC23 – Strength in Unity: Fostering Tech Career Persistence

WHPC@SC23 – WHPC Lyceum

WHPC@SC23 – Lunch Break

Investigating Linear Solvers for Power Grid Analysis with Exascale Computing: A Journey of Learning and Collaboration

Potential of Cryogenics Electronics for Future Computing Systems

fAsyLex: Accelerating Legal NLP through Comparative Analysis of Multi-GPU Approaches

An Analysis of Change Point Detection in High Performance Computing

Scalable Graph Analytics and HPC Operational Enhancement: Parallel Computing and ML/DL Innovations

OpenGPT-X: Advancements, Challenges, Exploration, and Future Goals

Accelerating the HPC I/O for Low Latency and High Throughput with 16-Nanometer FPGA-Based Hardware Accelerators

Exploring the Potential of GPU-initiated Communications in HPC Applications

Simulating Quantum Chemistry on Heterogeneous Architectures

Operationalizing HPC Tasks for Space Weather Forecasting Using Celery and Django: Making Automated, HPC-Powered Scientific Results Accessible in Near-Real Time

Queue Wait Time Prediction in Supercomputers

Spatiotemporal Analysis and Prediction of Laboratory-Generated Turbulence

WHPC@SC23 – Networking Breakout

WHPC@SC23 – Afternoon Break

Fostering Diversity, Equity, and Inclusion (DEI) at Big Tech Firms

WHPC@SC23

WHPC@SC23 – Leading from the Middle

Our Success Case of Full Remote Working

WHPC@SC23 – Conclusion

Fourth International Workshop on Quantum Computing Software

Fast Simulation of High-Depth QAOA Circuits

Prototype of a Batched Quantum Circuit Simulator for the Vector Engine

Enabling Quantum Computer Simulations on AMD GPUs: A HIP Backend for Google's qsim

Quantum Computing Software – Morning Break

Enabling Scalable VQE Simulation on Leading HPC Systems

MEMQSim: Highly Memory-Efficient and Modularized Quantum State-Vector Simulation

BGLS: A Python Package for the Gate-by-Gate Sampling Algorithm to Simulate Quantum Circuits

TISCC: A Surface Code Compiler and Resource Estimator for Trapped-Ion Processors

JuliQAOA: Fast, Flexible QAOA Simulation

SimuQ: A Domain-Specific Language for Quantum Simulation with Analog Compilation

Quantum Computing Software – Lunch Break

Using Azure Quantum Resource Estimator to Evaluate Performance of Quantum Algorithms

Making QIR Executable

QASMTrans: A QASM Quantum Transpiler Framework for NISQ Devices

Quantum Computing Software – Afternoon Break

QArchSearch: A Scalable Quantum Architecture Search Package

Towards an Expressive Python-Native Interface for Quantum Program Development

An Ising-Based Model for Qubit Mapping

A Reference Implementation for a Quantum Message Passing Interface

Distributing Circuits Over Heterogeneous, Modular Quantum Computing Network Architectures

Open Q&A Session

The 6th Annual Parallel Applications Workshop, Alternatives to MPI+X (PAW-ATM)

Introduction to The 6th Annual Parallel Applications Workshop, Alternatives to MPI+X

Survey of Technologies for Developers of Parallel Applications: SHMEM

Survey of Technologies for Developers of Parallel Applications: Swift/T

Survey of Technologies for Developers of Parallel Applications: Julia

Survey of Technologies for Developers of Parallel Applications: Legate and cuNumeric

Survey of Technologies for Developers of Parallel Applications: Q&A

PAW-ATM – Morning Break

Implementing Scalable Matrix-Vector Products for the Exact Diagonalization Methods in Quantum Many-Body Physics

High-Performance Programming and Execution of a Coral Biodiversity Mapping Algorithm Using Chapel

Design and Analysis of the Network Software Stack of an Asynchronous Many-Task System – The LCI Parcelport of HPX

shmem4py: High-Performance One-Sided Communication for Python Applications

Pure: Evolving Message Passing To Better Leverage Shared Memory within Nodes

PAW-ATM – Lunch Break

PAW-ATM Distinguished Speaker: Ethan Gutmann – National Center for Atmospheric Research: Trials and Tribulations and Joys of Developing with Alternative Parallel Frameworks

PAW-ATM – Afternoon Break

symPACK: A GPU-Capable Fan-Out Sparse Cholesky Solver

PAW-ATM Panel Discussion: Charting Paths to Success with Alternatives to MPI+X

Featured Talk: Aurora Exascale Architecture

ESPM2 2023: Eighth International Workshop on Extreme Scale Programming Models and Middleware

ESPM2 2023 – Morning Break

Challenge on Extreme-Hetero Application Programming

The MI300 APU: Programming for CPUs and GPUs on a Single Package

Cross-Stack System Techniques for Trillion-Parameter Scale Model Inference

Performance Portability in the Age of Extreme Heterogeneity

ESPM2 2023 – Lunch Break

Programming Model for Habana/Gaudi2 Accelerators and Its Impact on Deep Learning Inference/Training Performance at Scale

An Autonomous Execution Model for GPUs: When CPUs Take a Back Seat

ESPM2 – Afternoon Break

Who's Winning the Performance Portability Race on GPU Platforms?

Domain-Specific Programming Methodologies for Domain-Specific and Emerging Computing Systems

Top 5 Challenges in Programming Models and Runtimes for Large Language Models Training/Inference

Invited Talk 1: The Legacy of ECP Software Efforts, Realized, and to Come

14th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems (ScalAH'23)

Welcome

ScalAH'23 – Morning Break

Invited Talk 2: Living in a Heterogeneous World – How Scientific Workflows Bridge Diverse Cyberinfrastructure and What Can We Do Better?

GPU-Based LU Factorization and Solve on Batches of Matrices with Band Structure

Massively Distributed Finite-Volume Flux Computation

Parallel Symbolic Cholesky Factorization

Advancing the Distributed Multi-GPU ChASE Library through Algorithm Optimization and NCCL Library

ScalAH'23 – Lunch Break

Invited Talk 3: The Pursuit of the Brain’s Ubiquitous Stochasticity

Optimization of Ported CFD Kernels on Intel Data Center GPU Max 1550 Using oneAPI ESIMD

ScalAH'23 – Afternoon Break

Invited Talk 4: Innovative Supercomputing by Integrations of Simulations/Data/Learning on Oakforest-PACS II

Invited Talk 5: Building Quantum Machine Learning for Real-World Applications

Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators

Moment Representation of Regularized Lattice Boltzmann Methods on NVIDIA and AMD GPUs

EduHPC-23: Workshop on Education for High Performance Computing

EduHPC23 – Welcome Remarks

EduHPC23 – Invited Talk by Kathy Yelick: Educating Post Exascale HPC Leaders

Teaching Heterogeneous and Parallel Computing with Google Colab and Raspberry Pi Clusters

Infrastructure for Writing Fork-Join Tests

Data-Driven Discovery of Anchor Points for PDC Content

AutoLearn: Learning in the Edge to Cloud Continuum

EduHPC23: Panel Q&A Paper Session I

EduHPC-23 – Afternoon Break

Next Generation Pathways to Computing: Bridging the Diversity Gap in High-Performance Computing Education

Training Experiences by Skills for HPC Ecosystems

Teaching Non-Determinism in High Performance Applications

ML Movie Night: A Pilot Machine Learning Course for High-School Students and Implications for Undergraduate Adaptation

The World's Worst Optical NIC

Composable HPC Curricula: Embracing the UNIX Development Paradigm and Leveraging Core Practices from Linux Kernel Development in HPC Training Material Development

Adding Sustainability to Parallel Programming Assignments

EduHPC23 – Panel Q&A: Lightning Talks

The Wide Area Classroom: 24,000 HPC Students and Growing

Faculty Development Workshops for Integrating PDC in Early Undergraduate Curricula: An Experience Report

An NSF REU Site Based on Trust and Reproducibility of Intelligent Computation: Experience Report

Performance Engineering for Graduate Students: A View from Amsterdam

EduHPC23 – Panel Q&A: Paper Session II

1D Heat Equation in Chapel

Program Your Favorite Data Science Pipeline in Spark

Parallelizing a 1-Dim Nagel-Schreckenberg Traffic Model

Using MPI For Distributed Hyper-Parameter Optimization and Uncertainty Evaluation

k-Nearest Neighbor with Map Reduce MPI

K-means Clustering: An Assignment for OpenMP, MPI, and CUDA/OpenCL

EduHPC23 – Panel Q&A: Peachy Assignments

CDER Announcements and Closing

Lightning Vendor Talk: Esperanto Technologies ET-SoC for AI and ML Workloads

Second International Workshop on RISC-V for HPC

Introduction and Welcome

RISC-V Everywhere

Lightning Vendor Talk: The InspireSemi next gen Thunderbird compute accelerator for HPC, AI, and graph analytics

Lightning Vendor Talk: SG2042 Empowering RISC-V in High-Performance Computing

Lightning Vendor Talk: E4 Experience with RISC-V in HPC

RISC-V for HPC – Afternoon Break

An Empirical Comparison of the RISC-V and AArch64 Instruction Sets

Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger

Is RISC-V Ready for HPC Prime-Time: Evaluating the 64-Core Sophon SG2042 RISC-V CPU

Short Reasons for Long Vectors in HPC CPUs: A Study Based on RISC-V

Automatic Generation of Micro-Kernels for Performance Portability of Matrix Multiplication on RISC-V Vector Processors

Challenges and Opportunities in the Co-Design of Convolutions and RISC-V Vector Processors

Second International Workshop on RISC-V for HPC

Workshop on Memory Technologies, Systems, and Applications

Keynote

MTSA – Afternoon Break

Accelerating In Situ Analysis Using Non-volatile Memory

CXL Memory as Persistent Memory for Disaggregated HPC: A Practical Approach

GPU Graph Processing on CXL-Based Microsecond-Latency External Memory

Dynamic Memory Provisioning on Disaggregated HPC Systems

ExaMPI: Workshop on Exascale MPI

Distinguished Speaker: GPU Centric Communication – Is MPI Missing Out?

ExaMPI – Afternoon Break

Optimizing Irregular Communication with Neighborhood Collectives and Locality-Aware Parallelism

Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures

Embedding Rust within Open MPI

A Statistical Analysis of HPC Network Tuning

OpenSHMEM Queues: An Abstraction for Enhancing Message Rate, Bandwidth Utilization, and Reducing Tail Latency in OpenSHMEM Applications

Efficient Data Redistribution for Malleable Applications

ExaMPI: Workshop on Exascale MPI

Tenth SC Workshop on Best Practices for HPC Training and Education

Emerging Technologies and HPC Education, Outreach, and Training

Expanding Horizons: Advancing HPC Education in Colombia through CyberColombia's Summer Schools

The BEAST LAB: A Practical Course on Experimental Evaluation of Diverse Modern HPC Architectures and Accelerators

Using Unity for Scientific Visualization as a Course-Based Undergraduate Research Experience

Best Practices for HPC Training and Education – Afternoon Break

Scaling HPC Education

Intro to HPC Bootcamp: Engaging New Communities through Energy Justice Projects

Data Analytics Program in Community Colleges in Preparation for STEM and HPC Careers

The Code-a-Thon, Improving Student Engagement through Community Coding

Let’s Get Our Heads Out of the Clouds (A Scalable and Sustainable Approach to HPC Training Labs for Resource Constrained Environments and Anyone Else Stuck in the Clouds)

Q&A and Discussion

Bridging the Quantum Gap: Addressing Challenges in Training Individuals in Quantum Computing Using Self-Guided Learning Resources

HPC Carpentry – A Scalable, Peer-Reviewed Training Program to Democratize HPC Access

Understanding Community Perspectives on HPC Skills and Training Pathways

Cross-Institutional Research Engagement Network (CIREN): Initial Project Goals and Objectives in Support of Training, Mentoring, and Research Facilitation

Exascale and Beyond – Required Competences for the Computational Scientists

Q&A and Discussion

Welcome: Machine Learning with Graphs in High Performance Computing Environments

Workshop on Machine Learning with Graphs in High Performance Computing Environments

Invited Talk: Practical Machine Learning on Biological Knowledge Graphs

MLG-HPCE – Afternoon Break

Addressing Stale Gradients in Scalable Federated Deep Reinforcement Learning

An Efficient Distributed Graph Engine for Deep Learning on Graphs

HPC-GPT: Integrating Large Language Model for High-Performance Computing

DDStore: Distributed Data Store for Scalable Training of Graph Neural Networks on Large Atomistic Modeling Datasets

An Analysis of Graph Neural Network Memory Access Patterns

Digital Twins: Practices and Principles for High Performance Computing

Consider studying all-pairs shortest paths

Future Is Sparse: Methods and Tools for Sparse Computations

Welcome & Introduction by SparCity

Dynamic Data Structures on the GPU

Coffee Break

Tensor Cores for Matrix Multiplication Are on the Rise – Is There Any Hope for Sparse Operations?

The Future of Machine Learning is Sparse

Future Is Sparse Panel

Ninth International Workshop on Heterogeneous High-Performance Reconfigurable Computing (H2RC 2023)

Chameleon: A Disaggregated CPU, GPU, and FPGA System for Retrieval-Augmented Language Models

Invited Talk

Enabling Communication with FPGA-Based Network-Attached Accelerators for HPC Workloads

H2RC'23 – Morning Break

Tydi-lang: A Language for Typed Streaming Hardware

Altis-SYCL: Migrating Altis Benchmarking Suite from CUDA to SYCL for GPUs and FPGAs

OctoRay: Framework for Scalable FPGA Cluster Acceleration of Python Big Data Applications

Stencil-HMLS: A Multi-Layered Approach to the Automatic Optimization of Stencil Codes on FPGA

Fourth Workshop on Heterogeneous Memory Systems (HMEM)

HMEM – Welcome

Keynote: Empowering Large AI Models Based on Heterogeneous Memory

Persistent Snapshot Isolation with Unlimited Reads on Commodity Hardware Transactional Memory

HMEM – Morning Break

DAOS Beyond Persistent Memory: Architecture and Initial Performance Results

CachedArrays: API and Framework to Optimize Data Movement for Heterogeneous Memory Systems

Evaluating the Latest Optane Memory: A Glorious Swansong?

3rd International Workshop on RESource DISaggregation in High Performance Computing (RESDIS)

Sunfish: An Open Centralized Composable HPC Management Framework

Keynote

Morning Break

RISA: Round-Robin Intra-Rack Friendly Scheduling Algorithm for Disaggregated Datacenters

Resource Disaggregation in Practice – Industry Session

Panel Discussion

First International Workshop on HPC Testing and Evaluation of Systems, Tools, and Software (HPCTESTS 2023)

Experiences Detecting Defective Hardware in Exascale Supercomputers

Keynote

Principles for Automated and Reproducible Benchmarking

HPCTESTS 2023 – Morning Break

Ramble: A Flexible, Extensible, and Composable Experimentation Framework

Toward Collaborative Continuous Benchmarking for HPC

Perspectives and Discussion

The Second Workshop on Federated and Privacy Preserving AI for HPC

Divide and Conquer: Scaling AI via Federated Learning and Distributed Alternatives to Backpropagation

Swarm Learning – Privacy Preserving Decentralized Machine Learning

Behind Closed Doors: Exploring Privacy Vulnerabilities in Federated Learning

Morning Break

Toward a Secure Federated Infrastructure for AI-Accelerated Multi-Facility Science

Federated Learning with Healthcare Data at Scale

Panel Discussion