Revolutionizing Earth System Modeling Through Exascale Computing

As we build up to SC23 in November, we continue to highlight the work and contributions of “I Am HPC” heroes and organizations to HPC and the SC Conference itself. In this post, David C. Bader, the Principal Investigator, and Mark A. Taylor, the Chief Computational Scientist for the Energy Exascale Earth System Model (E3SM) project, share insights about the E3SM project and the challenges it’s had to overcome in the jump to exascale model development.

The E3SM project, which began around 10 years ago, is a high-resolution climate modeling initiative aimed at developing state-of-the-art climate system models for use on the exascale computers envisioned by the Department of Energy (DOE). Its main objective is to advance climate modeling capabilities to address diverse applications in support of DOE mission requirements. The project includes more than 100 scientists and software engineers at multiple DOE laboratories as well as several universities.

In February 2023, the E3SM project had an early opportunity to run its Simple Cloud Resolving E3SM Atmosphere Model version 1 (SCREAMv1) on Frontier, the first U.S. exascale computer, making it the first cloud-resolving model to run on GPUs on an exascale computer. The team achieved unprecedented results, with the atmospheric component running at a rate exceeding one simulated year per day (SYPD). It accomplished this remarkable feat using 8,192 Frontier nodes, each equipped with four AMD Instinct MI250X GPUs.

Note that this Q&A interview with David and Mark has been edited for clarity and length.

Sharing E3SM Insights

David C. Bader

David leads the climate research program at Lawrence Livermore National Laboratory, a Department of Energy research institution. He has been an active researcher and research manager in meteorology and climate science for 35 years, focusing on the intersection of science with high-performance computational and information technologies. Bader is currently the lead Principal Investigator for the Energy Exascale Earth System Model, a collaboration of eight DOE national laboratories and multiple universities.

Mark A. Taylor

Mark is a mathematician who specializes in numerical methods for parallel computing and geophysical flows. He currently serves as Chief Computational Scientist for the DOE’s Energy Exascale Earth System Model (E3SM) project. He developed the Hamiltonian structure preserving formulation of the spectral element method used in E3SM’s atmospheric component model.

Q: How does E3SM simulate Earth’s water cycle, biogeochemistry, and cryosphere?

David: The original model has four main components: an atmospheric general circulation model (akin to weather prediction models), an ocean circulation model, a sea ice model for icy regions, and a land model, which includes sub-models for elements like vegetation. Sometimes, rivers are considered a separate component, connecting land and ocean. For climate change analysis, it’s essential to have a land ice component to account for melting glaciers in places like Greenland and Antarctica that contribute to rising sea levels.

What sets E3SM apart is its coupling with a multi-sector dynamics model, previously known as an integrated assessment model. This model connects climate change to economic and energy systems, integrating a human component into the analysis. It enables us to evaluate climate change’s impact on energy consumption, energy production source choices, and more. We’re currently in the early stages of implementing this integrated approach.

Mark: I think it’s also worth mentioning the flux coupler. This component connects all the models by facilitating communication between them. For instance, when precipitation occurs in the atmosphere, the coupler transfers this data to the ocean model, ensuring that each component accurately reflects the current state of the climate system.
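To make the coupler’s role concrete, here is a minimal, hypothetical C++ sketch of the idea Mark describes. It is not the E3SM coupler, which also handles regridding between component grids and conservation checks; the Component and Field types below are purely illustrative.

```cpp
// Illustrative sketch only (not the actual E3SM coupler): each component
// exposes named export and import fields, and the coupler moves data between
// components once per coupling step.
#include <string>
#include <unordered_map>
#include <vector>

// A named surface field on a shared exchange grid (hypothetical layout).
using Field = std::vector<double>;

struct Component {
  std::unordered_map<std::string, Field> exports;  // fields this model produces
  std::unordered_map<std::string, Field> imports;  // fields this model consumes
};

// One coupling step: hand the atmosphere's precipitation to the ocean.
// A real coupler would also regrid between the two component grids and
// check that water is conserved; here we simply copy the field.
void couple_precipitation(const Component& atmosphere, Component& ocean) {
  ocean.imports["precipitation"] = atmosphere.exports.at("precipitation");
}
```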

I would also just add that our model, like most climate models, simulates Earth’s atmosphere using differential equations. When we know the differential equations governing aspects such as atmospheric motion, the model solves those equations. For other aspects where the equations are unknown, we approximate them using parameterizations.
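As a schematic illustration of that split (not an equation taken from E3SM’s documentation), the evolution of a quantity $q$ such as water vapor might be written as

\[
\frac{\partial q}{\partial t} + \mathbf{u} \cdot \nabla q \;=\; P(q;\, \theta),
\]

where the left-hand side is the resolved transport, solved directly from known physics given the wind field $\mathbf{u}$, and $P$ is a parameterization whose form and tunable parameters $\theta$ stand in for processes, such as cloud microphysics, that are too small or too uncertain to simulate directly.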

Q: What simulations does E3SM conduct to address key Earth system science questions?

David: For us, as for most modeling groups, establishing credibility is crucial. Every few years, the Coupled Model Intercomparison Project (CMIP) conducts a series of climate change simulations that are featured in IPCC [Intergovernmental Panel on Climate Change] reports. These simulations involve the input of various scenarios, including future emissions of carbon dioxide, aerosols, and other atmospheric constituents, to project the future climate. Our model participates in these simulations, along with those of other major modeling groups.

One feature of our model that’s not unique but also not common is the use of regionally refined areas, particularly over North America. This allows us to obtain a more detailed view of climate change in specific regions of interest within the US or elsewhere in the world. Most models run with a horizontal grid spacing of around 100 kilometers in both the atmosphere and ocean. However, our model can refine this grid spacing down to 10 kilometers in both the atmosphere and ocean, without having to apply high refinement everywhere. This offers a more focused approach to studying regional climate change impacts.

With the latest exascale model, we’ve even achieved a resolution of three kilometers, which is capable of resolving individual storms. The simulations generated by this model resemble actual weather events, providing a more detailed and accurate representation of the Earth’s climate system. Previously, high-resolution models typically had a horizontal grid spacing of around 25 kilometers; three kilometers is a resolution better than most of today’s weather prediction models.

Mark: A three-kilometer resolution is a goal for many climate modeling centers, as it enables storm and cloud-resolving scales. However, achieving this level of detail requires significant computational resources.
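A rough, standard scaling argument (not a figure quoted by the interviewees) shows why. Halving the horizontal grid spacing $\Delta x$ roughly doubles the number of grid columns in each horizontal direction and also forces a proportionally shorter time step, so cost per simulated year grows roughly as

\[
\text{cost} \propto \frac{1}{(\Delta x)^{3}}, \qquad \left(\frac{100\ \text{km}}{3\ \text{km}}\right)^{3} \approx 3.7 \times 10^{4},
\]

meaning a 3-kilometer global simulation is on the order of tens of thousands of times more expensive per simulated year than a 100-kilometer one, before accounting for additional vertical levels or output.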

Q: How does E3SM incorporate human influences on Earth’s systems?

David: The most significant and obvious human impact on climate change is the emission of greenhouse gases, primarily carbon dioxide and methane, resulting from the consumption of fossil fuels for our global energy needs. Human activities also produce atmospheric aerosols, which can have a temporary cooling effect but ultimately mask the actual warming that is occurring. So as we clean up air pollution, we may inadvertently exacerbate climate change.

Another way humans affect the climate is by altering land use. Converting forests and rainforests into farmland, rerouting rivers, and constructing dams can change the way land interacts with the atmosphere, influencing long-term climate patterns. Understanding past, present, and future land use changes can help us predict their impact on regional and global climate.

Mark: The CMIP, which the IPCC relies on for many of its reports, includes scenarios for things like future CO2 concentrations, for example the highest expected levels, or scenarios where the world makes a heroic effort to reduce CO2. So different CO2 levels are often prescribed in, say, a forest model. For some of the other things Dave mentioned, we try to model and predict rather than prescribe.

Q: How are exascale computing architectures impacting atmospheric modeling?

Mark: The DOE recently delivered the world’s first exascale computer, Frontier, and we finally got our Simple Cloud Resolving E3SM Atmosphere Model (SCREAM) code running on Frontier’s advanced architecture. The exascale computing capacity enables us to run very high-resolution cloud-resolving models, allowing us to simulate the motions in the atmosphere responsible for cloud formation, rather than approximating them.

The DOE has been working on building US exascale computers for some time. The hardware chosen for both Frontier and the upcoming Aurora system is GPU-based, although climate models have run on CPUs for decades. So adapting our codes (and it’s a huge amount of code) to run on GPUs represented a significant change, and it took roughly five years to accomplish.

In general, atmospheric modeling is divided into two components: the dynamical core [how air moves and the atmosphere behaves] and the physical parameterizations. We began by focusing on the dynamical core, spending about two years exploring various approaches, programming models, and parallelization strategies. GPUs have numerous computing cores, necessitating the exposure of a lot of parallelism. Once we figured that out, we switched to a more sophisticated dynamical core suitable for high resolution. The process of porting that dynamical core went much faster, as our programming model and parallelization approach had been established.

At that point, the team was well established and had a good understanding of how to work with GPUs, so we turned our focus to the second component, the physical parameterizations, often referred to as physics. The team spent about a year rewriting this code to run on GPUs.

The final stage involved developing the driver, which integrates all the components, handles I/O, and communicates with the land, ice, and ocean components through the coupler.

Something that’s unique to our project is that the code is written in C++… Climate models are usually developed in Fortran, and while there are approaches to get Fortran to use GPUs, we didn’t think they were mature enough, so we ended up rewriting the code in C++. And then we use a performance portability layer called Kokkos that allows us to support both GPUs and CPUs.
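As a flavor of what that looks like (a minimal sketch under the assumptions noted in the comments, not E3SM source code), a Kokkos kernel such as the following compiles unchanged for a CPU or GPU back end selected at build time:

```cpp
// Minimal Kokkos sketch (not E3SM code): one parallel kernel that runs on
// whichever back end the library was built for (e.g., CUDA, HIP, OpenMP,
// or serial). The column and level counts are illustrative.
#include <Kokkos_Core.hpp>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int ncol = 1024;  // hypothetical number of atmospheric columns
    const int nlev = 128;   // hypothetical number of vertical levels

    // 2D field allocated in the memory space of the default execution space
    // (device memory on a GPU build, host memory on a CPU build).
    Kokkos::View<double**> temperature("temperature", ncol, nlev);

    // Expose column x level parallelism as one flattened index range so a
    // GPU has enough independent work items to keep its many cores busy.
    Kokkos::parallel_for(
        "init_temperature", ncol * nlev, KOKKOS_LAMBDA(const int idx) {
          const int i = idx / nlev;  // column index
          const int k = idx % nlev;  // level index
          temperature(i, k) = 288.0 - 0.5 * k;  // toy vertical profile
        });

    Kokkos::fence();  // wait for the kernel to finish before shutting down
  }
  Kokkos::finalize();
  return 0;
}
```

Because the same source builds against Kokkos’s CUDA, HIP, or CPU back ends, one code base can target NVIDIA GPUs, AMD GPUs, and conventional CPUs.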

We’re quite satisfied with the performance and deployability of our models, as we have maintained CPU performance on various CPU systems, including those with ARM chips and more traditional Intel and AMD-based CPUs. At the same time, our code is running on GPUs from two different vendors, AMD and Nvidia, each with their distinct programming models. We also hope to be running on Intel GPUs soon.

We hope that our approach is robust enough to accommodate new architectures down the road, but since it is difficult to predict upcoming developments, we can’t guarantee that.

David: Another thing worth mentioning is that the ability to manage the data from our new models is not advancing at the same rate the computational hardware is. One significant challenge that many simulation groups, including ours, face is managing the vast amount of output generated by these models. We need to find ways to make the volume of data more manageable. One approach we’d like to implement is running ensembles of individual simulations all at once and then collecting statistics from that group of simulations to create more compact output.
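As a hypothetical illustration of that idea (not the E3SM workflow; the field sizes and update scheme are invented for the example), running statistics can be accumulated across ensemble members as they complete, so that only a mean and variance per grid point need to be written instead of every member’s full output:

```cpp
// Hypothetical sketch: fold each ensemble member's field into running
// statistics (Welford's algorithm) so only the mean and variance are stored.
#include <cstddef>
#include <vector>

struct FieldStats {
  std::vector<double> mean;  // running mean at each grid point
  std::vector<double> m2;    // running sum of squared deviations
  std::size_t count = 0;     // ensemble members folded in so far

  explicit FieldStats(std::size_t npoints) : mean(npoints, 0.0), m2(npoints, 0.0) {}

  // Incorporate one ensemble member's field into the running statistics.
  void add_member(const std::vector<double>& field) {
    ++count;
    for (std::size_t i = 0; i < field.size(); ++i) {
      const double delta = field[i] - mean[i];
      mean[i] += delta / static_cast<double>(count);
      m2[i] += delta * (field[i] - mean[i]);
    }
  }

  // Sample variance at each grid point once all members are in.
  std::vector<double> variance() const {
    std::vector<double> var(mean.size(), 0.0);
    if (count > 1) {
      for (std::size_t i = 0; i < var.size(); ++i) {
        var[i] = m2[i] / static_cast<double>(count - 1);
      }
    }
    return var;
  }
};
```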

Q: What are some of the big challenges faced by the E3SM project as you dive deeper into exascale computing?

David: Finding individuals who understand both the science and computational systems required to write code for simulation models can be challenging. We often need to train them ourselves because it’s not as simple as assigning a programmer to create an object. Each team member needs to have a solid understanding of the science and a deep understanding of the machine.

Mark leads our computational science group, a small, tight-knit team of around 20 experts. It has taken us 10 years to assemble this team, and it would be difficult to replace any one of them, because each possesses a unique combination of aptitude, experience, and skills.

Q: That’s a great segue into the “I Am HPC” theme for SC23. What advice would you give to students or other researchers interested in getting involved with the E3SM program?

Mark: For E3SM model development, Dave is right that scientists with domain expertise who also enjoy computing and software development are the best fit for this type of work. The level of computer and computational science knowledge required to work with these models today is significantly greater than when I started, so I would recommend both a background in mathematical modeling and the software engineering skills that make it easier to work with large complex codes.

Q: How do you see the findings and results of the E3SM project benefiting society, and what are some practical applications of its research?

David: Climate modeling as a whole, not just our models specifically, is incredibly important right now. Although these projections—and it’s crucial to note that they are projections rather than predictions—are imperfect, the output from various models will help policymakers make more informed decisions in the face of uncertainty. 

The fact that the climate is changing is widely recognized, and there are all kinds of people who need to make decisions based on timeframes of 10 to 30 years. For example, they need to decide where to build infrastructure or what size of heating and cooling systems to install in buildings. To be more useful, models must improve and become more location-specific and time-specific. That’s the direction we’re heading towards.

Mark: In a model development project, you try and make the best model you can, and you have your hands full just trying to do that. Fortunately, there’s a highly skilled community of model developers building models used by climate scientists for hypothesis testing, so we are making important progress.

Learn More

To learn more about the E3SM model and research, please visit the E3SM website.
