Frontier Users’ Exascale Climate Emulator Nominated for Gordon Bell Climate Prize

© Carlos Jones, Oak Ridge National Laboratory

A multi-institutional team of researchers led by the King Abdullah University of Science and Technology, or KAUST, in Saudi Arabia has been nominated for the Association for Computing Machinery’s 2024 Gordon Bell Prize for Climate Modelling. The team developed an exascale climate emulator that offers radically enhanced resolution without the computational expense and data storage requirements of state-of-the-art climate models.

This is the fourth time since 2022 that KAUST researchers have been nominated for a Gordon Bell Prize. Team members also include researchers from the National Center for Atmospheric Research, the University of Notre Dame, NVIDIA, Saint Louis University and the Lahore University of Management Sciences.

The winners will be announced at this year’s Supercomputing Conference in Atlanta, Georgia, Nov. 17 to 22.

“Climate models are incredibly complex and can take weeks or months to run, even on the fastest supercomputers. They generate massive amounts of data that become nearly impossible to store, and it’s becoming a bigger and bigger problem as climate scientists are constantly pushing for higher resolution,” said Marc Genton, Al-Khwarizmi distinguished professor of statistics at KAUST. Genton has been designing the emulator’s algorithm for nearly a decade.

“The climate emulator solves two problems: speeding up computations and reducing storage needs,” Genton added. “It’s designed to mimic model outputs on demand without storing petabytes of data. Instead of saving every result, all we have to store are the emulator code and necessary parameters, which, in principle, allows us to generate an infinite number of emulations whenever we need them.”
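The storage idea Genton describes can be illustrated with a toy sketch. The parameters and the plain Gaussian generator below are hypothetical stand-ins, not the team's statistical model; the point is only that a small parameter set plus a seed can regenerate a full field on demand instead of archiving it:

```python
import numpy as np

# Toy illustration of the storage idea: archive a few parameters and a
# seed instead of every simulated field. The real emulator fits a
# spatio-temporal statistical model; this Gaussian stand-in is purely
# hypothetical.
params = {"mean": 15.0, "std": 2.5, "shape": (180, 360)}  # bytes, not petabytes

def emulate(params, seed):
    """Regenerate one emulated field on demand from stored parameters."""
    rng = np.random.default_rng(seed)
    return rng.normal(params["mean"], params["std"], params["shape"])

# The same seed reproduces the same field, so nothing needs archiving.
field_a = emulate(params, seed=7)
field_b = emulate(params, seed=7)
print(np.array_equal(field_a, field_b))  # True
```

Because the generator is deterministic given its seed, any previously "stored" emulation can be reproduced exactly whenever it is needed.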

Earth system models, or ESMs, are supercomputer programs used to calculate changes in the atmosphere, oceans, land and ice sheets. The simulations are based on the quantifiable laws of physics and are some of the most computationally demanding calculations to perform in terms of complexity, power consumption and processing time. Nevertheless, ESMs are essential tools for predicting the impacts of climate change.

“Nowadays, typical global climate models have a resolution of up to 25 kilometers, which is great, but if you want to know the wind conditions and observe storms over a small city, for example, you need much greater resolution in space and time,” Genton said. “So, what we’ve done is fit a statistical model that reproduces the information without the underlying laws of physics to essentially mimic the output of the ESMs.”

A research team led by the King Abdullah University of Science and Technology used the Frontier and Summit supercomputers to help them develop a climate emulator that offers radically enhanced resolution without the need to store massive amounts of data. Credit: KAUST

Less is more

By leveraging the latest advances in graphics processing unit, or GPU, hardware and mixed-precision arithmetic, the team’s climate emulator offers a remarkable resolution of 3.5 kilometers (approximately 2.2 miles) and can replicate local conditions on timescales ranging from days down to hours.

Mixed precision is a computational technique that combines double-, single- and half-precision arithmetic. Double precision carries roughly 15 to 17 significant decimal digits, single precision 6 to 9, and half precision 3 to 4. Double precision is the most accurate but also the most computationally demanding of the three.

“Using mixed precision to improve performance is something rather innovative in the field that also helps us preserve the emulator’s accuracy,” said Sameh Abdulah, a high-performance computing research scientist at KAUST and the first author of the team’s latest study.

“Not every element in the simulation needs to be calculated in double precision. For example, if the temperature outside is 75 degrees, there’s no need to calculate 15 to 17 places past the decimal point to let you know whether you need to take a jacket with you,” Abdulah added. “Mixing the precision allows us to prioritize the accuracy based on the most important elements, which in turn speeds up the overall calculations.”
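The trade-off Abdulah describes can be seen directly with NumPy's half-, single- and double-precision floating-point types. This is a generic illustration of the three formats, not the emulator's code:

```python
import numpy as np

# Machine epsilon: the relative spacing of representable numbers, which
# determines how many significant decimal digits each format carries.
for dtype, label in [(np.float64, "double"), (np.float32, "single"), (np.float16, "half")]:
    eps = np.finfo(dtype).eps
    print(f"{label:6s} eps = {eps:.2e}")  # double ~2.2e-16, single ~1.2e-07, half ~9.8e-04

# A temperature of 75 degrees survives even half precision unscathed:
t = np.float16(75.0)
print(t == 75.0)  # True: 75 is exactly representable in half precision
```

Quantities that only need "jacket or no jacket" accuracy can live in half precision, while the numerically sensitive parts of a computation stay in double, which is the essence of the mixed-precision speedup.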

The emulator uses a spherical harmonic transform, a method that converts elements such as temperature, wind and pressure into simple frequency or waveform patterns to more easily describe how they change over time at more than 54 million locations around the globe.
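A minimal sketch of the representation, using SciPy's `sph_harm` to evaluate a toy field built from a handful of spherical-harmonic modes. The grid, degrees, orders and coefficients below are arbitrary illustrative choices; the emulator itself works at millions of grid points with a heavily optimized transform:

```python
import numpy as np
from scipy.special import sph_harm

# A toy field on the sphere expressed as a weighted sum of a few
# spherical-harmonic modes Y_n^m. All values here are arbitrary.
lon = np.linspace(0, 2 * np.pi, 8, endpoint=False)   # azimuthal angle
colat = np.linspace(0.1, np.pi - 0.1, 5)             # polar angle
T, P = np.meshgrid(lon, colat)

coeffs = {(0, 0): 1.0, (2, 0): 0.5, (3, 2): 0.25}    # (degree n, order m)
field = np.zeros_like(T, dtype=complex)
for (n, m), c in coeffs.items():
    field += c * sph_harm(m, n, T, P)                # Y_n^m on the grid

print(field.real.shape)  # one value per grid point
```

Storing only the coefficients of such an expansion, rather than the gridded field itself, is what lets a spectral representation compress global data so effectively.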

Shown in green, the average resolution of traditional emulators is between 400 and 100 kilometers. The enhanced resolution from 100 to 3.5 kilometers, collected hourly, surpasses traditional emulators by a factor of 245,280. Credit: KAUST

The team has been working on the climate emulator for the past four years, focusing on developing the algorithms and optimizing the code to run efficiently on supercomputers with various architectures.

The emulator is highly scalable and has demonstrated exceptional performance on four of the world’s top 10 most powerful supercomputers, including the Frontier and Summit supercomputers at the Oak Ridge Leadership Computing Facility. The OLCF is a Department of Energy Office of Science user facility and is located at DOE’s Oak Ridge National Laboratory.

The exascale-class HPE Cray Frontier supercomputer is currently ranked no. 1 on the TOP500 list of the world’s fastest supercomputers. The emulator clocked an impressive 0.976 exaflops while leveraging 9,025 Frontier nodes and more than 36,000 AMD Instinct GPUs.

Simulations on the IBM Summit supercomputer, currently ranked no. 9 on the TOP500 list, used 3,072 nodes — about 67% of the machine — and 18,432 NVIDIA V100 GPUs to achieve 0.375 exaflops, running at reduced precision. One exaflops is a quintillion, or a billion billion, calculations per second.

The emulator also performed well on the Alps supercomputer, ranked no. 6, at the Swiss National Supercomputing Centre in Lugano, Switzerland, and on the Leonardo supercomputer, ranked no. 7, at the CINECA data center in Bologna, Italy. The team also made extensive use of KAUST’s Shaheen III, ranked no. 23, the largest and most powerful supercomputer in the Middle East.

“Sustainable computing is another advantage. Getting the answer faster means less storage, which also means saving energy,” Genton said. “Supercomputing requires a lot of energy. By mixing the precision, we reduce the time we need to run, making it more sustainable for climate studies by getting more out of the machine.”

According to Abdulah, the next step the team would like to take is straightforward: “winning the prize.”

In addition to Genton and Abdulah, the list of nominees includes David Keyes, Hatem Ltaief, Yan Song, Gera Stenchikov, and Ying Sun (KAUST); Zubair Khalid (KAUST and Lahore University of Management Sciences); Allison Baker (NCAR); George Bosilca (NVIDIA); Qinglei Cao (Saint Louis University); and Stefano Castruccio (University of Notre Dame).

UT-Battelle manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. The Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.