Gordon Bell Finalist Team Pushes Scale Of Rocket Simulation On El Capitan


Researchers used Lawrence Livermore National Laboratory's (LLNL) exascale supercomputer El Capitan to perform the largest fluid dynamics simulation ever - surpassing one quadrillion degrees of freedom in a single computational fluid dynamics (CFD) problem. The team focused the effort on rocket-rocket plume interactions.

El Capitan is funded by the National Nuclear Security Administration's (NNSA) Advanced Simulation and Computing (ASC) program. The work - performed in part before the world's most powerful supercomputer transitioned to classified operations earlier this year - is led by researchers from the Georgia Institute of Technology (Georgia Tech) and supported by partners at AMD, NVIDIA, HPE, Oak Ridge National Laboratory (ORNL) and New York University's (NYU) Courant Institute.

The paper is a finalist for the 2025 ACM Gordon Bell Prize, the highest honor in high-performance computing. This year's winner - selected from a small handful of finalists - will be announced at the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC25) in St. Louis on Nov. 20.

To tackle the extreme challenge of simulating the turbulent exhaust flow generated by many rocket engines firing simultaneously, the team's approach centered on a newly proposed shock-regularization technique called Information Geometric Regularization (IGR), invented and implemented by Georgia Tech professor Spencer Bryngelson, NYU Courant professor Florian Schäfer and Ruijia Cao (now a Ph.D. student at Cornell).

Using all 11,136 nodes and more than 44,500 AMD Instinct MI300A Accelerated Processing Units (APUs) on El Capitan, the team achieved more than 500 trillion grid points, or 500 quadrillion degrees of freedom. They further extended this work to ORNL's Frontier, surpassing one quadrillion degrees of freedom. The simulations were conducted with MFC, a permissively licensed open-source code maintained by Bryngelson's group, and captured the full exhaust dynamics of a complex configuration inspired by SpaceX's Super Heavy booster.

The simulation sets a new benchmark for exascale CFD performance and memory efficiency. It also paves the way for computation-driven rocket design, replacing costly and limited physical experiments with predictive modeling at unprecedented resolution, according to the team.

Georgia Tech's Bryngelson, the project's lead, said the team used specialized techniques to make efficient use of El Capitan's architecture.

"In my view, this is an intriguing and marked advance in the fluid dynamics field," Bryngelson said. "The method is faster and simpler, uses less energy on El Capitan, and can simulate much larger problems than prior state-of-the-art­ - orders of magnitude larger."

The team accessed El Capitan through prior collaborations with LLNL researchers and worked with LLNL's El Capitan Center of Excellence and HPE to use the machine on the classified network. LLNL facilitated the effort as part of system-scale stress testing ahead of El Capitan's classified deployment; the run serves as a public example of the system's full capabilities before it was turned over for classified use in support of the NNSA's core mission of stockpile stewardship.

"We supported this work primarily to evaluate El Capitan's scalability and system readiness," said Livermore Computing's Development Environment Group Leader Scott Futral. "The biggest benefit to the ASC program was uncovering system software and hardware issues that only appear when the full machine is exercised. Addressing those challenges was critical to operational readiness."

While the actual compute time was relatively short, the bulk of the effort focused on debugging and resolving issues that emerged at full system scale. Futral added that internal LLNL teams, including those working on tsunami early warning and inertial confinement fusion, have also conducted full-system science demonstrations on El Capitan - efforts that reflect the Lab's broader commitment to mission-relevant exascale computing.

A next-generation challenge solved with next-generation hardware

As private-sector spaceflight expands, launch vehicles increasingly rely on arrays of compact, high-thrust engines rather than a few massive boosters. This design provides manufacturing advantages, engine redundancy and easier transport, but also creates new challenges. When dozens of engines fire together, their plumes interact in complex ways that can drive searing hot gases back toward the vehicle's base, threatening mission success, researchers said.

The solution depends on understanding how those plumes behave across a wide range of conditions. While wind tunnel experiments can test some of the physics, only large-scale simulation can capture the full picture at high resolution and under changing atmospheric conditions, engine failures or trajectory shifts. Until now, such simulations were too costly and memory-intensive to run at meaningful scales, especially in an era of multi-engine boosters, according to the team.

To break that barrier, the researchers replaced traditional "shock capturing" methods - which struggle with high computational cost and complex flow configurations - with their IGR approach, which reformulates how shock waves are handled in the simulation, enabling a non-diffusive treatment of the same physics. With IGR, more stable results can be computed more efficiently.
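
In schematic terms (a sketch of the general idea, not the exact formulation from the team's paper), conventional shock capturing stabilizes shocks by adding numerical dissipation to the governing equations, whereas IGR leaves them dissipation-free and instead augments the pressure with a regularizing field computed from an auxiliary elliptic equation:

    % Schematic contrast only; q: conserved variables, F: flux, u: velocity,
    % rho: density, p: pressure, alpha: regularization strength.
    % Conventional shock capturing adds numerical dissipation near shocks:
    \partial_t q + \nabla \cdot F(q) = \nabla \cdot \left( \nu_{\mathrm{num}} \, \nabla q \right)
    % IGR (schematic): no dissipation term; the momentum flux uses an
    % augmented pressure p + \Sigma, with \Sigma from an elliptic solve:
    \partial_t q + \nabla \cdot \tilde{F}(q;\, p + \Sigma) = 0,
    \qquad \Sigma = \mathcal{E}_{\alpha}(\nabla u, \rho)

Here \mathcal{E}_{\alpha} stands in for the specific elliptic relation defined in the IGR papers; the key point is that no artificial viscosity term appears, which is what keeps the method non-diffusive.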

With IGR in place, the team focused on scale and speed. Their optimized solver took advantage of El Capitan's unified memory APU design and leveraged mixed-precision storage via AMD's new Flang-based compiler to pack more than 100 trillion grid cells into memory without performance degradation.
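
As a rough illustration of why memory capacity and storage precision matter at this scale, the back-of-envelope sketch below estimates El Capitan's aggregate HBM and the per-cell byte budget. The 128 GB-per-APU capacity, the variable count and the buffer count are illustrative assumptions, not figures from the team's paper.

    # Back-of-envelope memory budget for a full-system El Capitan CFD run.
    # Assumptions (illustrative): 128 GB of HBM3 per MI300A APU, 6 flow
    # variables per grid cell, and one extra working copy of the state.

    NODES = 11_136            # El Capitan compute nodes (from the article)
    APUS_PER_NODE = 4         # MI300A APUs per node (from the article)
    HBM_PER_APU_GB = 128      # assumed HBM3 capacity per MI300A
    CELLS = 100e12            # >100 trillion grid cells (from the article)
    VARS_PER_CELL = 6         # assumed: density, 3 momenta, energy, +1 aux
    COPIES = 2                # assumed: current state + one working buffer

    total_hbm_bytes = NODES * APUS_PER_NODE * HBM_PER_APU_GB * 1e9
    print(f"Aggregate HBM: {total_hbm_bytes / 1e15:.1f} PB")

    for label, bytes_per_value in [("fp64", 8), ("fp32", 4), ("mixed fp32/fp16", 3)]:
        state_bytes = CELLS * VARS_PER_CELL * COPIES * bytes_per_value
        share = state_bytes / total_hbm_bytes
        print(f"{label:>16}: {state_bytes / 1e15:.1f} PB "
              f"({100 * share:.0f}% of aggregate HBM)")

Under these assumptions, a fully double-precision state would exceed the machine's roughly 5.7 PB of HBM outright, while reduced-precision storage leaves headroom for solver work arrays - the kind of pressure that mixed-precision storage is meant to relieve.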

The result was a full-resolution CFD simulation that ran across El Capitan's full system - roughly 20 times larger than the previous record for this class of problem. The simulation tracked the plumes of 33 rocket engines exhausting at Mach 10, capturing the moment-to-moment evolution of plume interactions and heat-recirculation effects in fine detail.

At the heart of the study was El Capitan's unique hardware architecture, which provides four AMD MI300A APUs per node - each combining CPU and GPU chiplets that directly access the same physical memory. For CFD problems that demand both a large memory footprint and high computational throughput, that design proved essential, avoiding the overhead of the software-managed unified-memory strategies that separate-memory systems require.

The team conducted scaling tests on multiple systems, including ORNL's Frontier and the Swiss National Supercomputing Centre's Alps; of these, only El Capitan offers a physically shared CPU-GPU memory architecture. That unified memory, based on AMD's MI300A design, allowed the entire dataset to reside in a single addressable memory space, eliminating data-transfer overhead and enabling larger problem sizes.

"We needed El Capitan because no other machine in the world could run a problem of this size at full resolution without compromises," said Bryngelson. "The MI300A architecture gave us unified memory with zero performance penalty, so we could store all our simulation data in a single memory space accessible by both CPU and GPU. That eliminated overhead, cut down the memory footprint and let us scale across the full system. El Capitan didn't just make this work possible; it made it efficient."

The researchers also achieved an 80-fold speedup over previous methods, reduced the memory footprint by a factor of 25 and cut energy-to-solution by more than a factor of five. By combining algorithmic efficiency with El Capitan's chip design, they showed that simulations of this size can be completed in hours, not weeks.
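
To put "hours, not weeks" in concrete terms, a small illustrative calculation (the two-week baseline is hypothetical, not a figure from the paper):

    # Illustrative only: what an 80x speedup means for wall-clock time.
    baseline_days = 14   # hypothetical runtime with prior methods
    speedup = 80         # speedup reported by the team
    hours = baseline_days * 24 / speedup
    print(f"{baseline_days} days / {speedup}x speedup = {hours:.1f} hours")  # ~4.2 hours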

Implications for spaceflight and beyond

While the simulation focused on rocket exhaust, the underlying method applies to a wide range of high-speed compressible flow problems - from aircraft noise prediction to biomedical fluid dynamics, the researchers said. The ability to simulate such flows without introducing artificial viscosity or sacrificing resolution could transform modeling across multiple domains and highlights a key design principle behind El Capitan: pairing breakthrough hardware with real-world scientific impact.

"From day one, we designed El Capitan to enable mission-scale simulations that were not previously feasible," said Bronis R. de Supinski, chief technology officer for Livermore Computing. "We're always interested in projects that help validate performance and scientific usability at scale. This demonstration provided insight into El Capitan's behavior under real system stress. While we're supporting multiple efforts internally - including our own Gordon Bell submission - this was a valuable opportunity to collaborate and learn from an external team with a proven code."
