JUPITER to Be World's Most Powerful AI Supercomputer

Forschungszentrum Juelich

15 November 2023

JUPITER will be installed at Forschungszentrum Jülich next year as Europe's first exascale supercomputer. At the world's largest supercomputing conference, SC23, which runs from 12 to 17 November in Denver, USA, further technical details of the system have now been announced. In an interview, Prof. Dr. Dr. Thomas Lippert, director of the Jülich Supercomputing Centre, tells us about the hardware that will be used for JUPITER.

The system is set to be the first supercomputer in Europe to break the barrier of one quintillion arithmetic operations per second (one million times one million times one million, or a "1" followed by 18 zeros), taking scientific simulations to a new level and enabling breakthroughs in the use of artificial intelligence.

Prof. Dr. Dr. Thomas Lippert, what is so special about the hardware with which JUPITER will break the exascale barrier in 2024?

JUPITER is a dynamic modular supercomputer with two parts: a highly scalable Booster module for particularly compute-intensive problems, which is massively supported by GPUs, and a Cluster module that can be used very universally for all kinds of applications, especially for complex, data-intensive tasks. Both modules can solve scientific problems separately or together, depending on what is required.

As was just announced at SC23, the Booster module comprises close to 24,000 NVIDIA GH200 GPUs. According to the Linpack benchmark, which is usually used as a reference, JUPITER is set to achieve a performance of just over 1 exaflops and additionally provide outstanding AI computing power.

"With over 90 exaflops at 8 bits, we will have perhaps the fastest AI supercomputer in the world for suitable AI applications!"

PROF. DR. DR. THOMAS LIPPERT
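
To put the quoted figures in proportion, here is a rough back-of-the-envelope sketch. It uses only the rounded numbers from the interview (close to 24,000 GPUs, just over 1 exaflops in Linpack, over 90 exaflops at 8 bits) and assumes, purely for illustration, that performance is spread evenly across the GPUs. The results are order-of-magnitude estimates, not official per-GPU specifications.

```python
# Rough arithmetic with the rounded figures quoted in the interview.
# Assumption: performance is divided evenly across all Booster GPUs.
EXA = 10**18  # a "1" followed by 18 zeros

# One million times one million times one million is exactly one exa.
assert EXA == (10**6) ** 3

gpus = 24_000             # "close to 24,000" GH200 GPUs (rounded)
linpack_flops = 1 * EXA   # "just over 1 exaflops" (Linpack benchmark)
ai_8bit_flops = 90 * EXA  # "over 90 exaflops at 8 bits"

linpack_per_gpu_tflops = linpack_flops / gpus / 1e12
ai_per_gpu_pflops = ai_8bit_flops / gpus / 1e15
ai_to_linpack_ratio = ai_8bit_flops / linpack_flops

print(f"Linpack share per GPU: about {linpack_per_gpu_tflops:.1f} teraflops")
print(f"8-bit AI share per GPU: about {ai_per_gpu_pflops:.2f} petaflops")
print(f"8-bit AI vs. Linpack: about {ai_to_linpack_ratio:.0f}x")
```

With these rounded inputs, the sketch gives roughly 41.7 teraflops per GPU for Linpack and 3.75 petaflops per GPU at 8 bits, a factor of about 90 between the two.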

All JUPITER compute nodes are connected to a high-performance network using the latest NVIDIA Mellanox InfiniBand technology. The Booster module is supplied by the French IT company Eviden, formerly ATOS. The Cluster module is equipped with new European ARM CPUs from SiPearl and is supplied by the German company ParTec, a specialist in high-performance computing (HPC for short). ParTec is also responsible for the dynamic modular operation of the system. JUPITER uses technologies Made in Europe for both its hardware and many of its software components - this really is something special!

How important is the choice of processors for a supercomputer like JUPITER? With office PCs, the processor type now often only plays a subordinate role...

In high-performance computing, the processor chosen is crucial. This aspect was one of the main criteria for the experts selected by EuroHPC JU when awarding the contract for JUPITER. GPUs - graphics processing units - play a key role here. For certain tasks, they are superior to universal processors - CPUs - by almost one order of magnitude!

The decisive factor here is the high degree of parallelism. Conventional CPUs are designed to process complex tasks very quickly one after the other. CPUs therefore typically have fewer but very powerful processing cores. GPUs, in contrast, have many more processing cores than CPUs, but these are not quite as powerful on their own.

The new NVIDIA GH200 GPUs, for example, have many tensor cores per chip, which enable rapid calculations in the field of artificial intelligence. SiPearl, on the other hand, promises a huge memory data rate of 0.5 bytes per flop with its Rhea CPU, which is almost five times as much as a GPU - and therefore offers a high level of efficiency for complex, data-intensive applications.
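
The bytes-per-flop figure can be illustrated with a small sketch. The 0.5 bytes per flop and the factor of roughly five come from the interview; everything else is an assumption made for illustration: the helper names `bytes_per_flop` and `attainable_fraction`, the example bandwidth and peak values, the kernel's data demand of 0.25 bytes per flop, and the simple roofline-style estimate that a kernel is limited by whichever of compute or memory supply runs out first.

```python
def bytes_per_flop(memory_bandwidth_bytes_per_s: float, peak_flops: float) -> float:
    """Bytes the memory system can deliver per floating-point operation."""
    return memory_bandwidth_bytes_per_s / peak_flops


def attainable_fraction(machine_bytes_per_flop: float,
                        kernel_bytes_per_flop: float) -> float:
    """Roofline-style estimate: fraction of peak a kernel can reach.

    If the kernel needs more bytes per flop than the machine supplies,
    it is memory-bound and reaches only the ratio of the two.
    """
    return min(1.0, machine_bytes_per_flop / kernel_bytes_per_flop)


# Figures from the interview (rounded).
cpu_balance = 0.5                # bytes per flop quoted for the Rhea CPU
gpu_balance = cpu_balance / 5.0  # "almost five times" less, so about 0.1

# Hypothetical hardware numbers, chosen only to reproduce the 0.5 ratio.
example_balance = bytes_per_flop(500e9, 1e12)  # 500 GB/s at 1 teraflops

# Hypothetical data-intensive kernel needing 0.25 bytes per flop.
kernel_demand = 0.25
cpu_fraction = attainable_fraction(cpu_balance, kernel_demand)
gpu_fraction = attainable_fraction(gpu_balance, kernel_demand)

print(f"CPU-like balance: {cpu_fraction:.0%} of peak attainable")
print(f"GPU-like balance: {gpu_fraction:.0%} of peak attainable")
```

Under these assumptions, the data-hungry kernel could run at full peak on the machine with 0.5 bytes per flop, but only at about 40% of peak on the one with 0.1. The GPU may still be faster in absolute terms because its peak is much higher; the sketch only shows how efficiently each peak can be used.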

In fact, the system's network also plays a huge role, especially in dynamic modular operation. The high-performance network is based on the latest-generation NVIDIA Mellanox InfiniBand NDR and uses a topology comprising several groups, into which individual system modules can be mapped, while still maintaining a strong connection between the individual groups.

Critics have recently complained that supercomputers in Germany and Europe are not sufficiently designed for training neural networks for AI. How is JUPITER doing in this respect?

We have been providing AI compute time for scientists at the Jülich Supercomputing Centre for years. In particular, the installation of the GPU-based JUWELS Booster in 2020 - which was at that time Europe's fastest supercomputer - proved to be groundbreaking for the use of AI models. We designed the system for AI applications very early on. The JUWELS Booster has more than 900 compute nodes, each providing 4 GPUs and a very high network bandwidth; deep neural networks are quite simply a perfect match for these GPUs!

"Europe has both the necessary computer performance as well as the expertise in software development to be innovative in AI."

PROF. DR. DR. THOMAS LIPPERT

Our users are increasingly training AI models on the system. With the large foundation models, the entire system is often fully utilized at once. Other European centres involved in EuroHPC JU have also put more supercomputers with GPU accelerators into operation since 2021. JUPITER now represents a further milestone. This means that Europe has both the necessary computer performance as well as the expertise in software development to be innovative in AI. The German Federal Ministry of Education and Research (BMBF) recently published an AI action plan, which also deserves a mention here.

What other applications is JUPITER designed for?

Support for scientific HPC applications was the focus from the very beginning. We designed JUPITER bearing in mind our current user base and the applications expected in the future. In addition to AI applications, there is a large field of HPC-driven sciences that we support with JUPITER. These include highly parallel applications that also benefit from the GPU-based Booster module - for example, climate simulations, fluid dynamics simulations, or molecular dynamics simulations. We have a heterogeneous user base with the best applications from many scientific domains. The JUPITER Cluster module focuses on applications that require greater serial performance and more memory bandwidth - for example, the irregular access patterns of elementary physics simulations. Thanks to JUPITER's modular architecture, applications can also use both modules simultaneously and thus utilize computing resources efficiently. A special type of such heterogeneous application combines conventional HPC simulations with AI methods to increase accuracy and efficiency. With JUPITER, we are ideally positioned for this.

