AI Must Unlearn Old Physics for New Discoveries

Sissa Medialab

A study in the Journal of Cosmology and Astroparticle Physics (JCAP) explores how a machine-learning strategy known as transfer learning could dramatically reduce the computational cost of searching for new physics beyond the standard cosmological model — while also revealing an unexpected risk: sometimes AI systems can become too reliant on what they already know.

Artificial intelligence is widely used in cosmology to analyze the universe. But testing theories beyond the standard cosmological model, known as ΛCDM, remains extremely computationally demanding.

Although ΛCDM successfully describes many properties of the universe — from its expansion to the distribution of galaxies — physicists know it is probably incomplete. Recent observations hint that phenomena such as massive neutrinos, modified gravity or evolving dark energy could point toward new physics beyond the current model.

Testing these alternatives requires running huge numbers of high-precision simulations of virtual universes under different physical assumptions, often demanding enormous computational resources.

Transfer learning, basically a shortcut

The new paper investigates whether transfer learning, a technique in which AI systems reuse knowledge acquired from one task to accelerate learning in another, can make this process far more efficient.

In this case, researchers first trained a neural network on simulations based on ΛCDM — this is known as pretraining — and then adapted it to more complex cosmological models that include possible new physics.

"It's basically a shortcut," explains Adrian Bayer a cosmologist at the Flatiron Institute and Princeton University, co-author of the study. "Usually people train the AI directly on the most computationally expensive simulations. What we do instead is first use simpler and less expensive ΛCDM simulations to give the AI an idea of what's happening, and only afterward move to the more complex models."

The idea resembles studying a difficult subject by first reading an introductory textbook. "You first read a basic book to get an idea of the knowledge," says Bayer, "and then move to the really complicated book."

According to Veena Krishnaraj, an undergraduate student at Princeton University and first author of the paper, this strategy avoids forcing the AI to "digest everything at once."

The results show that this approach can work remarkably well. In some cases, transfer learning reduced the number of expensive simulations needed by more than a factor of ten.

Negative transfer

But the study also uncovered a more subtle phenomenon known as negative transfer.

Returning to Bayer's textbook analogy, it is a bit like studying medicine from an introductory textbook and then encountering a rare disease whose symptoms resemble a common illness: prior knowledge helps most of the time, but it can also push the reader toward the wrong interpretation.

Something similar can happen with AI systems. Sometimes the effects produced by new physics closely resemble patterns already associated with the standard cosmological model. In these cases, the AI tends to interpret the new information using categories learned during pretraining, making it harder — rather than easier — to recognize genuinely new effects.

The researchers observed this behavior in simulations involving massive neutrinos. Certain effects produced by neutrino mass closely resemble variations associated with an existing ΛCDM parameter known as σ8, which describes how strongly matter clusters across the universe. As a result, the pretrained network initially struggled to distinguish between the two effects.
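One way to picture this kind of overlap is to compare how an observable responds to each parameter. The sketch below uses invented response shapes, not a cosmological calculation and not the paper's analysis: a clustering-amplitude shift moves every scale bin equally, while a hypothetical neutrino-like parameter suppresses the signal progressively toward small scales. The bin values and functional form are arbitrary assumptions.

```python
import math

# Hypothetical scale bins in arbitrary units; not taken from the study.
scales = [0.1 * i for i in range(1, 11)]

# Toy response to a clustering-amplitude shift: every bin moves equally.
amplitude_response = [1.0 for _ in scales]

# Toy response to a neutrino-like parameter: a suppression that grows
# with scale index and then saturates. The shape is an arbitrary assumption.
neutrino_response = [-k / (k + 0.1) for k in scales]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

similarity = cosine_similarity(amplitude_response, neutrino_response)

# Share of the neutrino-like signal that a pure amplitude shift
# cannot reproduce (the residual after a least-squares projection).
unexplained = 1.0 - similarity ** 2

print("cosine similarity of the two responses:", round(similarity, 3))
print("fraction not mimicked by amplitude:", round(unexplained, 3))
```

When the two response vectors are nearly parallel, as in this toy, a model that already knows the amplitude parameter can account for most of the new signal by adjusting it, leaving only a small residual to reveal the second parameter.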

"The negative transfer is not random. It is driven by underlying physical degeneracies in the model," says Krishnaraj. In other words, different physical parameters can produce very similar observable effects, making it difficult for the AI to disentangle them correctly. "So this is something we need to be aware of and try to mitigate," she concludes.

The work highlights both the promise and the risks of applying "foundation model" strategies — conceptually similar to those behind modern generative AI and large language models — to fundamental physics. As the authors write in the paper, pretraining can accelerate inference, "but may also hinder learning new physics."

For now, the method has been tested on simulations, laying the groundwork for application to real observational data. The researchers see it as a powerful tool for future cosmological surveys, which in the coming years will generate unprecedented amounts of high-precision data about the universe.

The paper "Transfer Learning Beyond the Standard Model" by Veena Krishnaraj, Adrian E. Bayer, Christian Kragh Jespersen and Peter Melchior is now available in JCAP.
