KAIST Unveils AI Inference-Scaling Method

Korea Advanced Institute of Science and Technology

<(From left) Professor Sungjin Ahn, Ph.D. candidate Jaesik Yoon, M.S. candidate Hyeonseo Cho, M.S. candidate Doojin Baek, Professor Yoshua Bengio>

<Ph.D. candidate Jaesik Yoon from Professor Ahn's research team>

Diffusion models are widely used in many AI applications, but research on efficient inference-time scalability*, particularly for reasoning and planning (known as System 2 abilities), has been lacking. In response, a KAIST research team has developed a new technology that enables high-performance, efficient inference for planning based on diffusion models. The technology demonstrated its performance by achieving a 100% success rate on a giant maze-solving task that no existing model had solved. The results are expected to serve as core technology in fields requiring real-time decision-making, such as intelligent robotics and real-time generative AI.

*Inference-time scalability: Refers to an AI model's ability to flexibly adjust performance based on the computational resources available during inference.

KAIST (President Kwang Hyung Lee) announced on the 20th that a research team led by Professor Sungjin Ahn in the School of Computing has developed a new technology that significantly improves the inference-time scalability of diffusion-based reasoning through joint research with Professor Yoshua Bengio of the University of Montreal, a world-renowned scholar in deep learning. This study was carried out as part of a collaboration between KAIST and Mila (Quebec AI Institute) through the Prefrontal AI Joint Research Center.

Inference-time scaling is gaining attention as a core AI capability: after training, it allows a model to use additional computational resources during inference to solve complex reasoning and planning problems that cannot be addressed merely by scaling up data or model size. However, the diffusion models currently used across various applications lack effective methodologies for implementing such scalability, particularly for reasoning and planning.

To address this, Professor Ahn's research team collaborated with Professor Bengio to propose a novel diffusion model inference technique based on Monte Carlo Tree Search. This method explores diverse generation paths during the diffusion process in a tree structure and is designed to efficiently identify high-quality outputs even with limited computational resources. As a result, it achieved a 100% success rate on the "giant-scale maze-solving" task, where previous methods had a 0% success rate.
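The tree-structured search described above follows the general Monte Carlo Tree Search pattern of selection, expansion, rollout, and backpropagation. The sketch below is a minimal, generic MCTS over a toy planning problem, not the team's Monte Carlo Tree Diffusion algorithm; the branching choices, reward function, and hidden target sequence are all hypothetical stand-ins for candidate denoising paths and plan quality.

```python
import math
import random

class Node:
    """One node per partial plan; branches are candidate continuations."""
    def __init__(self, path=()):
        self.path = path      # branch choices made so far
        self.children = {}    # branch index -> Node
        self.visits = 0
        self.value = 0.0      # running sum of rollout rewards

def mcts(reward_fn, branching=3, depth=4, iters=2000, c=1.4, rng=None):
    """Search for the branch sequence maximizing reward_fn(path)."""
    rng = rng or random.Random(0)
    root = Node()
    for _ in range(iters):
        node, path = root, []
        # 1. Selection: descend by the UCB1 rule while fully expanded.
        while len(node.children) == branching and len(path) < depth:
            def ucb(a):
                ch = node.children[a]
                return ch.value / ch.visits + c * math.sqrt(
                    math.log(node.visits) / ch.visits)
            a = max(node.children, key=ucb)
            path.append(a)
            node = node.children[a]
        # 2. Expansion: add one untried child (unless at a leaf).
        if len(path) < depth:
            a = rng.choice([b for b in range(branching)
                            if b not in node.children])
            node.children[a] = Node(tuple(path) + (a,))
            path.append(a)
            node = node.children[a]
        # 3. Rollout: complete the plan randomly and score it.
        rollout = path + [rng.randrange(branching)
                          for _ in range(depth - len(path))]
        r = reward_fn(tuple(rollout))
        # 4. Backpropagation: update statistics along the visited path.
        walk = root
        walk.visits += 1
        walk.value += r
        for a in path:
            walk = walk.children[a]
            walk.visits += 1
            walk.value += r
    # Read out the most-visited branch sequence from the root.
    best, node = [], root
    while node.children:
        a = max(node.children, key=lambda b: node.children[b].visits)
        best.append(a)
        node = node.children[a]
    return best

# Toy reward: plans matching a hidden target sequence score highest.
target = (2, 0, 1, 2)
reward = lambda p: sum(x == t for x, t in zip(p, target)) / len(target)
print(mcts(reward))  # best-first plan found by the search
```

The key property this illustrates is that search effort concentrates on promising subtrees, so output quality can improve as the iteration budget grows; in the team's setting, each branch would correspond to a candidate denoising continuation rather than a discrete toy action.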

In follow-up research, the team also addressed the proposed method's major drawback, its slow speed. By efficiently parallelizing the tree search and optimizing computational cost, they achieved results of equal or superior quality up to 100 times faster than the previous version, demonstrating both the method's inference capability and its real-time applicability.
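One generic way to parallelize a tree search, illustrated below, is to batch the expensive rollout-scoring step across workers; this is only a sketch of the general idea, not the team's Fast Monte Carlo Tree Diffusion scheme, and `score_plan` is a hypothetical stand-in for an expensive diffusion denoising and evaluation step.

```python
from concurrent.futures import ThreadPoolExecutor

def score_plan(plan):
    # Stand-in for an expensive rollout evaluation (e.g., denoising a
    # candidate plan and scoring it); here just a trivial sum.
    return sum(plan)

def score_batch(plans, workers=4):
    """Score many candidate plans concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(score_plan, plans))

candidates = [(1, 2), (3, 0), (2, 2)]
print(score_batch(candidates))  # prints [3, 3, 4]
```

In practice, a diffusion-based planner would batch candidate plans through the model on a GPU rather than use Python threads, but the principle is the same: evaluating many tree branches per step amortizes cost and shortens wall-clock planning time.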

Professor Sungjin Ahn stated, "This research fundamentally overcomes the limitations of existing planning methods based on diffusion models, which required high computational cost," adding, "It can serve as core technology in various areas such as intelligent robotics, simulation-based decision-making, and real-time generative AI."

The research results were presented as Spotlight papers (top 2.6% of all accepted papers) by doctoral student Jaesik Yoon of the School of Computing at the 42nd International Conference on Machine Learning (ICML 2025), held in Vancouver, Canada, from July 13 to 19.

※ Paper titles: Monte Carlo Tree Diffusion for System 2 Planning (Jaesik Yoon, Hyeonseo Cho, Doojin Baek, Yoshua Bengio, Sungjin Ahn, ICML 25), Fast Monte Carlo Tree Diffusion: 100x Speedup via Parallel Sparse Planning (Jaesik Yoon, Hyeonseo Cho, Yoshua Bengio, Sungjin Ahn)

※ DOI: https://doi.org/10.48550/arXiv.2502.07202, https://doi.org/10.48550/arXiv.2506.09498

This research was supported by the National Research Foundation of Korea.
