Tech Slashes Deep Learning Optimization Time by 50%

Abstract

We introduce Bayesian code diffusion, a new deep learning program optimization strategy devised to accelerate the auto-tuning process of deep learning compilers. By taking the concepts of prior and posterior distributions from the Bayesian framework and reformulating them in the context of deep learning program optimization, the proposed approach efficiently searches for optimal program code in a significantly reduced search space through an iterative diffusion of program code. To further enhance the efficiency of program optimization, we propose pre-training and fine-tuning of the cost model, which improves both the model's predictive accuracy and training efficiency. We implement Bayesian code diffusion in Ansor and evaluate its performance on a wide range of deep learning models on both CPUs and GPUs. Existing approaches struggle to reliably generate high-performing deep learning programs, i.e., programs achieving low execution latency, across various configurations, including diverse deep learning model architectures and hardware platforms (CPU and GPU). In contrast, Bayesian code diffusion reduces the end-to-end compilation (optimization) time required to reach equivalent program execution latency across various setups, achieving up to a 3.31× optimization speedup. This substantial improvement demonstrates that Bayesian code diffusion performs efficient and principled deep learning program optimization across a wide range of deep learning models, operators, and hardware (CPU and GPU).

Researchers at UNIST have introduced a groundbreaking method to drastically reduce the time needed for optimizing deep learning programs.

Led by Professor Seulki Lee in the Department of Computer Science and Engineering, the team's innovative approach has been accepted for presentation at the prestigious Operating Systems Design and Implementation (OSDI) conference, one of the most influential events in computer systems research. This achievement is particularly notable, as only a handful of papers with Korean first authors have been selected over the conference's history.

Transforming AI models into executable programs involves a crucial step called compilation, where high-level code is converted into machine instructions that hardware such as GPUs and CPUs can understand. Auto-tuning, a process that searches through countless code configurations to find the most efficient one, is central to this step. However, traditional auto-tuning can take considerable time, sometimes hours, and consumes substantial power.

The team focused on the repetitive calculations common in deep learning models. By sharing information among similar operations, they narrowed the search space for optimal code configurations. Instead of starting from scratch each time, they re-used previous results, dramatically speeding up the tuning process.
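The idea of seeding the search for one operator with the tuned result of a similar one can be illustrated with a toy sketch. This is a hypothetical illustration, not the actual Ansor or Bayesian code diffusion implementation: it tunes a single made-up "tile size" knob for two similar operators, tuning the first from scratch and then reusing its best configuration as a prior that narrows the search window for the second.

```python
import random

def latency(tile, optimum):
    """Toy cost model: latency grows with distance from the (unknown) optimum."""
    return 1.0 + 0.01 * (tile - optimum) ** 2

def tune(optimum, prior=None, trials=50, seed=0):
    """Random search for the best tile size.

    With a prior (a configuration diffused from a similar operator),
    only a narrow window around it is explored; without one, the full
    range is searched from scratch.
    """
    rng = random.Random(seed)
    lo, hi = (prior - 4, prior + 4) if prior is not None else (1, 256)
    best_tile, best_lat = None, float("inf")
    for _ in range(trials):
        tile = rng.randint(lo, hi)
        lat = latency(tile, optimum)
        if lat < best_lat:
            best_tile, best_lat = tile, lat
    return best_tile, best_lat

# Operator A is tuned from scratch over the full range; its result then
# seeds the search for the similar operator B, whose true optimum is
# nearby, so B needs far fewer trials.
tile_a, lat_a = tune(optimum=96, trials=200, seed=0)
tile_b, lat_b = tune(optimum=100, prior=tile_a, trials=20, seed=1)
```

The point of the sketch is the shrunken search window: operator B explores only a few candidates around the inherited prior instead of the full configuration range, which is the intuition behind reusing results across similar subgraphs.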

Applying this approach to the widely used Ansor auto-tuning framework, the researchers achieved an average speedup of 2.5 times on CPUs and 2 times on GPUs, cutting the optimization time by more than half without compromising performance.

Figure 1. The concept of Bayesian code diffusion: a sufficiently optimized prior parameter of one subgraph is propagated to similar subgraphs (posteriors). Then, posterior parameters are derived and refined from the prior for each subgraph via code diffusion using a Bayesian formulation, enabling efficient deep learning program optimization.

Professor Seulki Lee explained, "Reducing compilation time not only makes better use of computational resources but also lowers power consumption, contributing to more efficient AI development."

This research was led by graduate researcher Je-su Jeong and supported by the Institute for Information & Communications Technology Planning & Evaluation (IITP) under the Ministry of Science and ICT.

OSDI, held annually alongside the Symposium on Operating Systems Principles (SOSP), is one of the foremost conferences in the field of systems research. Past notable AI contributions, including Google's TensorFlow, have been introduced there. This year, out of 338 submissions, only 48 papers were accepted, including works from UNIST and Seoul National University. The conference took place in Boston, United States, from July 7 to 9, 2025.

Journal Reference

Isu Jeong and Seulki Lee, "Bayesian Code Diffusion for Efficient Automatic Deep Learning Program Optimization," OSDI '25, (2025).
