
< From left: Professor Jongse Park, M.S. candidate Jaehong Cho, M.S. candidate Hyunmin Choi, Professor Brandon Reagen ISPASS >
Operating Large Language Model (LLM) services like ChatGPT requires a server infrastructure on the scale of tens of thousands of units. However, building physical equipment every time a new AI semiconductor or system architecture needs to be verified is extremely costly and time-consuming. A research team at our university has developed a 'virtual testbed' that can pre-verify performance and efficiency in simulation before an actual large-scale AI server is built.
KAIST announced on May 29th that the research on a Large Language Model (LLM) serving infrastructure simulator (virtual testing software) developed by Professor Jongse Park's research team in the School of Computing won the Best Paper Award at 'ISPASS 2026 (IEEE International Symposium on Performance Analysis of Systems and Software),' a world-renowned conference in the field of computer system performance analysis.
'LLMServingSim 2.0,' developed by the research team, is a simulation platform capable of virtually analyzing various hardware and software combinations in complex AI service environments. Researchers and developers can freely experiment with various design options and verify performance without having to directly build expensive, large-scale server infrastructures.

< LLMServingSim 2.0 is workload >
In particular, this technology is drawing attention because it goes beyond the existing Graphics Processing Unit (GPU)-centric environment to support diverse hardware environments, including Neural Processing Units (NPUs), which are rising as next-generation AI semiconductors, and Processing-In-Memory (PIM, a semiconductor technology that performs operations inside the memory).
In other words, the technology allows future AI semiconductors that have not yet been commercialized to be tested in advance within a virtual datacenter environment. With it, researchers can reproduce and analyze in simulation how much service speed improves, how much power consumption falls, and whether the system operates stably when a specific semiconductor is applied, even in a server environment scaled to tens of thousands of units.
In addition, it reproduces at the system level the complex operations that occur when running an actual AI service, such as data processing, request distribution, and memory utilization, enabling performance evaluations that are close to reality. Notably, it can also analyze disaggregated infrastructure environments, in which multiple server resources are separated and connected for use, showing strong potential for next-generation AI datacenter research.
This simulator is expected to be widely utilized not only by researchers but also by LLM service companies and AI semiconductor startups to design and optimize next-generation AI infrastructures. This is because it can rapidly verify new AI semiconductors or service architectures prior to actual construction, thereby significantly reducing the cost and time of AI infrastructure development.

< Research Image (AI-generated image) >
Professor Jongse Park said, "The competitiveness of AI services is determined not only by the model itself but also by the infrastructure technology that operates it stably and efficiently." He added, "We hope this simulator will serve as an important foundation for researchers and the industry to develop next-generation AI infrastructures faster and more efficiently."
This research was led by M.S. candidates Jaehong Cho and Hyunmin Choi in the School of Computing as co-first authors. Following their Best Paper Award at IISWC 2024 (IEEE International Symposium on Workload Characterization), the research team won the Best Paper Award again at ISPASS 2026, once more demonstrating their research competitiveness in the field of AI infrastructure.
※ Paper Title: LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure, DOI: 10.1109/ISPASS69572.2026.00012 (Authors: Jaehong Cho, Hyunmin Choi, Guseul Heo, Jongse Park)
※ Open Source Link: https://llmservingsim.ai/
Meanwhile, this research was conducted with support from the Ministry of Science and ICT (MSIT), the Institute for Information & Communications Technology Planning & Evaluation (IITP, No. RS-2024-00396013), the Electronics and Telecommunications Research Institute (ETRI, No. RS-2025-02305453), and SK hynix.