Smart AI Agents Realized with Top DB Integration Tech

Korea Advanced Institute of Science and Technology

<(From left) Engineer Jeongho Park from GraphAI, Ph.D. candidate Geonho Lee, and Prof. Min-Soo Kim from KAIST>

Companies have long used relational databases (DBs) to manage data. With the growing use of large AI models, however, integration with graph databases is increasingly required. This integration reveals limitations such as cost burden, data inconsistency, and difficulty processing complex queries.

Our research team has developed a next-generation graph-relational DB system that addresses these problems together, and it is expected to be applicable to industrial sites immediately. With this technology, AI will be able to go beyond simple searches and reason about complex relationships in real time, making smarter AI services possible.

The research team led by Professor Min-Soo Kim announced on September 8 that it has developed a new DB system named 'Chimera' that fully integrates relational and graph DBs to efficiently execute graph-relational queries. Chimera demonstrated world-class performance on international standard performance benchmarks, processing queries at least 4 times and up to 280 times faster than existing systems.

Unlike existing relational DBs, graph DBs represent data as vertices (nodes) and edges (connections), which gives them a strong advantage in analyzing and reasoning about closely intertwined information such as people, events, places, and time. Thanks to this feature, their use is spreading rapidly in fields such as AI agents, SNS, finance, and e-commerce.

With the growing demand for complex query processing between relational DBs and graph DBs, a new standard language, 'SQL/PGQ,' which extends relational query language (SQL) with graph query functions, has also been proposed.

SQL/PGQ adds graph traversal capabilities to the existing database language (SQL) so that a single query can cover both table-like data and connected information such as people, events, and places. With it, relationship questions such as 'which company does my friend's friend work for?' can be expressed much more simply than before.
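To make the "friend's friend" example concrete, the following is a minimal Python sketch of what such a hybrid query computes: a two-hop graph traversal followed by a lookup in a relational-style table. It is not SQL/PGQ syntax, and the data, the `knows` and `works_for` names, and the function name are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy data; names and schema are illustrative assumptions only.
knows = {  # graph side: person -> list of friends (directed edges)
    "me": ["alice", "bob"],
    "alice": ["carol", "me"],
    "bob": ["dave"],
    "carol": [],
    "dave": [],
}
works_for = {  # relational side: person -> company
    "alice": "Acme",
    "bob": "Globex",
    "carol": "Initech",
    "dave": "Acme",
}


def friends_of_friends_employers(start):
    """Return {friend_of_friend: company} for people exactly two hops from start."""
    direct = set(knows.get(start, []))
    result = {}
    for friend in direct:                      # first hop
        for fof in knows.get(friend, []):      # second hop
            # Skip the starting person and anyone who is already a direct friend.
            if fof != start and fof not in direct:
                result[fof] = works_for.get(fof)  # relational lookup
    return result


employers = friends_of_friends_employers("me")
print(employers)
```

The traversal part (two hops over `knows`) and the relational part (the `works_for` lookup) are the two kinds of work that a graph-relational query has to combine.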

<Diagram (a): This diagram shows the typical architecture of a graph query processing system based on a traditional RDBMS. It has separate dedicated operators for graph traversal and an in-memory graph structure, while attribute joins are handled by relational operators. However, this structure makes it difficult to optimize execution plans for hybrid queries because traversal and joins are performed in different pipelines. Additionally, for large-scale graphs, the in-memory structure creates memory constraints, and the method of extracting graph data from relational data limits data freshness. Diagram (b): This diagram shows Chimera's integrated architecture. Chimera introduces new components to the existing RDBMS architecture: a traversal-join operator that combines graph traversal and joins, a disk-based graph storage, and a dedicated graph access layer. This allows it to process both graph and relational data within a single execution flow. Furthermore, a hybrid query planner integrally optimizes both graph and relational operations. Its shared transaction management and disk-based storage structure enable it to handle large-scale graph databases without memory constraints while maintaining data freshness. This architecture removes the bottlenecks of existing systems by flexibly combining traversal, joins, and mappings in a single execution plan, thereby simultaneously improving performance and scalability.>

The problem is that existing approaches either mimic graph traversal with join operations or pre-build a graph view in memory before processing. In the former case, performance drops sharply as the traversal depth increases. In the latter case, execution fails for lack of memory even when the data size grows only slightly. Furthermore, changes to the original data are not immediately reflected in the view, which hurts data freshness, and relational and graph results have to be combined separately, which is inefficient.
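The depth problem of join-based traversal can be illustrated with a toy Python sketch. It assumes a hypothetical edge table in which every vertex has the same number of outgoing edges, and it emulates each extra hop as one more join against that table. It is only an illustration of how intermediate results grow, not a model of any particular system.

```python
def make_edges(fanout, depth):
    """Build a hypothetical edge table: a tree where each vertex has `fanout` children."""
    edges = []
    frontier = [0]
    next_id = 1
    for _ in range(depth):
        new_frontier = []
        for src in frontier:
            for _ in range(fanout):
                edges.append((src, next_id))
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return edges


def join_traverse(edges, start, hops):
    """Emulate a k-hop traversal as k joins; return the row count after each join."""
    # Index the edge table by source vertex, like a hash join build side.
    by_src = {}
    for src, dst in edges:
        by_src.setdefault(src, []).append(dst)

    paths = [(start,)]          # each row is one path found so far
    sizes = []
    for _ in range(hops):
        paths = [p + (dst,) for p in paths for dst in by_src.get(p[-1], [])]
        sizes.append(len(paths))
    return sizes


sizes = join_traverse(make_edges(fanout=3, depth=4), start=0, hops=4)
print(sizes)
```

With a fan-out of 3, the number of intermediate rows triples with every additional hop, which is why deep traversals expressed purely as joins become expensive.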

The KAIST research team's 'Chimera' fundamentally solves these limitations. To do so, the team redesigned both the storage layer and the query processing layer of the database.

First, the research team introduced a 'dual-store structure' that operates graph-specific storage and relational data storage together. They then applied a 'traversal-join operator' that processes graph traversal and relational operations simultaneously, allowing complex operations to be executed efficiently in a single system. Thanks to this, Chimera has established itself as the world's first graph-relational DB system that integrates the entire process, from data storage to query processing, into one.
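The general idea of keeping two stores side by side and combining traversal with a join in one pass can be sketched in Python as follows. This is a loose, in-memory illustration under stated assumptions, not Chimera's actual design (which, per the description above, uses disk-based graph storage inside a DBMS). The `DualStore` class, the `traversal_join` function, and the sample data are hypothetical.

```python
class DualStore:
    """Toy stand-in: a relational table and an adjacency index held in one object."""

    def __init__(self):
        self.person = {}   # relational side: id -> row (dict of attributes)
        self.knows = {}    # graph side: id -> list of neighbor ids

    def add_person(self, pid, name, company):
        self.person[pid] = {"name": name, "company": company}

    def add_knows(self, src, dst):
        self.knows.setdefault(src, []).append(dst)


def traversal_join(store, start, hops):
    """Walk `hops` edges from `start`, then join each reached vertex with its row."""
    frontier = [start]
    for _ in range(hops):
        frontier = [n for v in frontier for n in store.knows.get(v, [])]
    for v in frontier:
        row = store.person.get(v)
        if row is not None:                    # inner-join semantics
            yield (v, row["name"], row["company"])


store = DualStore()
for pid, name, company in [(1, "Ann", "Acme"), (2, "Ben", "Globex"),
                           (3, "Cho", "Initech"), (4, "Dana", "Acme"),
                           (5, "Eli", "Globex")]:
    store.add_person(pid, name, company)
for src, dst in [(1, 2), (1, 3), (2, 4), (3, 5)]:
    store.add_knows(src, dst)

two_hop = list(traversal_join(store, start=1, hops=2))
print(two_hop)
```

Because the query reads the stores directly instead of a precomputed view, an edge added with `add_knows` is visible to the very next call of `traversal_join`, which mirrors the freshness argument made in the article.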

As a result, Chimera recorded world-class performance on the international standard performance benchmark 'LDBC Social Network Benchmark (SNB),' running at least 4 times and up to 280 times faster than existing systems.

Queries do not fail for lack of memory no matter how large the graph data becomes, and because Chimera does not use views, there is no delay in data freshness.

Professor Min-Soo Kim stated, "As the connections between data become more complex, the need for integrated technology that encompasses both graph and relational DBs is increasing. Chimera is a technology that fundamentally solves this problem, and we expect it to be widely used in various industries such as AI agents, finance, and e-commerce."

The study was co-authored by Geonho Lee, a Ph.D. student in the KAIST School of Computing, as the first author, and Jeongho Park, an engineer at Professor Kim's startup GraphAI Co., Ltd., as the second author, with Professor Kim as the corresponding author.

The research results were presented on September 1st at VLDB, a world-renowned international academic conference in the field of databases. In particular, the newly developed Chimera technology is expected to have an immediate industrial impact as a core technology for implementing 'high-performance AI agents based on RAG' (retrieval-augmented generation, that is, a smart AI assistant with search capabilities). It will be applied to 'AkasicDB,' a vector-graph-relational DB system scheduled to be released by GraphAI Co., Ltd.

*Paper title: Chimera: A System Design of Dual Storage and Traversal-Join Unified Query Processing for SQL/PGQ
*DOI: https://dl.acm.org/doi/10.14778/3705829.3705845

This research was supported by the Ministry of Science and ICT's IITP SW Star Lab and the National Research Foundation of Korea's Mid-Career Researcher Program.
