Domain Unlearning Boosts Safety, Control in AI Models

Tokyo University of Science

Vision-language models (VLMs) are a core technology of modern artificial intelligence (AI). They can handle images from many different domains, that is, different forms of visual expression such as photographs, illustrations, and sketches. Their high generalization ability allows them to accurately recognize objects in images regardless of the domain. However, this generalization ability also carries risks. For example, a VLM recognizes both real cars and illustrated cars as "cars." If such a model is installed in a system that must react to real vehicles, there is a risk that a car illustrated in a roadside advertisement may be mistaken for a real vehicle, leading to a serious accident. To put safe and reliable AI into practical use, it is essential to establish technology that appropriately controls learned knowledge according to the application.

Addressing this issue, a team of researchers led by Associate Professor Go Irie from Tokyo University of Science has proposed the approximate domain unlearning (ADU) algorithm, which allows a VLM to "forget" specific domains so that it can no longer recognize objects in them. For example, according to the team's results, a model can be made unable to recognize illustrated cars while real-life vehicles are still recognized with high accuracy.
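The goal described above, keeping accuracy on some domains while deliberately losing it on others, can be pictured as a two-part training objective. The following is a minimal sketch under stated assumptions, not the team's published formulation: the function name `adu_objective`, the use of cross-entropy for retained domains, and the use of prediction entropy for forgotten domains are all illustrative choices.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def adu_objective(logits, labels, forget_mask):
    """Hypothetical unlearning objective (an assumption, not the paper's loss).

    logits:      (n_samples, n_classes) class scores from the model
    labels:      (n_samples,) integer class labels
    forget_mask: (n_samples,) True where the sample comes from a domain to forget
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    forget_mask = np.asarray(forget_mask, dtype=bool)

    probs = softmax(logits)
    eps = 1e-12
    # Cross-entropy of the true class: low when the prediction is correct.
    ce = -np.log(probs[np.arange(len(labels)), labels] + eps)
    # Prediction entropy: high when the model is maximally unsure.
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)

    retain = ~forget_mask
    retain_loss = ce[retain].mean() if retain.any() else 0.0
    # Minimizing negative entropy pushes forgotten-domain predictions toward uniform.
    forget_loss = -entropy[forget_mask].mean() if forget_mask.any() else 0.0
    return float(retain_loss + forget_loss)
```

With this sketch, a model that is confident and correct on a real-car photo but undecided on an illustrated car scores a lower loss than a model that confidently recognizes both.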

This innovative study, co-authored by Mr. Kodai Kawamura and Mr. Yuta Goto from Tokyo University of Science, Japan, Dr. Rintaro Yanagi from the National Institute of Advanced Industrial Science and Technology (AIST), and Dr. Hirokatsu Kataoka from AIST and the University of Oxford, was presented at the 39th Conference on Neural Information Processing Systems (NeurIPS 2025), held from November 30 to December 7, 2025.

"AI technologies have long pursued the ability to recognize objects accurately across all domains, as seen in decades of research on domain adaptation and domain generalization. While such versatility remains important, the emergence of VLMs with remarkable domain generalization ability made us realize that this assumption itself deserves to be re-examined. This idea led us to conceive ADU—a new approach that deliberately enables models to forget specified domains, when necessary," explains Prof. Irie.

The main technical difficulty is that domains are not clearly distinguished inside a VLM. Because different domains overlap in the feature space, it is difficult to select and forget only specific domains. To address this, the team introduced a method called Domain Disentangling Loss, which promotes separation between domains in the feature space and captures the different domain appearances in each image. Furthermore, by introducing an instance-wise prompt generator, the proposed algorithm reduces recognition accuracy for the domains to be forgotten while minimizing the loss of accuracy for the domains to be retained. This enables flexible knowledge control suited to individual practical scenarios that was previously impossible, such as making a model unable to recognize cars in illustrations while keeping its ability to recognize real cars.
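To make the idea of separating domains in the feature space more concrete, here is a minimal sketch of one possible separation measure. It is an assumption for illustration only and should not be read as the Domain Disentangling Loss from the study: the function name `domain_separation_loss`, the centroid-based domain classifier, and the `temperature` parameter are all hypothetical. The loss is small when each image feature lies closest to the centroid of its own domain and approaches ln(number of domains) when domains overlap completely.

```python
import numpy as np

def domain_separation_loss(features, domain_ids, temperature=0.1):
    """Hypothetical domain-separation loss (an assumption, not the paper's loss).

    features:   (n_samples, dim) image features, e.g. from a VLM image encoder
    domain_ids: (n_samples,) domain label per sample (photo, illustration, ...)
    """
    features = np.asarray(features, dtype=float)
    domain_ids = np.asarray(domain_ids)

    # Work with unit-length features so similarity is cosine similarity.
    features = features / np.linalg.norm(features, axis=1, keepdims=True)

    # One centroid per domain, also normalized to unit length.
    domains = np.unique(domain_ids)
    centroids = np.stack([features[domain_ids == d].mean(axis=0) for d in domains])
    centroids = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)

    # Treat similarity to each centroid as a domain-classification score.
    logits = features @ centroids.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy against the true domain of each sample.
    targets = np.searchsorted(domains, domain_ids)
    return float(-log_probs[np.arange(len(targets)), targets].mean())
```

In this sketch, features whose domains form distinct clusters give a loss near zero, while features whose domain centroids coincide give a loss near ln 2 for two domains, so minimizing it would encourage the kind of separation the paragraph describes.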

Interestingly, ADU introduces a new perspective on risk management. While the concept of managing AI risks has long existed, the very generalization ability of modern AI models can sometimes generate new risks. This research presents a framework for building AI that can be flexibly configured according to individual usage scenarios, ensuring both safety and adaptability.

"While AI performance is becoming more sophisticated, in order to promote sustainable industrial applications, it is necessary to adapt it to practical scenarios. We feel that our system, which allows us to freely control functions, will enable us to provide the world with safe and reliable AI technology," concludes Prof. Irie.
