AI Scientists Secured by Agents

Songshan Lake Materials Laboratory

A team of researchers from the University of Science and Technology of China and the Zhongguancun Institute of Artificial Intelligence has developed SciGuard, an agent-based safeguard designed to control the misuse risks of AI in chemical science. By combining large language models with principles and guidelines, external knowledge databases, relevant laws and regulations, and scientific tools and models, SciGuard ensures that AI systems remain both powerful and safe, achieving state-of-the-art defense against malicious use without compromising scientific utility. This study not only highlights the dual-use potential of AI in high-stakes science, but also provides a scalable framework for keeping advanced technologies aligned with human values.

The Promise and Peril of AI in Science

In recent years, AI has led a new paradigm for scientific research, transforming how discoveries are made and how knowledge advances. Systems can now propose new synthetic routes for molecules, predict toxicity before drugs reach clinical trials, and even assist scientists in planning experiments. These capabilities are not just speeding up routine work but reshaping the foundations of scientific research itself.

Yet with this promise comes peril. Just as AI can suggest how to make life-saving medicines, it can also reveal ways to synthesize highly toxic compounds or identify new routes for banned chemical weapons. Large language models (LLMs) are advanced AI systems trained on massive collections of text, beyond generating human-like responses, they can also act as agents that plan steps, reason through problems, and call external tools to complete complex tasks. This agentic capability has accelerated progress in many areas of science, but it also raises new risks: because LLMs operate through natural language, potentially dangerous information may be only a well-crafted prompt away.

"AI has transformative potential for science, yet with that power comes serious risks when it is misused." said the research team. "That's why we build SciGuard that don't just make AI smarter, but also make it safer."

An Agent at the Gate: How SciGuard Works

Although modifying the underlying AI models can introduce safety constraints, such interventions may come at the cost of reduced performance or limited adaptability. Instead, the team develop SciGuard that operates as an intelligent safeguard for AI models. When a user submits a request, whether to analyze a molecule or to propose a synthesis, SciGuard steps in. It interprets intent, cross-checks with scientific guidelines, consults external databases of hazardous substances, and applies regulatory principles before allowing an answer to pass through.

In practice, this means that if someone asks an AI system a dangerous question, such as how to make a lethal nerve agent, SciGuard will refuse to answer. But if the query is legitimate, such as asking about the safe handling of a laboratory solvent, SciGuard can providing a detailed, scientifically sound answer based on its knowledge, curated knowledge bases, and specialized scientific tools and models.

Built as an LLM-driven agent, SciGuard orchestrates planning, reasoning, and tool-use actions like retrieving laws, consulting toxicology datasets, and testing hypotheses with scientific models, and then updates its plan from the results to ensure safe, useful answers.

Balancing Safety with Scientific Progress

One of SciGuard's most important point is that it enhances safety without undermining scientific utility. To put this balance to the test, the team built a dedicated evaluation benchmark called SciMT (Scientific Multi-Task), which challenges AI systems across both safety-critical and everyday scientific scenarios. The benchmark spans red-team queries, scientific knowledge checks, legal and ethical questions, and even jailbreak attempts, providing a realistic way to measure whether an AI is both safe and useful.

In these evaluations, SciGuard consistently refused to provide dangerous outputs while still delivering accurate and helpful information for legitimate purposes. This balance matters. If restrictions are too strict, they could limit innovation and make AI less useful in real-world situations. On the other hand, if the rules are too weak, technology could be misused. By achieving this balance and validating it systematically with SciMT, SciGuard offers a model for integrating safeguards into scientific AI more broadly.

A Framework for the Future and a Shared Responsibility

The researchers emphasize that SciGuard is not just about chemistry. The same approach could extend to other high-stakes domains such as biology and materials science. To support this broader vision, they have made SciMT openly available to encourage collaboration across research, industry, and policy.

The unveiling of SciGuard comes at a time when more people and Governments around the world are worried about using AI responsibly. In science, misuse could pose tangible threats to public health and safety. By providing both a safeguard and a shared benchmark, the team aims to set an example of how AI risks can be mitigated proactively.

"Responsible AI isn't only about technology, it's about trust," the team said. "As scientific AI becomes more powerful, aligning it with human values is essential."

The research has been recently published in the online edition of AI for Science, an interdisciplinary and international journal that highlight the transformative applications of artificial intelligence in driving scientific innovation.

Reference: Jiyan He et al 2025 AI Sci. 1 015002

/Public Release. This material from the originating organization/author(s) might be of the point-in-time nature, and edited for clarity, style and length. Mirage.News does not take institutional positions or sides, and all views, positions, and conclusions expressed herein are solely those of the author(s).View in full here.

You might also like