The AI company Anthropic has presented a much-discussed AI model that independently identifies - and, in some cases, also exploits - security gaps in software. ETH Professor of Cyber Security Florian Tramèr explains what the "Claude Mythos Preview" model can do, where the risks lie, and what this model means for cyber security.

Professor Tramèr, with "Claude Mythos Preview", Anthropic has presented a new AI model that is causing a sensation around the world. What exactly is this model?
Florian Tramèr: Claude Mythos Preview is the next generation of the Claude models from Anthropic - in simple terms, it can be described as Anthropic's counterpart to ChatGPT from its competitor OpenAI. There have been huge advances in these large language models in recent months, particularly when it comes to programming. The better the models can understand and write computer code, the better they can also recognise errors - or "bugs" - in the code. These bugs can be harmless but, in some cases, can be used as a way to attack computer systems. The previous model, Claude Opus 4.6, had already identified a surprisingly large number of security-relevant errors in complex software such as the operating system Linux or the web browser Firefox. Here, it appears that Claude Mythos goes a significant step further.
Where does this step lead? What's so special about this model?
According to Anthropic, you can give the model huge quantities of source code and then simply say: "Look for security gaps." At some point, it will actually find something. In the past, this work called for highly specialised experts or very elaborate tools. In this case, it appears that comparatively little human expertise is needed. It's impressive - and, at the same time, worrying.
Why is it worrying?
Because it makes cyber attacks easier. A single hacker can suddenly try out thousands of variants. If one attack fails, he or she can simply try the next one. This increases the risks for companies, state institutions and even private individuals - particularly if such models become cheaper and more efficient.
Is Claude Mythos a "cyber weapon"?
I wouldn't go that far. Even if a model identifies a vulnerability, the steps between that and a functioning attack are often complex. There are many protective mechanisms that must be bypassed, and this still requires specialist knowledge. However, these models could help less-experienced attackers achieve much more than before. That's precisely where the danger lies.
Do such models ultimately strengthen the defenders or the attackers?
That's the big question that remains to be answered. In the best-case scenario, these AI systems would identify many critical gaps before they were discovered by criminal hackers. That would make software more secure overall. In the worst-case scenario, however, so many new vulnerabilities emerge that the defenders can no longer keep up with fixing them. One key problem is closing the gaps.
What do you mean? Surely that's why we have specialists.
Security gaps are often closed in a software update. However, many people and organisations either don't install updates or install them very late. If many more critical security gaps are suddenly discovered, there's a greater risk that systems that haven't been updated are open to attack on a massive scale.
Anthropic itself is talking about a "quantum leap" in cyber security. Do you share that assessment?
It's difficult to assess that right now because we still don't have enough information. Many of the identified security gaps haven't been publicly documented yet because they need to be closed first. Only then can we truly evaluate how serious they are. One thing is clear, however: Claude Mythos has found gaps that, in some cases, had eluded previous tools for years.
Anthropic has decided not to publish Claude Mythos but rather to make it available only to selected partners. Is that sensible in your view?
As yet, Anthropic itself doesn't seem to know the exact extent of the model's capabilities. In a situation like this, it may be sensible to make the model available only to security professionals for the time being in order to gain some experience. In the best-case scenario, it may transpire that the risks are manageable - and the model could then be published at some point. If the security gaps are actually serious, it will take time to identify and rectify particularly critical vulnerabilities.
Critics accuse Anthropic of fear mongering. Is this criticism justified?
OpenAI followed a similar strategy with earlier models - in retrospect, that seemed exaggerated. But today we're seeing how quickly AI has developed. Models that could barely write texts a few years ago are now uncovering security gaps in operating systems. Whether Anthropic is being overly cautious will only become clear in hindsight. However, the fact that big companies like Google or Microsoft are participating in this project suggests there's more to it than just marketing.
Anthropic is only granting access to Claude Mythos to US tech companies. Does this approach not strengthen the USA's hegemony in terms of AI?
There's a risk of that, yes. Anthropic appears to be pursuing a highly US-centric security logic, with Europe playing only a minor role. Models of this kind are potentially relevant to national security, intelligence services and the military. If access remains limited to US stakeholders, these stakeholders will have a head start for some time. In the development of AI, however, we repeatedly see that other commercial providers and even open-source models generally catch up relatively quickly.
How has Anthropic been able to make such huge technological advances?
Anthropic itself says that it didn't train the model specially for cyber security but rather for programming. Programming is particularly well suited to AI training because its results are easy to check: code either works or it doesn't. Moreover, many developers are actively using Claude, which has presumably provided Anthropic with large quantities of valuable training data. On top of that, Claude Mythos is seemingly an extremely large and expensive model - one that, so far, would have been too costly to offer commercially to the general public.
Will this make cyber security obsolete?
No, I don't think so. But it is changing. People will work less on individual lines of code and more on the software's architecture and design. AI will be one of several tools for checking security. Whether software becomes less secure as a whole remains to be seen. On the one hand, we can identify more bugs; on the other, AI is also being used to write much more software, which often contains new security gaps. So, it's something of a race.
What does this mean for everyday life?
The basic rules of security still apply: update software on a regular basis, be careful with access rights, and don't install any unknown programs. What changes is the amount of effort involved. Attacks are becoming more targeted, numerous and sophisticated. For companies that develop or run software, security is becoming even more important - and more elaborate. Claude Mythos is not a complete paradigm shift - but it sends a clear signal: the ground rules of cyber security are changing.
Thank you for talking to us.
You're welcome.