Researcher Making AI Safer

Javier Rando is undertaking a PhD at ETH Zurich on the safety of artificial intelligence (AI). At the same time, he is working at a tech company with the same goal. He is convinced that if AI can be made safe, it could become one of the best technologies ever.

Javier Rando believes in the potential of artificial intelligence. At ETH Zurich and Anthropic, he is conducting research into making AI safer. (Image: Jasmin Frei / ETH Zurich)

Even as a teenager, Javier Rando was fascinated by technology, computers and the idea that robots could one day take over human tasks. The 26-year-old native of Málaga, Spain, was never a gamer, but he did programme a smartphone app on his computer before he even had his own smartphone. Later, his fascination with technology took him to Barcelona to study mathematical engineering and data science. It was there that he first encountered artificial intelligence, particularly through the application of algorithms and machine learning.

It was during his Bachelor's degree that his image of artificial intelligence first began to falter. "I was horrified," he recalls. He had read in the media that courts in the United States were using AI to assess the risk of recidivism among offenders. That AI, however, systematically classified people as more dangerous solely because they were Black. "That changed my focus. I decided to devote my Bachelor's thesis to fairness in artificial intelligence," says Rando.

Enormous potential, both positive and negative

After that, he became increasingly interested in AI safety. He followed up his Bachelor's with a Master's degree in Computer Science at ETH Zurich. "I chose ETH because I wanted to study at the best university for computer science in Europe," he says. He was fascinated by Professor Florian Tramèr's research on AI safety, so he applied for a PhD. He was accepted and also received a fellowship position at the ETH AI Center.

AI Center Fellows typically work across disciplines, combining different areas of AI. Rando focused on the safety of language models. "Today, language models are the most widely used form of artificial intelligence. Millions of people use them. This means they also pose the greatest risk of causing harm," he says. At the same time, he emphasises that his view of AI is not fundamentally negative: "Artificial intelligence has the potential to become one of the best technologies in human history." It could improve our lives in many ways, for example by helping to develop cures for diseases. But there are many risks along the way.

PhD studies in San Francisco

AI is not yet so advanced that its dangers are particularly great. "But that will change. We are on a dangerous path because the progress is lightning fast and we're developing a very powerful technology," says Rando. Most people are unaware of this because they use AI for harmless purposes. Anyone who asked it for a recipe two years ago and does so again today will not notice a huge difference. But that belies the potential of AI. The risks increase significantly as soon as people begin using AI more and more as so-called agents, for example by instructing an AI to respond to emails automatically. This is of interest to criminals, who can email the AI a command telling it to disregard the task set by the user and to send the attacker the user's credit card details instead.

Rando sees two fundamental dangers. First, AI is vulnerable to attacks and manipulation. "There's still a long way to go before these loopholes are closed," he says. Second, AI can be misused by people with bad intentions even without manipulation: it could, for example, produce instructions for building weapons.

"But the benefits and opportunities outweigh the risks. That's why I'm researching ways to make artificial intelligence safer," says Rando. He is working on this in his doctoral studies and in various professional roles, which have taken him to high-profile AI companies such as OpenAI, Meta and now Anthropic. After spending the first part of his doctoral studies in Zurich, he has been living in San Francisco since early 2025 as an external doctoral candidate, while also working at the US AI company Anthropic, the developer of the AI model "Claude", which is subject to strict ethical guidelines.

Searching for vulnerabilities

Both in his doctoral studies and at Anthropic, Javier Rando is working on simulating manipulations and attacks on artificial intelligence, searching for vulnerabilities and closing these gaps with appropriate programming. At the same time, the aim is to prevent people with malicious intentions from being able to misuse AI in the first place. "We need to build protective shields around AI," he says.

The aim is for artificial intelligence to understand when humans are asking it to perform harmful actions and to be able to refuse to carry out such tasks. This requires appropriate programming, but not all AI manufacturers consider it equally important: "In AI development, the bottom line has to be safety first, not profit." That calls for a regulatory framework. Rando recently contributed his expertise as an advisor to a European Commission working group on developing guidelines for AI manufacturers, which are voluntary for the time being.

Javier Rando's work at Anthropic and his research at ETH Zurich are closely intertwined. His dissertation will consist of several publications, for which he can also draw on the research he has conducted at Anthropic. When asked about the difference between researching at universities and at tech companies, he says, "Both approaches are needed to make the AI of the future sufficiently safe and to be able to reap its rewards." The advantage of being a researcher at a company is that it gives you access to unlimited computing power and you know the company's AI inside out. The upside of researching at a university, by contrast, is that it enables you to explore and shape research questions more freely and to test solutions that are riskier from an economic perspective.

Over the next few years, Javier Rando plans to continue researching the safety of AI applications at Anthropic while completing his PhD. And after that? It's still unclear what the future holds. "Developments in the field of AI are happening so fast that we can't predict what the world will look like in two years' time. Perhaps in a few years' time, I'll be replaced by AI at work as well," says Rando.

