"AI Tools Are Cultural Artifacts, Not Neutral Software"

How to keep information trustworthy when content is increasingly machine-made? EPFL professor Andrea Cavallaro shares his view.

Few forces have the power to shape our lives as profoundly as information. In our societies, publicly shared facts determine how citizens vote, how patients are treated, how communities respond to crises... But where reliable information reinforces trust and cooperation, corrupted information can also erode these bonds.

This has always been true. What has changed dramatically in recent years is the scale and speed at which information can be produced, manipulated, and spread. Artificial intelligence has supercharged the process: today, AI tools are generating and circulating text, images, and video that are increasingly indistinguishable from human-made content. And if the system will not fix itself, who will?

Andrea Cavallaro, professor at EPFL and head of the Laboratory of Multimodal Intelligent Systems, develops the multimodal systems needed to detect hate speech across text, image, audio, and video. He is also one of the instigators of AlignAI, an ambitious EU-funded doctoral network training seventeen PhD candidates across six universities to embed human values within large language models.

The AlignAI project aims to convey human values to AI systems. What does that mean in practice?

The idea is to provide both an intellectual framework and a practical platform to transfer individual and societal values into learning systems. How can we characterize the values and norms that people consider important? How can we define a process to transfer those values into systems? These are the questions we are tackling. At the moment, AI systems are primarily being designed by people with engineering backgrounds. But we are no longer dealing with systems that measure physical properties: large language models (LLMs) interact with humans, and humans are inherently difficult to characterize. That's why AlignAI is primarily populated by non-technical PhD students: social scientists, cognitive psychologists, philosophers…

Values vary enormously across cultures, individuals, contexts... How do you begin to map them?

AlignAI tests its approach across three use cases: education, mental health, and online news consumption. These are domains where the impact of LLMs is already significant, and where the stakes of getting alignment right are very high. One of our PhD students is working across all three fields, building a conceptualization of values. We are starting with Europe, which is already a very diverse territory, using existing legislation as a starting point, because legislation embodies the values that a society considered important enough to codify. We are collaborating with a judge to get the right angle.

People tend to trust software by default. With LLMs, how warranted is that confidence?

Automation bias is a well-known phenomenon: because something is software, we may assume it deserves our trust. But we have to remember that these tools are authored. Someone decided which data to use, how to train the model, and then how to fine-tune it to limit unsafe behaviors. And what counts as safe or unsafe is generally decided by a team of engineers, who impose their choices and, by extension, their underlying biases. All of this is passed on, from the properties of the dataset to the properties of the learned model. I call this distributed authorship. The important thing is to engage users in becoming not just passive customers, but active auditors, probing edge cases, questioning value biases. AI tools are not the neutral technical tools of the previous century that we could calibrate within known operating conditions. They interact with us, and we shape their behavior with our prompts. By design, they please us in order to increase engagement. That dynamic is entirely new.

What are some of the challenges in detecting hate speech?

Hateful content can be concealed across different modalities: in video frames, on-screen text, audio, or spoken words. Sometimes the meaning only becomes clear when you combine these modalities. We have developed systems that cross-reference all of these simultaneously. But hate speech also evolves: it uses coded language, sarcasm, implicit references. It is a moving target, and it erodes trust in systems, in institutions, in democracies.

Aren't these themes a bit political for a technical school?

Another way to say this is that they are cultural. AI tools we use every day are cultural artifacts: they have been trained on the cultural productions of human beings, primarily of certain cultures, with a large imbalance. They are a compression of digital content produced over decades. Not a dry, neutral piece of software, but a container that absorbs what humanity has created and gives us answers with a fluency that, only a few years ago, we attributed exclusively to highly educated human beings. Recognizing this changes how you design them, how you evaluate them, and how you teach about them. Many of these students will go on to build tools that interface with humans. They need to understand what it means to co-design with the people who will actually use the technology, rather than imposing a techno-solutionist view from above.

How can your work reach the tech giants that produce the world's popular LLMs?

In research, we practice open science. Our findings are readily accessible as open source for any developer to verify and, if they deem it useful, to adopt. Hopefully they will. But what gives me particular hope is something I didn't expect. I've been invited to give talks at very large multinational companies on what it means to embed rights and values in AI tools. I was struck by how much this project resonates, even with very technical audiences I assumed wouldn't necessarily be intrigued. Beyond AlignAI, there are research groups around the world who care about designing tools that support the flourishing of humans, not just tools that maximize engagement.

