Fairness Of Algorithms

Eindhoven University of Technology

Important decisions are increasingly being made not by humans, but by algorithms. They determine, for example, who is flagged as a fraudster, or which job applicant receives an invitation for an interview. But these systems are far from objective. How do we ensure that they're fair and don't cause discrimination? That question is central to the research of Hilde Weerts, who completed her PhD with honors at TU/e this week.

Source: Cursor / Martina Silbrníková

Although self-learning models offer advantages over human decision-making - such as efficiency and consistency - they're not neutral, says Hilde Weerts, an AI engineer and PhD researcher at the Department of Mathematics & Computer Science. "When developing algorithms, many choices are made by humans. This can lead to bias and discrimination against certain groups. Moreover, these systems are trained on real data, which often already contains bias. That bias is then copied by the algorithm."

Discrimination

There are numerous examples of discriminatory algorithms. For example, Amazon developed a model that selected male resumes more often than female ones.

"That was because the training materials contained mostly men's resumes because there were already many more men working in the company," Weerts explains. "The model used that to predict who was most likely to be hired." And that touched on an important problem: algorithms learn from historical data, but if that history was unfair, then you perpetuate the inequality in the system.

A solution is less obvious than we might think. "Machine learning systems are built to make the best possible prediction based on their inputs. To that end, they actively look for relationships between variables. Even if you remove information such as gender from the data, they can still find the underlying relationships via a detour," Weerts says. "In this case, for example, the model recognized the word 'women' on a resume - because the applicant was a member of a 'women's chess club' - and indirectly used that as a gender indicator."
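To make that concrete, here is a minimal synthetic sketch (made-up data and numbers, not Amazon's actual system) of how such a detour can work. The model never sees gender as a feature, but a correlated indicator - a hypothetical "club membership" flag standing in for the chess-club mention - picks up the signal anyway and recreates the gap in the predictions.

```python
# Synthetic illustration of a proxy variable: gender is NOT a model feature,
# but a correlated indicator leaks it back in. All data here is made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

gender = rng.integers(0, 2, n)                      # 0 = man, 1 = woman (never shown to the model)
experience = rng.normal(5.0, 2.0, n)                # a legitimate qualification signal
# Hypothetical proxy, e.g. a "women's chess club" mention: common for women, rare for men.
proxy = (rng.random(n) < np.where(gender == 1, 0.7, 0.05)).astype(int)

# Historically biased hiring labels: equally experienced women were hired less often.
hired = (experience - 1.5 * gender + rng.normal(0.0, 1.0, n) > 4.5).astype(int)

X = np.column_stack([experience, proxy])            # note: gender itself is excluded
model = LogisticRegression().fit(X, hired)
pred = model.predict(X)

print("selection rate, men:  ", round(pred[gender == 0].mean(), 2))
print("selection rate, women:", round(pred[gender == 1].mean(), 2))
print("learned weight on the proxy:", round(model.coef_[0][1], 2))  # negative: the proxy stands in for gender
```

Even with gender dropped from the inputs, the proxy receives a negative weight and the selection rates for men and women diverge - exactly the kind of detour Weerts describes.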

Interdisciplinary approach

There's plenty of literature on how to remove bias from models. However, many of these methods are not effective and, according to Weerts, miss the mark.

"For example, there was an algorithm that flipped a coin, so to speak, for certain decisions. This means that the model does produce equal outcomes, but mainly because it works equally poorly for all groups - and that doesn't make it fairer."

To understand what fairness really means, Weerts delved into philosophy. "I wanted to know what arguments exist for what counts as fair and what doesn't."

That was quite a step outside her comfort zone as a computer scientist, she admits with a laugh. But it's precisely this broad perspective and interdisciplinary approach that she believes is essential to properly tackle such a complex issue.

In her research, she looks at existing technological solutions from different perspectives - from philosophy and law to social sciences - and tries to identify which methods make sense, and what important questions have remained unanswered so far.

"Many of the existing methods are still little used in practice, because there's a lot of doubt about their effectiveness and about when you should use which method," she explains. She explored how we can apply different methods in such a way that they really make the world a fairer place.

Metrics

To assess how fair an algorithm is, quantitative 'fairness metrics' are used - measurement tools that put into numbers whether the system favors or disadvantages certain groups. There are several ways to go about this.

"With group fairness metrics, for example, you look at the selection percentage of men and women in the predictions, and what the difference is between them. The closer those are to each other, the fairer the model, is the idea," Weerts explains. "But in some cases it's actually important for the false-negative rate to be equal: that is, the probability that someone isn't selected even though they were suitable for the job." So the crucial question is: why choose one measurement tool and not another?

Correcting inequality

"The first thing you actually have to do is carefully examine why there's a difference in your dataset," says Weerts.

"Take Amazon's selection model: it selected men more often than women, simply because more men had been hired in the past. But what's the exact cause of that? Did more men apply, perhaps? Was there unconscious bias in the selection? Or are there social causes, such as the fact that more men with the right qualifications were available for that job?"

According to Weerts, these are fundamental questions. "There could be many different reasons for the inequality in your data. You have to investigate those first, before you can determine whether - and how - to correct that inequality. You can't correct something until you understand its root cause."

Context-dependent

The legal approach also yielded important insights. "Judges have to assess in practice whether something is discriminatory, which can provide concrete guidance," says Weerts.

She examined how discrimination law is structured and what requirements practitioners - people who develop and use such models - must meet. She also studied case law, i.e., how judges have ruled in previous cases.

"Computer scientists prefer to know exactly which measurement tool to use and what value is 'good enough.' But that's not how it works when it comes to the law," Weerts explains. "It concerns questions like: why was a particular method chosen? Could it have been fairer with a different approach? Is there an objective justification for the difference we see? There are guidelines, but in general it's highly context-dependent and it's much more about logical reasoning than about numbers."

Critical thinking

As a result, one of her main conclusions is that we need to assess situations on a case-by-case basis and think very carefully about the choices we make. It's also essential to consider what problems we want to solve with machine learning in the first place.

"Take the algorithms the Education Executive Agency (DUO) had in place to detect whether students were unfairly receiving a grant intended for those living on their own," she says.

That model included factors such as education level and the distance between a student's home address and that of their parents. Students who were pursuing secondary vocational education or living close to their parents were more likely to be selected for checks. But those are precisely the characteristics that are relatively common among students with a non-Western migration background. The result: this group was checked significantly more often - which indirectly led to discrimination.

"The way the world works, how data is collected, and what choices people make when training a model all affect the outcome. Those who want to achieve fairer and better results must take all these factors into account - and that requires diligence and critical thinking," she concludes.

So fair algorithms don't start with technology, but with insight, reflection, and a willingness to ask tough questions - and that may be the biggest challenge of all.
