A team from the Computer Vision Center (CVC) and the University of Barcelona has published the results of a study that evaluates the accuracy of automatic face recognition algorithms, and their bias with respect to gender and skin colour, when tested on real-world data. Although the top solutions exceed 99.9% accuracy, the researchers detected some groups that show higher false positive or false negative rates.
Face recognition is routinely used by both private and governmental organizations around the world. Automatic face recognition can serve legitimate and beneficial purposes (e.g. improving security), but its power and ubiquity also heighten the potential negative impact that unfair methods can have on society (e.g. discrimination against ethnic minorities). Although not sufficient, a necessary condition for the legitimate deployment of face recognition algorithms is equal accuracy for all demographic groups.
With this in mind, researchers from the Human Pose Recovery and Behavior Analysis Group at the Computer Vision Center (CVC) – University of Barcelona (UB), led by Dr. Sergio Escalera, organized a challenge within the European Conference on Computer Vision (ECCV) 2020. The results, recently published in the Computer Vision – ECCV 2020 Workshops proceedings, evaluate the accuracy of the algorithms submitted by the participants on the task of face verification in the presence of other confounding attributes.
The challenge was a success, since “it attracted 151 participants, who made more than 1,800 submissions in total, exceeding our expectations regarding the number of participants and submissions”, explained Sergio Escalera (UB-CVC).
The participants used an unbalanced image dataset, which simulates a real-world scenario where AI-based models are expected to be trained and evaluated on imbalanced data (considerably more white males than dark-skinned females). In total, they worked with 152,917 images from 6,139 identities.
The images were annotated for two protected attributes, gender and skin colour, and five legitimate attributes: age group (0-34, 35-64, 65+), head pose (frontal, other), image source (still image, video frame), whether the person wears glasses, and bounding box size.
The results were very promising. The top winning solutions exceeded 99.9% accuracy while achieving very low scores on the proposed bias metrics, “which can be considered a step toward the development of fairer face recognition methods”, explained Julio C. S. Jacques Jr., researcher at the CVC and at the Universitat Oberta de Catalunya. The analysis of the top 10 teams showed higher false positive rates for females with dark skin tone and for samples where both individuals wear glasses. In contrast, there were higher false negative rates for males with light skin tone and for samples where both individuals are younger than 35 years. It was also found that, in the dataset, individuals younger than 35 years wear glasses less often than older individuals, so the effects of these two attributes are combined. “This was not a surprise as the adopted dataset was not balanced with respect to different demographic attributes. However, it shows that overall accuracy is not enough when the goal is to build fair face recognition methods, and that future works on the topic must take into account accuracy and bias mitigation together”, concluded Julio C. S. Jacques Jr.
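To illustrate why overall accuracy can hide group-level differences, the following is a minimal Python sketch that computes false positive and false negative rates per group for a face verification task. It is not the challenge's bias metric, which is not described here. The function names, the group labels "A" and "B", the similarity scores and the 0.5 threshold are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def group_error_rates(pairs, threshold):
    """Per-group false positive and false negative rates.

    pairs: iterable of (score, same_identity, group), where score is a
    similarity score for an image pair, same_identity is the ground truth,
    and group is a demographic label (all hypothetical in this sketch).
    """
    counts = defaultdict(lambda: {"fp": 0, "neg": 0, "fn": 0, "pos": 0})
    for score, same_identity, group in pairs:
        predicted_same = score >= threshold
        c = counts[group]
        if same_identity:
            c["pos"] += 1
            if not predicted_same:
                c["fn"] += 1  # genuine pair rejected
        else:
            c["neg"] += 1
            if predicted_same:
                c["fp"] += 1  # impostor pair accepted
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
        }
    return rates

def overall_accuracy(pairs, threshold):
    """Fraction of pairs classified correctly, ignoring groups."""
    correct = sum((score >= threshold) == same for score, same, _ in pairs)
    return correct / len(pairs)

# Toy data (invented for illustration, not taken from the challenge).
PAIRS = [
    (0.9, True, "A"), (0.8, True, "A"), (0.2, False, "A"), (0.1, False, "A"),
    (0.9, True, "B"), (0.4, True, "B"), (0.7, False, "B"), (0.1, False, "B"),
]

rates = group_error_rates(PAIRS, threshold=0.5)
print(rates)
print(overall_accuracy(PAIRS, threshold=0.5))
```

In this toy example the overall accuracy is 0.75, yet every error falls in group "B" (false positive and false negative rates of 0.5) while group "A" has none, which is the kind of disparity a single accuracy figure does not reveal.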
Sixta T., Jacques Junior J.C.S., Buch-Cardona P., Vazquez E., Escalera S. (2020) FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition. Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12540. Springer, Cham. DOI: 10.1007/978-3-030-65414-6_32