New York, NY [September 4, 2025]—A team of researchers at the Icahn School of Medicine at Mount Sinai has developed a new method to identify and reduce biases in datasets used to train machine-learning algorithms—addressing a critical issue that can affect diagnostic accuracy and treatment decisions. The findings were published in the September 4 online issue of the Journal of Medical Internet Research [DOI: 10.2196/71757].
To tackle the problem, the investigators developed AEquity, a tool that helps detect and correct bias in health care datasets before they are used to train artificial intelligence (AI) and machine-learning models. They tested AEquity on several types of health data, including medical images, patient records, and a major public health survey, the National Health and Nutrition Examination Survey, using a variety of machine-learning models. The tool identified both well-known and previously overlooked biases across these datasets.
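To make the underlying idea concrete, the sketch below is a minimal, hypothetical Python example, not the AEquity code or its interface, of the kind of "subgroup learnability" check the paper's title describes: fit a simple model on growing samples from each demographic subgroup and compare how quickly performance improves. All data and names here are invented for illustration.

# Illustrative sketch only: NOT AEquity itself, just the general idea of comparing
# how quickly a simple model learns each subgroup from growing samples of data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_subgroup(n, noise):
    """Synthetic lab-style features and a binary outcome; higher `noise` makes a group harder to learn."""
    X = rng.normal(size=(n, 5))
    y = (X[:, 0] + noise * rng.normal(size=n) > 0).astype(int)
    return X, y

# Hypothetical dataset: group B is smaller and noisier (under-represented and harder to fit).
groups = {"A": make_subgroup(4000, 0.3), "B": make_subgroup(400, 1.0)}

for name, (X, y) in groups.items():
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    aucs = []
    for frac in (0.1, 0.25, 0.5, 1.0):  # growing training subsets trace out a learning curve
        n = max(20, int(frac * len(X_tr)))
        clf = LogisticRegression().fit(X_tr[:n], y_tr[:n])
        aucs.append(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    print(name, [round(a, 2) for a in aucs])

# A subgroup whose curve plateaus lower or rises more slowly flags a potential dataset bias
# to investigate, for example by collecting more data for that group before training the model.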
AI tools are increasingly used in health care to support decisions, ranging from diagnosis to cost prediction. But these tools are only as accurate as the data used to train them. Some demographic groups may not be proportionately represented in a dataset. In addition, many conditions may present differently or be overdiagnosed across groups, the investigators say. Machine-learning systems trained on such data can perpetuate and amplify these inaccuracies, creating a feedback loop of suboptimal care, including missed diagnoses and unintended outcomes.
"Our goal was to create a practical tool that could help developers and health systems identify whether bias exists in their data—and then take steps to mitigate it," says first author Faris Gulamali, MD. "We want to help ensure these tools work well for everyone, not just the groups most represented in the data."
The research team reported that AEquity is adaptable to a wide range of machine-learning models, from simpler approaches to advanced systems such as those powering large language models. It can be applied to datasets both small and complex, and it can assess not only the input data, such as lab results or medical images, but also the outputs, including predicted diagnoses and risk scores.
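As a similarly hedged illustration of auditing model outputs rather than inputs, the hypothetical sketch below compares predicted risk scores against observed outcomes within each subgroup; the column names and numbers are invented for the example and are not drawn from the study or from AEquity's actual interface.

# Illustrative sketch only (not AEquity's API): auditing a model's outputs by comparing
# predicted risk scores with observed outcomes within each subgroup.
import pandas as pd
from sklearn.metrics import roc_auc_score

def audit_outputs(df, group_col="group", score_col="risk_score", label_col="outcome"):
    """Per-subgroup discrimination (AUC) and calibration-in-the-large (mean score minus event rate)."""
    rows = []
    for g, sub in df.groupby(group_col):
        rows.append({
            "group": g,
            "n": len(sub),
            "auc": roc_auc_score(sub[label_col], sub[score_col]),
            "calibration_gap": sub[score_col].mean() - sub[label_col].mean(),
        })
    return pd.DataFrame(rows)

# Made-up example: scores track outcomes well for group A but poorly for group B.
df = pd.DataFrame({
    "group":      ["A"] * 6 + ["B"] * 6,
    "risk_score": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 0.4, 0.5, 0.5, 0.5, 0.6, 0.6],
    "outcome":    [0,   0,   0,   1,   1,   1,   1,   0,   1,   0,   0,   1],
})
print(audit_outputs(df))

# Large gaps in AUC or calibration between subgroups point to biased outputs -- the kind of
# signal a developer or regulator could review during an audit before deployment.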
The study's results further suggest that AEquity could be valuable for developers, researchers, and regulators alike. It may be used during algorithm development, in audits before deployment, or as part of broader efforts to improve fairness in health care AI.
"Tools like AEquity are an important step toward building more equitable AI systems, but they're only part of the solution," says senior corresponding author Girish N. Nadkarni, MD, MPH , Chair of the Windreich Department of Artificial Intelligence and Human Health , Director of the Hasso Plattner Institute for Digital Health , and the Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai, and the Chief AI Officer of the Mount Sinai Health System. "If we want these technologies to truly serve all patients, we need to pair technical advances with broader changes in how data is collected, interpreted, and applied in health care. The foundation matters, and it starts with the data."
"This research reflects a vital evolution in how we think about AI in health care—not just as a decision-making tool, but as an engine that improves health across the many communities we serve," says David L. Reich MD , Chief Clinical Officer of the Mount Sinai Health System and President of The Mount Sinai Hospital. "By identifying and correcting inherent bias at the dataset level, we're addressing the root of the problem before it impacts patient care. This is how we build broader community trust in AI and ensure that resulting innovations improve outcomes for all patients, not just those best represented in the data. It's a critical step in becoming a learning health system that continuously refines and adapts to improve health for all."
The paper is titled "Detecting, Characterizing, and Mitigating Implicit and Explicit Racial Biases in Health Care Datasets With Subgroup Learnability: Algorithm Development and Validation Study."
The study's authors, as listed in the journal, are Faris Gulamali, Ashwin Shreekant Sawant, Lora Liharska, Carol Horowitz, Lili Chan, Patricia Kovatch, Ira Hofer, Karandeep Singh, Lynne Richardson, Emmanuel Mensah, Alexander Charney, David Reich, Jianying Hu, and Girish Nadkarni.
The study was funded by the National Center for Advancing Translational Sciences and the National Institutes of Health.
For more Mount Sinai artificial intelligence news, visit: https://icahn.mssm.edu/about/artificial-intelligence