AI Tool Learns Medical Images With Less Data

University of California - San Diego

A new artificial intelligence (AI) tool could make it much easier—and cheaper—for doctors and researchers to train medical imaging software, even when only a small number of patient scans are available.

The AI tool improves upon a process called medical image segmentation, where every pixel in an image is labeled based on what it represents—cancerous or normal tissue, for example. This process is often performed by a highly trained expert, and deep learning has shown promise in automating this labor-intensive task.

The big challenge is that deep learning-based methods are data hungry—they require a large amount of pixel-by-pixel annotated images to learn, explained Li Zhang, a Ph.D. student in the Department of Electrical and Computer Engineering at the University of California San Diego. Creating such datasets demands significant expert labor, time and money. And for many medical conditions and clinical settings, that level of data simply doesn't exist.

To overcome this limitation, Zhang and a team of researchers led by UC San Diego electrical and computer engineering professor Pengtao Xie have developed an AI tool that can learn image segmentation from just a small number of expert-labeled samples. By doing so, it cuts down the amount of data usually required by up to 20 times. It could potentially lead to faster, more affordable diagnostic tools, especially in hospitals and clinics with limited resources.

The work was published in Nature Communications.

"This project was born from the need to break this bottleneck and make powerful segmentation tools more practical and accessible, especially for scenarios where data are scarce," said Zhang, who is the first author of the study.

The AI tool was tested on a variety of medical image segmentation tasks. It learned to identify skin lesions in dermoscopy images; breast cancer in ultrasound scans; placental vessels in fetoscopic images; polyps in colonoscopy images; and foot ulcers in standard camera photos, just to list several examples. The method was also extended to 3D images, such as those used to map the hippocampus or liver.

In settings where annotated data were extremely limited, the AI tool boosted model performance by 10 to 20% compared to existing approaches. It required 8 to 20 times less real-world training data than standard methods while often matching or outperforming them.

Zhang described how this AI tool could potentially be used to help dermatologists diagnose skin cancer. Instead of gathering and labeling thousands of images, a trained expert in the clinic might only need to annotate 40, for example. The AI tool could then use this small dataset to identify suspicious lesions from a patient's dermoscopy images in real time. "It could help doctors make a faster, more accurate diagnosis," Zhang said.

The system works in stages. First, it learns how to generate synthetic images from segmentation masks, which are essentially color-coded overlays that tell an algorithm which parts of an image are, say, healthy or diseased. Then, it uses that knowledge to create new, artificial image-mask pairs to augment a small dataset of real examples. A segmentation model is trained using both. Through a continuous feedback loop, the system refines the images it creates based on how well they improve the model's learning.

The feedback loop is a big part of what makes this system work so well, noted Zhang. "Rather than treating data generation and segmentation model training as two separate tasks, this system is the first to integrate them together. The segmentation performance itself guides the data generation process. This ensures that the synthetic data are not just realistic, but also specifically tailored to improve the model's segmentation capabilities."
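The staged pipeline described above—learn a mask-conditioned generator, synthesize image-mask pairs to augment the small real dataset, train a segmenter on both, and let validation performance steer the generator—can be sketched at a toy scale. The sketch below is purely illustrative, assuming hypothetical stand-ins (`generate_pair`, `train_segmenter`, `dice`) for the study's deep generative and segmentation models; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_pair(gen_params, mask):
    # Hypothetical generator: synthesize an image conditioned on a mask.
    # Here, just the mask plus noise at a tunable scale.
    return mask + gen_params["noise"] * rng.standard_normal(mask.shape)

def train_segmenter(pairs):
    # Toy "segmentation model": learns a single intensity threshold
    # from the mean brightness of foreground (labeled) pixels.
    imgs = np.stack([im for im, _ in pairs])
    masks = np.stack([m for _, m in pairs])
    thresh = imgs[masks > 0].mean() if (masks > 0).any() else 0.5
    return {"thresh": thresh}

def dice(model, pairs):
    # Dice overlap between thresholded prediction and ground-truth mask.
    scores = []
    for im, m in pairs:
        pred = (im > model["thresh"]).astype(float)
        scores.append(2 * (pred * m).sum() / (pred.sum() + m.sum() + 1e-8))
    return float(np.mean(scores))

# Small "real" dataset: square lesions on a 16x16 grid, with mild noise.
real = []
for _ in range(8):
    m = np.zeros((16, 16))
    m[4:12, 4:12] = 1.0
    real.append((m + 0.1 * rng.standard_normal(m.shape), m))
train_set, val_set = real[:5], real[5:]

# Feedback loop: generator updates are kept only if they improve
# the segmenter's validation score, coupling generation to segmentation.
gen_params = {"noise": 0.5}
best = -1.0
for step in range(10):
    synth = [(generate_pair(gen_params, m), m) for _, m in train_set]
    model = train_segmenter(train_set + synth)
    score = dice(model, val_set)
    if score > best:
        best = score
        gen_params["noise"] *= 0.8  # propose cleaner synthetic images
    else:
        gen_params["noise"] *= 1.1  # revert toward more variation
```

The key design point mirrored here is that the generator's parameters are judged by downstream segmentation accuracy on held-out real data, not by image realism alone.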

Looking ahead, the team plans to make their AI tool smarter and more versatile. The researchers also plan to incorporate feedback from clinicians directly into the training process to make the generated data more relevant for real-world medical use.

Full study: "Generative AI enables medical image segmentation in ultra low-data regimes"

This work was supported by the National Science Foundation (IIS2405974 and IIS2339216) and the National Institutes of Health (R35GM157217 and R21GM154171).
