AI Accurately Classifies Pancreatic Cysts

American College of Surgeons

Key Takeaways

  • MRI and CT scans of nearly 1,000 adults with pancreatic cysts were evaluated both by ChatGPT-4 and by the traditional manual approach.

  • The accuracy of AI was equivalent to human performance in identifying and classifying nine clinical variables used to monitor pancreatic cyst progression.

CHICAGO — Artificial intelligence (AI) models such as ChatGPT are designed to rapidly process data. Using the ChatGPT-4 AI platform to extract and analyze specific data points from the magnetic resonance imaging (MRI) and computed tomography (CT) scans of patients with pancreatic cysts, researchers found near-perfect accuracy when compared directly against the manual approach of chart review performed by radiologists, according to a study published in the Journal of the American College of Surgeons (JACS).

"ChatGPT-4 is a much more efficient approach, is cost effective, and allows researchers to focus on data analysis and quality assurance rather than the process of reviewing chart after chart," said study coauthor Kevin C. Soares, MD, MS, a hepatopancreatobiliary cancer surgeon at Memorial Sloan Kettering Cancer Center in New York City. "Our study established that this AI approach was essentially equally as accurate as the manual approach, which is the gold standard."

Using an existing database of nearly 1,000 adult patients with pancreatic lesions under surveillance between 2010 and 2024 at Memorial Sloan Kettering Cancer Center in New York City, ChatGPT-4 was deployed to identify nine clinical variables used to monitor cyst progression: cyst size, main pancreatic duct size, number of lesions, main pancreatic duct dilation, branch duct dilation, presence of solid component, calcific lesion, pancreatic atrophy, and pancreatitis. Pancreatic cysts are common and require ongoing surveillance because some develop into cancer and require surgery.

Researchers evaluated ChatGPT-4's ability to identify and classify these nine factors associated with increased risk for dysplasia and cancer. A manually annotated institutional cyst database was used as the standard for comparison.
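
For readers curious how this kind of comparison can be scored, here is a minimal sketch in Python. It is not the study's code: the release does not describe how accuracy was calculated, so the exact-match rule, the function name `per_variable_accuracy`, the field names, and the four toy records are all assumptions made up for illustration.

```python
from typing import Dict, List


def per_variable_accuracy(
    extracted: List[Dict[str, object]],
    reference: List[Dict[str, object]],
) -> Dict[str, float]:
    """Return, for each variable, the share of records where the
    AI-extracted value exactly matches the manual annotation.

    Assumption: both lists are aligned, one dict per scan report.
    """
    if len(extracted) != len(reference):
        raise ValueError("extracted and reference must have the same length")

    matches: Dict[str, int] = {}
    totals: Dict[str, int] = {}
    for ext, ref in zip(extracted, reference):
        for variable, ref_value in ref.items():
            totals[variable] = totals.get(variable, 0) + 1
            # A missing or different extracted value counts as a miss.
            if ext.get(variable) == ref_value:
                matches[variable] = matches.get(variable, 0) + 1

    return {v: matches.get(v, 0) / totals[v] for v in totals}


# Hypothetical toy records (not data from the study).
manual = [
    {"cyst_size_mm": 12, "solid_component": False},
    {"cyst_size_mm": 25, "solid_component": True},
    {"cyst_size_mm": 8, "solid_component": False},
    {"cyst_size_mm": 30, "solid_component": False},
]
ai_extracted = [
    {"cyst_size_mm": 12, "solid_component": False},
    {"cyst_size_mm": 25, "solid_component": True},
    {"cyst_size_mm": 9, "solid_component": False},  # one size mismatch
    {"cyst_size_mm": 30, "solid_component": False},
]

accuracy = per_variable_accuracy(ai_extracted, manual)
for name, value in accuracy.items():
    print(f"{name}: {value:.0%}")
```

In this toy example, one of the four cyst sizes differs from the manual annotation, so that variable scores lower than the solid-component flag. A real evaluation might allow a tolerance on numeric measurements rather than requiring an exact match.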

Key Findings

  • The study involved 3,198 unique MRI and CT scans from 991 patients under long-term surveillance for premalignant lesions.

  • ChatGPT-4 successfully extracted clinical variables with high accuracy. The accuracy rate ranged from 97% for a solid component, a high-risk variable, to 99% for calcific lesions.

  • Accuracy was 92% for cyst size and 97% for main pancreatic duct size, other high-risk variables that may indicate cancer and require surgical resection, biopsy, or endoscopic ultrasound.

"AI can help us expand medical research and improve patient outcomes," Dr. Soares said. "The question I get asked most often is, 'What is the chance that this cyst is going to develop into cancer?' We now have an efficient way to look at the MRI and CT scans of thousands of patients and give our patients a better answer. This approach goes a long way to reduce anxiety and help patients feel more confident about their treatment decisions."

Although this was a proof-of-concept study, the authors say they would like to use AI to expand the range of research questions they can ask, with the goal of enhancing patient care.

"There is a lot of interest in understanding if AI can predict who is going to develop cancer. It's important to understand who progresses and why, so we have a better chance at tailoring surveillance," Dr. Soares said. "We want to limit the number of patient visits, costs to the health care industry, and ultimately provide a customized, rather than one-size-fits-all approach to surveillance."

The researchers caution that the study used only one AI source, ChatGPT-4, and results are limited to the data that was used. AI can only work with the information that is handed to it. These limitations may reduce the broader applicability of the findings.

Coauthors are Ankur P. Choubey, MD, MPH; Emanuel Eguia, MD, MS; Alexander Hollingsworth, MS; Subrata Chatterjee, PhD; Remo Alessandris, MD; Misha T. Armstrong, MD, MPH; Emily Manin, MD; Lily V. Saadat, MD; Jennifer Flood, MSN; Avijit Chatterjee, PhD; Vinod P. Balachandran, MD, FACS; Jeffrey A. Drebin, MD, PhD, FACS; T. Peter Kingham, MD, FACS; Michael I. D'Angelica, MD, FACS; William R. Jarnagin, MD, FACS; Alice C. Wei, MD, MSc, FACS; Vineet S. Rolston, MD; Mark A. Schattner, MD; and Richard K. G. Do, MD, PhD.

The study is published as an article in press on the JACS website.

Author Disclosures: This work was funded in part by NIH/National Cancer Institute P30 CA008748 Cancer Center Support Grant.

Citation: Large Language Models Enable Accurate Data Extraction and Curation from Radiology Reports for Pancreatic Cyst Surveillance. Journal of the American College of Surgeons. DOI: 10.1097/XCS.0000000000001478
