Binghamton Univ. Unveils AI Tech to Halt Fake Info

Binghamton University

As chatbots powered by artificial intelligence become more ingrained in our everyday lives, people are increasingly using them to help diagnose their medical concerns.

Should I be worried about this rash? What if this insect bite gets infected? Is this pain the symptom of a larger problem? When dealing with someone's health, the answers need to be as accurate as possible.

Last year, Binghamton University researchers tested OpenAI's ChatGPT, and it showed high accuracy in identifying disease terms, drug names, and genetic information. However, the AI bot also generated a high number of "hallucinations."

A follow-up study funded by a $100,000 grant from New York state's Empire AI Consortium may have found a way to eliminate that confidently delivered but fake information.

Ahmed Abdeen Hamed - a research fellow for the Thomas J. Watson College of Engineering and Applied Science's School of Systems Science and Industrial Engineering - collaborated with George J. Klir Professor of Systems Science Luis M. Rocha to develop an innovative verification method, and the journal STAR Protocols recently published their conclusions.

From plain language to diagnosis

The new protocol harnesses the growing number of open-source AI options, each of which has a different way to arrive at an answer to an inquiry. Hamed and Rocha chose seven of these large language models and forced them to use retrieval-augmented generation (RAG), which required them to reference an authoritative database of medical terminology before giving a response.

Over 10,000 experiments, the seven chatbots all received the same plain-language symptoms, and each of them came up with what it thought were the medical terms for them, complete with an official identification number. Then the bots put the answers up for a "vote."

The result: 76.85% of the answers were supported by at least four LLMs, and the remaining 23.15% were supported by at least two. No unmatched terms - and no hallucinations.
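The voting step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the published protocol: the function name `classify_support`, the tier labels, the model names, and the placeholder identifiers such as `TERM:A` are all assumptions made for the example. The thresholds of four and two come from the results reported above.

```python
from collections import Counter


def classify_support(proposals, majority=4, minimum=2):
    """Group proposed term IDs by how many models agreed on them.

    `proposals` maps a (hypothetical) model name to the term ID that model
    returned, or None if it returned nothing. The thresholds mirror the
    article's "at least four" and "at least two" levels of support.
    """
    # Count how many models proposed each term, ignoring empty answers.
    counts = Counter(term for term in proposals.values() if term is not None)

    tiers = {}
    for term, votes in counts.items():
        if votes >= majority:
            tiers[term] = "majority"   # backed by at least four models
        elif votes >= minimum:
            tiers[term] = "partial"    # backed by at least two models
        else:
            tiers[term] = "unmatched"  # a single model with no agreement
    return tiers


# Hypothetical answers from seven models for one plain-language symptom.
example = {
    "model_1": "TERM:A", "model_2": "TERM:A", "model_3": "TERM:A",
    "model_4": "TERM:A", "model_5": "TERM:B", "model_6": "TERM:B",
    "model_7": None,
}
print(classify_support(example))
```

In this sketch, an answer proposed by only one model would land in the "unmatched" tier, which is the category the researchers report as empty in their experiments.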

"The new workflow is incredible," Hamed said, "because it can verify anything from a biomedical point of view - biological knowledge with disease and genetics, translational knowledge from diseases to treatments and clinical trials, and also from a healthcare point of view with symptoms and treatments."

A big advantage of this new protocol is that it can be reproduced in a near-infinite number of permutations to reinforce its accuracy.

"There can be 100 large language models that are open source, and every time we can perform an experiment with seven LLMs selected at random from that list," Hamed said. "When we perform the experiment many, many times, we increase the confidence in the voting."
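The repeated-sampling idea Hamed describes could look roughly like the following Python sketch. It is an assumption-laden illustration rather than the team's code: `repeated_panel_votes`, the `ask` callable that queries one model, and the default of 100 trials are all hypothetical. Only the panel size of seven and the idea of drawing panels at random from a larger pool come from the quote above.

```python
import random
from collections import Counter


def repeated_panel_votes(pool, ask, question, panel_size=7, trials=100, seed=0):
    """Run many votes, each with a random panel drawn from `pool`.

    `pool` is a list of model names and `ask(model, question)` is a
    hypothetical callable that returns that model's term ID (or None).
    Returns a Counter of how often each term won a panel's vote.
    """
    rng = random.Random(seed)  # seeded so the panel draws are reproducible
    winners = Counter()
    for _ in range(trials):
        # Draw a fresh panel without replacement for this trial.
        panel = rng.sample(pool, panel_size)
        answers = (ask(model, question) for model in panel)
        votes = Counter(term for term in answers if term is not None)
        if not votes:
            continue  # no model on this panel produced an answer
        top_term, _ = votes.most_common(1)[0]
        winners[top_term] += 1
    return winners
```

A term that wins nearly every trial, regardless of which seven models happen to be drawn, is the kind of outcome that would raise confidence in the vote.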

Looking at wider applications

Rocha said the protocol is an important step toward increasing confidence in large multiscale network models of disease, which is a key topic for his Complex Adaptive Systems and Computational Intelligence Lab at Binghamton.

The lab's research includes the development of "digital twins" for precision medicine. These dynamic, virtual replicas of physical processes are continuously updated using AI and real-time data to create precise, predictive simulations of human reactions, so that healthcare providers can optimize outcomes before real-world testing.

"For instance, the protocol can extract and provide multi-agent verification of evidence for an adverse drug reaction for a given medication that is available in clinical trials, the scientific literature, pharmacological databases, and even social media discourse," Rocha said. "And it can assist in the extraction of evidence at multiple scales, from multiomics to epidemiological and behavioral data sources, which we have already started to pilot by building multi-layer models of ER+ breast cancer."

Hamed hailed the input from his collaborator as essential: "The guidance from Professor Rocha was huge, from securing the grant to helping to decide the direction of where this research would go and coaching us to develop the protocols needed to make it all work."

Although the study centered on biomedical applications, the Binghamton team's discovery could be used to curb or eliminate other kinds of LLM hallucinations, such as fabricated legal citations, fake academic citations, or blatant historical errors.

"This protocol is a big step toward the democratization of knowledge verification," Hamed said.

Beyond Binghamton

With this research, Hamed wraps up his fellowship at Binghamton University and transitions to a new role as a research associate professor at the University of Nebraska-Lincoln.

"Dr. Hamed's period in our lab was most productive, not only in the rapid development of AI-driven workflows and publications, but in catalyzing new, creative ideas for all lab members," Rocha said. "I cannot wait to see the amazing new research he will produce at the University of Nebraska-Lincoln."

Hamed is grateful for the opportunities he received at Binghamton.

"Watson College provided an exceptional environment where I could fully develop and implement the forward‑looking research agenda I began during my time in Europe," he said. "The direction I envisioned was still emerging there at the time, and the fellowship offered the right setting to advance it. I'm hopeful that the resulting peer‑reviewed publications can help shift perspectives and demonstrate how GenAI and LLMs can be used responsibly, constructively, and with genuine innovation."

