Inside the diseased cell, the genes are in chaos. Some are receiving signals to overproduce a protein. Others are reducing activity to abnormal levels. Up is down and down is up.
The right molecule could restore order, reversing dysregulation in specific genes. But finding the ideal compound could require examining millions of chemicals for their influence on hundreds or thousands of genes.
An MSU-led team of researchers has demonstrated a better way. Using machine learning trained on enormous amounts of published data, they were able to predict how chemicals will influence gene expression, based solely on the structure of the chemical.
Their study, recently published in the journal Cell, discovered compounds that are promising for treatment of two difficult diseases: the most aggressive form of liver cancer and a chronic lung disease with no curative options.
With implications for faster drug discovery, the findings result from years of work across multiple disciplines and institutes, said one of the senior authors, Bin Chen, associate professor at the College of Human Medicine in the departments of Pediatrics and Human Development and Pharmacology and Toxicology.
"So many people worked on this concept. We have over 20 researchers involved, and it's been a long journey," said Chen, PhD, whose research focuses on developing computational methods and tools for drug discovery in collaboration with computer scientists, bench scientists and clinicians.
That interdisciplinary approach was key to this project. It began by training a "Gene expression profile Predictor on chemical Structures," or GPS, on millions of experimental measurements. Chen collaborated on this phase with another senior author, Jiayu Zhou, PhD, formerly at MSU and now at the University of Michigan.
Chen compared the process to training a neural network to classify an image as a person, a cat or a dog.
"In our approach, instead of looking at cats or dogs, we want to know whether the compound is either going to regulate up or down the expression of a specific gene," Chen said. "It's still a classification problem, but more biologically driven."
"But biological data are rarely clean," said Zhou. "Imagine trying to learn from a huge pile of examples where some are clear, some are fuzzy, and some may even be misleading. Our approach helps the model separate stronger signals from weaker ones, so it can learn from the data without being thrown off by all the noise."
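The classification framing Chen and Zhou describe can be illustrated with a small sketch. This is not the team's GPS model: it is a plain logistic regression written for illustration, and the four-bit structure "fingerprints," the up/down labels and the per-measurement confidence weights are all made-up assumptions. It shows one simple way a model can lean on strong measurements and discount a misleading one when predicting whether a compound pushes a single gene up or down.

```python
import math

def train_weighted_logistic(X, y, w, epochs=500, lr=0.5):
    """Fit logistic regression by gradient descent, scaling each
    example's contribution by its confidence weight."""
    n_features = len(X[0])
    coef = [0.0] * n_features
    bias = 0.0
    total_w = sum(w)
    for _ in range(epochs):
        grad_c = [0.0] * n_features
        grad_b = 0.0
        for xi, yi, wi in zip(X, y, w):
            z = bias + sum(c * x for c, x in zip(coef, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = wi * (p - yi)  # low-confidence examples pull less
            for j, x in enumerate(xi):
                grad_c[j] += err * x
            grad_b += err
        coef = [c - lr * g / total_w for c, g in zip(coef, grad_c)]
        bias -= lr * grad_b / total_w
    return coef, bias

def predict_up_probability(coef, bias, x):
    """Probability that a compound up-regulates the gene."""
    z = bias + sum(c * xj for c, xj in zip(coef, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data for ONE gene. Each row is a toy binary fingerprint
# of a chemical structure; label 1 = up-regulated, 0 = down-regulated.
fingerprints = [
    [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1],   # measured "up"
    [0, 1, 1, 0], [0, 0, 1, 1], [0, 1, 0, 1],   # measured "down"
    [1, 0, 1, 0],                               # noisy repeat of row 1
]
labels = [1, 1, 1, 0, 0, 0, 0]
# Assumed confidence in each measurement; the contradictory repeat
# is treated as a weak signal rather than discarded.
confidence = [1.0, 1.0, 0.8, 1.0, 1.0, 0.8, 0.1]

coef, bias = train_weighted_logistic(fingerprints, labels, confidence)
p_up = predict_up_probability(coef, bias, [1, 0, 1, 0])
p_down = predict_up_probability(coef, bias, [0, 1, 1, 0])
print(f"P(up) for compound A: {p_up:.2f}")
print(f"P(up) for compound B: {p_down:.2f}")
```

In this toy setup the first compound is still predicted to raise expression even though one measurement says otherwise, because that measurement carries little weight. A real system would need far richer structure representations and one prediction per gene across hundreds or thousands of genes.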
After evaluating the data for theoretical application to multiple diseases, the team chose two for real-world testing. Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related death worldwide. Idiopathic pulmonary fibrosis (IPF) is a chronic lung disease with a median survival of three years after diagnosis.
Both diseases need new therapeutics, and both hold intense interest for the researchers.
Chen previously found that a deworming pill might be used to treat HCC. He and contributing authors Samuel So, MD, the Lui Hac Minh Professor, and Mei-Sze Chua, PhD, senior research scientist, at the Asian Liver Center at Stanford University, have long collaborated toward the goal of developing a compound that benefits HCC patients.
"Our previous efforts were limited to repurposing FDA-approved drugs," Chua said. "This new approach greatly expands the pool of novel compounds with potential therapeutic activity in HCC."
"With the incidence of HCC continuing to increase in the USA, novel and more efficacious compounds that can target the molecular heterogeneity of HCC directly addresses an unmet clinical need," said So.
Another senior author from MSU, Xiaopeng Li, PhD, associate professor in the Department of Pediatrics and Human Development in the College of Human Medicine, has focused his research on lung diseases such as IPF.
"We know this disease is hard to tackle," Li said. "There have been so many failures to identify new drugs in the last 20 years. And I think the AI component helped us to probe the problem differently and more systemically."
Discovering compounds in theory is one thing. They still must be validated in the real world, said Edmund Ellsworth, PhD, director of the MSU Medicinal Chemistry Facility and a professor in the Department of Pharmacology and Toxicology.
As contributors to the study, Ellsworth and his team were responsible for creating compounds related to those discovered by the platform and optimizing them into safe and effective drugs. This critical step represents just the beginning of a complex process, he said.
"To move forward, it must be recognized that drug discovery is a team sport, and not for the faint of heart," Ellsworth said. "It's complicated, all sorts of things happen, and you need the diversity of experts to overcome and be successful."
The compounds were tested on cell lines in the lab to confirm their influence on genes and to identify leading candidates for testing in living organisms.
When anti-HCC compounds were tested on mice, the team found two new compounds that reduced tumor size. For IPF, the team identified one repurposed drug and two new compounds that showed promise.
Testing compounds for IPF also started with mice, but expanded to samples of human lung tissue, thanks to a clinical-research collaboration with Corewell Health's lung transplant program, located in Grand Rapids.
The program is the busiest in Michigan. And because pulmonary fibrosis is the leading indication for lung transplants, the program had ample explants to share with researchers for testing as live cultures, said pulmonologist Reda Girgis, MD, medical director of the transplant program and a study contributor.
Girgis, who also is a professor in the College of Human Medicine, said the study illustrates the advancement possible through collaboration between Corewell and MSU.
"I think this is the best way to advance medical knowledge, for clinicians to work side by side with biologists, and now, computational people," Girgis said. "That is really key to advance research."
The team has shared its code and developed a web portal for researchers to use GPS for virtual compound screening.
"It's like a paradigm shift approach for people to drive discovery," Chen said. "And I want more people to test this approach. But most importantly, I want people really to be able to use it to discover new therapeutics."
Li shared that ambition.
"I think it already has been proved that this platform can be applied to two totally different diseases," he said. "So this platform can be used for other diseases, to just unleash the potential."
The research was supported by the National Institutes of Health, the National Science Foundation, a Michigan State University Strategic Partnership Grant, the Corewell Health-Michigan State University Alliance Corporation, the CJ Huang and Ha Lin Yip Foundation to the Asian Liver Center at Stanford University, and the Lui Hac Minh Foundation for Liver Cancer Research.