By harnessing the power of machine learning, researchers have constructed a framework for analyzing what factors most significantly contribute to a species' genetic diversity.
The study, recently published in the journal Molecular Phylogenetics and Evolution, suggests that genetic variation in two amphibian species native to northeastern Brazil, the Brazilian sibilator frog and the granular toad, was shaped by different processes.
Results showed that the genetic variation in the sibilator frog was shaped mostly by population demographic events in response to habitat changes that occurred over the last 100,000 years. In contrast, genetic diversity in the granular toad was mostly shaped by contemporary landscape factors: toads that are relatively more isolated, either by geographic distance or inhospitable habitat, were more likely to be genetically different.
While previous investigations have explored the effects of historical demographic and landscape factors on the genetic diversity of these amphibians, they were conducted with separate sets of data for each factor, making it difficult to discern which was more important. Now, researchers involved with this paper are the first to use artificial intelligence to give both processes equal consideration in shaping genetic diversity, rather than making manual assumptions about which may have been more vital.
"Prior to this work, we had to ask questions independently because you couldn't investigate both influences in the same framework," said Bryan Carstens, co-author of the study and a professor in evolution, ecology and organismal biology at The Ohio State University. "What AI allows us to do is to simulate processes that are both happening ecologically in the present and during deep-time evolutionary events and compare those findings to the actual data that we collect from these frogs."
Because of the sheer amount of data that has become available to geneticists and other wildlife biologists over the past few decades, it can be challenging for researchers to identify specific factors that might be important in certain experiments, said Carstens. But by integrating large swaths of information into simulations that can account for those elements in a single analysis, it's possible to get a much more complete chronicle of a species' development.
"It takes a long time to build and train our AI models, but we wanted one that could capture the range of potential variation in the species' histories in a way that was as faithful as we could be to what we knew about the biology of the system," said Carstens.
For example, while the species this study investigated dwell in the same region, there are many differences in their natural histories. Although the eggs and larvae of both species are fully aquatic, the sibilator frog reproduces continuously throughout the wet season and in underground chambers, while the granular toad's reproductive events happen explosively because they are dependent on heavy rainfall.
Combining simulations with their machine learning approach, the researchers determined that their model scenarios were 100% supported regarding historical explanations for the sibilator frog's expansion, and over 99% supported for those of the granular toad.
One reason their model is so accurate is its ability to account for recent demographic events, including measuring how events like human development or habitat change may have affected animal genetic diversity over a long period of time.
But even when using AI, researchers have to be careful to avoid deceptive patterns in their results, said Carstens.
"No analysis that we do is going to capture every single factor that has been important to these species over millions of years," he said. "So we have to allow for a range of possibilities without making it so broad that essentially any model would be able to fit the data."
That said, as technological strides allow researchers to answer niche ecological questions and test new hypotheses, their work is a precursor to creating an upgraded machine learning framework that could be applied to unique investigations of other species, said Carstens.
"We're likely to continue using different combinations of these AI tools in different ways to try to understand evolutionary history," said Carstens. "And as we keep learning, the tools we're using will change, and they'll evolve to be even better."
Emanuel M. Fonseca, who earned his doctorate from Ohio State in 2022, was a co-author. The study was supported by the Ohio Supercomputer Center, the U.S. National Science Foundation and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior in Brazil.