New AI Model Can Predict Diseases 20 Years Ahead

University of Copenhagen
"We wanted to explore whether it's possible to develop a method that can handle more than 1,000 diseases simultaneously. Our study shows that it is," says Søren Brunak, Professor at the Department of Public Health.

Imagine a digital crystal ball capable of predicting which diseases we will develop and how severely we will be affected. That scenario has moved a step closer to reality thanks to an international team of researchers, including participants from the University of Copenhagen, who have developed a new AI model that can estimate which diseases we are likely to encounter in the future.

Just as ChatGPT can predict the next most probable word in a sentence, the researchers demonstrate in a new study that it is possible to build a generative AI model capable of calculating the next most likely diagnosis among more than 1,000 common diseases. The study has just been published in the prestigious scientific journal Nature.

"Today, we survive many diseases that would previously have been fatal. As we grow older, we face a future in which many people suffer from multiple conditions simultaneously. That's why we need to understand how diseases interact," says Søren Brunak, Professor at the University of Copenhagen's Department of Public Health and one of the researchers behind the study.

AI Maps the Highways of Disease Progression

A method that can handle such a large number of diagnoses simultaneously is a novel development. Until now, researchers and health authorities have typically focused on individual diseases, or on the interaction between a few diagnoses, when projecting future disease trends. However, this approach does not account for multimorbidity, where a single patient suffers from several chronic conditions at once.

"These patients are difficult to manage. What should be treated first? Where in the healthcare system should they go? Multimorbidity is a costly and complex challenge, and that's why we need to map the 'highways' of disease progression - the pathways most commonly followed by patients," says Søren Brunak.

The model has been trained on health data from the UK Biobank. It has learned from the disease trajectories and lifestyles of 400,000 participants and can recognise patterns in how their health evolves over time. This knowledge is used to predict the next likely disease.
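The idea of predicting the next likely disease from earlier ones can be illustrated with a deliberately simple sketch. It is not the researchers' model, which the article describes as a generative AI model comparable to ChatGPT. As a stand-in, the sketch only counts how often one diagnosis directly follows another. The trajectories and function names are invented for illustration and are not UK Biobank data.

```python
from collections import Counter, defaultdict

# Hypothetical toy trajectories: each list is one person's diagnoses in order.
# Invented for illustration only; not taken from the study or the UK Biobank.
trajectories = [
    ["hypertension", "type 2 diabetes", "heart attack"],
    ["hypertension", "type 2 diabetes", "kidney disease"],
    ["hypertension", "heart attack"],
    ["type 2 diabetes", "heart attack"],
]


def fit_next_diagnosis_counts(histories):
    """Count how often each diagnosis directly follows another."""
    counts = defaultdict(Counter)
    for history in histories:
        for current, following in zip(history, history[1:]):
            counts[current][following] += 1
    return counts


def next_diagnosis_probabilities(counts, current):
    """Turn the follow-up counts for one diagnosis into probabilities."""
    total = sum(counts[current].values())
    if total == 0:
        return {}  # this diagnosis was never followed by another in the data
    return {dx: n / total for dx, n in counts[current].items()}


counts = fit_next_diagnosis_counts(trajectories)
probs = next_diagnosis_probabilities(counts, "type 2 diabetes")
most_likely = max(probs, key=probs.get)
print(most_likely, round(probs[most_likely], 2))  # heart attack 0.67
```

A count-based sketch like this looks only at the single most recent diagnosis. The model in the study, by contrast, is described as learning from whole disease trajectories and lifestyles.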

Because some diseases follow more predictable patterns, the model is more accurate in forecasting diagnoses such as heart attacks, certain types of cancer, or sepsis, while conditions like pregnancy complications are more difficult to anticipate.

Towards More Precise and Targeted Treatment

Although the model is best suited for making predictions at the population level, it can still benefit individual patients by providing healthcare professionals with deeper insights into disease progression and interactions. This makes it easier to assess whether a patient is at increased risk and should receive more intensive treatment:

"The idea behind this model is also to project your disease trajectory so that the physician knows how aggressively to treat you from the outset. For some diabetes patients, lifestyle changes may be sufficient initially, while others should begin medication immediately," says Søren Brunak.

The reverse scenario also applies. Because it can be difficult to distinguish high-risk patients from others, many receive treatment that may not be necessary.

"The more we understand about disease progression, the better we can reduce unnecessary overtreatment," says Søren Brunak.

Still a Prototype - But Full of Potential

However, those expecting to see the method in clinical use will need to be patient. Søren Brunak emphasises that the model is still only a prototype:

"We wanted to explore whether it's possible to develop a method that can handle more than 1,000 diseases simultaneously. Our study shows that it is," he says.

To enable the model to predict not only the next disease but also subsequent ones, it must be trained on a larger dataset than the approximately 400,000 participants included in the initial study.

Nevertheless, the researchers are impressed by the accuracy of the model's predictions, says Laust Mortensen, Professor at the Department of Public Health, University of Copenhagen, and Research Professor at the Rockwool Foundation.

"There is great potential in our method. Although it was trained on British data, we have shown using Danish data that it can also be applied with high accuracy in Denmark to predict disease," says Laust Mortensen.

About the Study: How the Researchers Did It

The AI model was trained on health data from 400,000 participants in the UK Biobank, who consented to their data being used for research. Based on these data, the model learned to recognise patterns in participants' lifestyles and more than 1,000 diseases.

Subsequently, the model alone, without any of the underlying data, was transferred to Denmark, where researchers tested the accuracy of its predictions using Danish disease registry data as a control group.

In working with the Danish disease registry data, the researchers operated within a secure supercomputing environment under the Danish Health Data Authority, a public agency under the Danish Health Authority, and followed its security protocols.

The model was therefore developed using British data and tested using Danish data.

The project, funded in part by the Novo Nordisk Foundation, is the result of a collaboration between researchers from the University of Copenhagen, the European Molecular Biology Laboratory, Eberhard Karls University, the Robert Bosch Center for Tumor Diseases, and the German Cancer Research Center.
