The more data from cancer research we pool, the better we can search for new treatments. But how can we keep patient data safe when sharing it? Leiden researchers will tackle this challenge in a major European project.
UNCAN-Connect is the largest of the EU-funded initiatives advancing the Mission on Cancer. It brings together 53 partners from 19 countries with a total budget of €30 million over five years. Researchers will build user-friendly tools for sharing data faster while protecting patient information. The ultimate goal is to improve cancer diagnosis and treatment in hospitals everywhere.
Connecting cancer research within Europe
'Until now, databases on cancer research have worked separately,' says Marco Spruit, one of the three Leiden researchers involved in the UNCAN-Connect project. 'Our goal is to federate all existing networks into one big collaborative network.'
In practice, this means data always stay on each hospital's own servers. Instead of moving sensitive patient information around, researchers share only the results of their analyses. This speeds up discoveries while honouring strict privacy rules.
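The idea of sharing results rather than records can be illustrated with a minimal Python sketch. It is not the project's actual software: the function names, the choice of statistic (a pooled mean age), and the numbers are all made up for illustration. Each hospital computes a small summary locally, and only that summary is sent to the coordinator.

```python
from dataclasses import dataclass


@dataclass
class LocalSummary:
    """The only thing that leaves a hospital: a count and a sum."""
    n: int
    total: float


def summarise_locally(ages):
    # Runs on the hospital's own server; raw ages never leave it.
    return LocalSummary(n=len(ages), total=float(sum(ages)))


def pooled_mean(summaries):
    # Runs at the coordinator, which sees summaries only.
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no records in any summary")
    return sum(s.total for s in summaries) / n


# Invented example data for two hypothetical hospitals.
hospital_a = [61, 70, 58]
hospital_b = [49, 66]

summaries = [summarise_locally(hospital_a), summarise_locally(hospital_b)]
overall = pooled_mean(summaries)
print(f"pooled mean age: {overall:.1f}")
```

The pooled mean equals the mean that would be obtained from the combined records, even though the coordinator never sees an individual patient.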
Two key tools from Leiden
Leiden's team will build a data use search engine and an AI observatory. Kiana Shahrasbi, the PhD candidate working on the project, describes the search engine: 'We're creating a tool that reads thousands of medical papers in real time.' By combining traditional language techniques with large language models, the search engine spots details such as patient age, ethnicity, or sample size buried in free-text articles. This helps scientists quickly find the right datasets for their studies.
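As a rough illustration of the 'traditional language techniques' side, the sketch below uses regular expressions to pull a sample size and an age range out of free text. The patterns, the function name, and the example sentence are assumptions made for this illustration; the article does not describe the team's actual rules, and the large-language-model part is not shown.

```python
import re

# Hypothetical patterns: "n = 1,204" and "aged 45 to 70" / "age 45-70".
SAMPLE_SIZE = re.compile(r"\b[nN]\s*=\s*(\d[\d,]*)")
AGE_RANGE = re.compile(r"\baged?\s+(\d{1,3})\s*(?:-|to)\s*(\d{1,3})\b")


def extract_metadata(text):
    """Return whichever of the two details can be found in the text."""
    found = {}
    match = SAMPLE_SIZE.search(text)
    if match:
        found["sample_size"] = int(match.group(1).replace(",", ""))
    match = AGE_RANGE.search(text)
    if match:
        found["age_range"] = (int(match.group(1)), int(match.group(2)))
    return found


# Invented sentence in the style of a paper abstract.
abstract = "We enrolled patients aged 45 to 70 (n = 1,204) across three centres."
metadata = extract_metadata(abstract)
print(metadata)
```

Rules like these are fast but brittle, which is one plausible reason to pair them with language models for details that are phrased less predictably.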
Once an AI model is put into use at a hospital, we need to make sure it still works well. 'The observatory tracks model performance and checks that each patient group has at least ten members, as required by Dutch guidelines,' says Spruit. If a group becomes too small or the model becomes less accurate, for example because incoming data gradually diverge from the data it was trained on, the system issues a warning.
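The checks described here could look roughly like the following Python sketch. Only the minimum group size of ten comes from the article; the accuracy tolerance, the drift measure (shift in the mean, in units of the training standard deviation), the thresholds, and all names are assumptions chosen for illustration.

```python
from statistics import mean, stdev

MIN_GROUP_SIZE = 10       # stated in the article
MAX_ACCURACY_DROP = 0.05  # hypothetical tolerance
MAX_DRIFT = 1.0           # hypothetical threshold, in training std devs


def observatory_warnings(group_sizes, baseline_acc, current_acc,
                         train_values, recent_values):
    """Return a list of human-readable warnings (empty if all is well)."""
    warnings = []
    for group, size in sorted(group_sizes.items()):
        if size < MIN_GROUP_SIZE:
            warnings.append(f"group '{group}' has only {size} members")
    if baseline_acc - current_acc > MAX_ACCURACY_DROP:
        warnings.append("model accuracy has dropped")
    spread = stdev(train_values)
    if spread > 0:
        shift = abs(mean(recent_values) - mean(train_values)) / spread
        if shift > MAX_DRIFT:
            warnings.append("recent data differ from the training data")
    return warnings


# Invented monitoring snapshot for a hypothetical model.
alerts = observatory_warnings(
    group_sizes={"age 40-59": 25, "age 80+": 7},
    baseline_acc=0.90,
    current_acc=0.88,
    train_values=[50, 60, 70],
    recent_values=[52, 61, 69],
)
print(alerts)
```

In this snapshot only the small patient group triggers a warning; accuracy and the input data are still within the assumed tolerances.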
Studying large patient groups accelerates breakthroughs
Pooling data from many centres allows research on larger patient groups. This leads to stronger scientific conclusions and faster breakthroughs. Larger datasets help uncover rare trends, leading to better tests for early detection and more effective, personalised treatments.
'We are able to train AI models without ever moving patient data,' Spruit explains. 'It's privacy by design: models learn from local data and share only what they learn, never the raw data itself.'
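One common way to realise 'models learn from local data and share only what they learn' is federated averaging. The article does not say which method the project uses, so the sketch below is an assumption: a one-parameter model y = w * x trained on invented numbers, with each site returning only its updated weight and its record count.

```python
def local_update(w, data, lr=0.01, epochs=20):
    """Train on one site's data; only the weight and count are returned."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w, len(data)


def federated_average(updates):
    """Combine site weights, weighting each by its number of records."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Invented data for two hypothetical sites; both follow y = 2 * x.
sites = [
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    [(1.0, 2.0), (4.0, 8.0)],
]

w = 0.0
for _ in range(10):
    # Each round: sites train locally, the coordinator averages.
    updates = [local_update(w, data) for data in sites]
    w = federated_average(updates)

print(f"shared weight after 10 rounds: {w:.4f}")
```

The coordinator ends up with a weight close to 2 without ever receiving a single (x, y) pair from either site.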
Helping European doctors deliver better cancer care
The project officially kicks off in October with a two-day workshop, where researchers will refine plans and set milestones. Marco Spruit, Armel Lefebvre, and Kiana Shahrasbi are eager to start working on this international project. 'While efforts here in the Netherlands often centre on national nodes, true progress happens above that: by connecting networks across borders,' Lefebvre says. 'Tackling those cross-country challenges is what motivates me.' Shahrasbi adds: 'With a foundation in computer science and AI, I've always wanted to apply my expertise to improve diagnosis and treatment. This project is precisely the kind of opportunity I've been looking for.'
With open-source code, transparent methods, and a diverse European consortium, the team is confident they can move beyond lab trials into everyday hospital use. By the end of the five years, they hope to have a fully operational platform that helps doctors across Europe deliver better cancer care.