Researchers from EMBL, LUMC, and collaborators reanalysed nearly 6,800 gut microbiome profiles, revealing a consistent microbial signature of colorectal cancer

Summary
- Researchers from EMBL, LUMC, and collaborators reanalysed nearly 6,800 gut microbiome profiles in colorectal cancer patients and controls.
- One of the largest single-disease gut microbiome meta-analyses to date, the new study identified a robust colorectal cancer microbiome signature that was consistent across populations, sequencing methods, and age-of-onset groups.
- A machine-learning classifier could distinguish colorectal cancer from non-cancer microbiomes across datasets.
- The colorectal cancer microbiome signature was linked to lower dietary fibre intake and could be reduced through fibre-focused dietary interventions.
Researchers have long suspected that the gut microbiome - the community of bacteria and other microorganisms living in the intestine - is closely linked to colorectal cancer. In a new study published in Cell Host & Microbe, an international group of researchers from the Mi-EOCRC consortium, spanning Germany, Switzerland, and the Netherlands and including the Zeller and Zimmermann groups at EMBL Heidelberg, has carried out one of the most comprehensive analyses to date of the colorectal cancer-associated gut microbiome.
Many studies have reported differences between the microbiomes of people with colorectal cancer and those without the disease. But because these studies were often small and used different sequencing methods, it has been difficult to determine which microbial changes are truly reproducible. Meta-analyses - studies that aggregate data from multiple independent investigations - have attempted to address these reproducibility and consistency questions.
However, earlier meta-analyses had been based on only a fraction of the many datasets published to date.
Now, by reanalysing data from 27 studies, comprising 6,779 publicly available gut microbiome sequencing profiles, researchers have identified a robust microbial signature associated with colorectal cancer. The study also analysed 906 intestinal tissue samples to compare stool-based microbiome signals with microbes found directly in tumour tissue.
"The strength of this study is its comprehensiveness," said Georg Zeller, Visiting Team Leader at EMBL Heidelberg and Professor at the Leiden University Medical Center (LUMC) . "We combined stool and tissue comparisons, dietary data, taxonomic analysis down to bacterial strains, and functional analysis of virulence factors."
Making microbiome datasets comparable
A key advance of the study is methodological. The researchers developed and applied computational approaches that allowed them to integrate microbiome datasets generated using different sequencing methods at scale. Using these approaches, they could even re-analyse data from populations that were not originally recruited for the study of colorectal cancer.
"The key tool is a machine learning algorithm that is trained to distinguish cancer from non-cancer microbiomes," said Zeller. "It outputs a score of how 'cancer-like' a microbiome is. We can apply this to any existing human gut microbiome dataset, including from dietary intervention studies."
This approach enabled the researchers to establish a colorectal cancer microbiome signature that was not limited to one cohort, geography, sequencing method, or age of diagnosis. Instead, the signature appeared to be a broadly reproducible feature of the disease, including both early-onset and late-onset colorectal cancer.
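To make the idea of a "cancer-like" score that holds up across cohorts more concrete, here is a minimal sketch in Python. It is not the study's algorithm or code: the logistic-regression model, the three toy studies, the three made-up features, and every function name below are assumptions chosen for illustration. It shows the general pattern of training on all studies but one, scoring the held-out study, and measuring how well the score separates cancer from non-cancer samples there.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=300, l2=0.01):
    """Fit L2-regularised logistic regression by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def cancer_like_score(model, x):
    """Return a score in (0, 1); higher means more 'cancer-like'."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def leave_one_study_out(samples):
    """samples: list of (study_id, features, label). Returns AUROC per held-out study."""
    results = {}
    for held_out in sorted({study for study, _, _ in samples}):
        train = [(x, y) for study, x, y in samples if study != held_out]
        test = [(x, y) for study, x, y in samples if study == held_out]
        model = train_logistic([x for x, _ in train], [y for _, y in train])
        scores = [cancer_like_score(model, x) for x, _ in test]
        results[held_out] = auroc(scores, [y for _, y in test])
    return results


def make_toy_samples(seed=0):
    """Hypothetical data: feature 0 enriched in cancer, feature 1 depleted, feature 2 noise."""
    rng = random.Random(seed)
    samples = []
    for study in ("study_A", "study_B", "study_C"):
        for i in range(40):
            label = i % 2  # 1 = cancer, 0 = control
            x = [rng.gauss(1.5 * label, 1.0),
                 rng.gauss(-1.0 * label, 1.0),
                 rng.gauss(0.0, 1.0)]
            samples.append((study, x, label))
    return samples


if __name__ == "__main__":
    for study, auc in leave_one_study_out(make_toy_samples()).items():
        print(study, round(auc, 3))
```

Holding out whole studies, rather than random samples, is what tests whether a signature generalises beyond a single cohort. In practice a library such as scikit-learn (for example its `LeaveOneGroupOut` splitter) would replace the hand-written training loop, and real inputs would be normalised microbial abundance profiles rather than synthetic numbers.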
Tumour microbes mirror stool-based cancer signatures
The team also investigated whether microbes detected in stool samples reflect microbes found directly within colorectal tumours. They found that microbes enriched in tumour tissue were similar to the colorectal cancer signature observed in faecal samples.
Importantly, in tissue samples, cancer-associated microbes could already be detected in early-stage tumours. In stool samples, however, the detection accuracy was somewhat lower in early-stage cancers and in tumours located further upstream in the colon. One possible explanation is that microbes from tumours that are smaller (due to more localised tumour growth in early stages) or further away from the rectum may be more difficult to detect in stool than microbes from tumours that are advanced or closer to the rectum.
"These results suggest that colorectal cancer-associated changes in the microbiome may appear early in disease development and raise the question of how the tumour shapes the microbiome and how the microbes impact the tumour microenvironment through signalling, metabolic, and other interactions," explained Michael Zimmermann , Group Leader at EMBL Heidelberg.
Pre-cancerous adenomas remain difficult to detect
While the colorectal cancer microbiome signature was robust, the researchers found that pre-cancerous adenomas remain difficult to detect in stool microbiome profiles. Adenoma-associated microbial changes were weaker than those seen in colorectal cancer and showed only limited overlap with the cancer-associated signature. Machine-learning classifiers trained to detect colorectal cancer, as well as ones trained specifically to distinguish adenomas from controls, showed variable performance across cohorts.
"This limitation is important for future clinical translation, which the Mi-EOCRC consortium is aiming at," said Michael Zimmermann. "It suggests that more sensitive approaches, larger datasets, or combinations with other measurements may be needed before microbiome-based tools could contribute to the reliable detection of early pre-cancerous lesions."
Dietary fibre linked to reduced cancer-like microbiome score
The researchers also explored how diet relates to the colorectal cancer microbiome signature. They found that a stronger cancer-associated microbiome pattern was linked to lower dietary fibre intake. Conversely, increasing dietary fibre intake in dietary intervention studies was associated with a reduction in the colorectal cancer microbiome signature score.
This finding suggests that diet, and particularly fibre consumption, can influence microbial patterns associated with colorectal cancer, supporting the idea that diet can shape gut microbial communities in ways that may be relevant to cancer risk, progression, or prevention.
Because the machine-learning score can be applied to existing microbiome datasets, including dietary intervention studies, the approach may help researchers better understand how lifestyle factors influence disease-associated microbiome patterns in studies beyond those covered in the current analysis.
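As an illustration of how such a score might be compared before and after a dietary intervention, the sketch below computes the per-participant change in score and a simple sign-flip permutation test on the paired differences. This is an assumption-laden example, not the analysis used in the study: the function names, the choice of test, and the four before/after values are all hypothetical.

```python
import random
from statistics import mean


def paired_score_change(before, after):
    """Per-participant change in score (after minus before)."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    return [a - b for b, a in zip(before, after)]


def sign_flip_p_value(diffs, n_permutations=10000, seed=0):
    """Two-sided sign-flip permutation test for a mean paired difference of zero."""
    rng = random.Random(seed)
    observed = abs(mean(diffs))
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis, each difference is equally likely to be + or -.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical scores for four participants, before and after a fibre intervention.
    before = [0.8, 0.7, 0.6, 0.9]
    after = [0.6, 0.6, 0.5, 0.7]
    diffs = paired_score_change(before, after)
    print("mean change:", mean(diffs))
    print("p-value:", sign_flip_p_value(diffs))
```

A negative mean change would correspond to a less cancer-like score after the intervention. With only a handful of participants, as in this toy input, such a test has very little power, so real analyses would need far larger groups.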
Not all Fusobacteria are the same
The study also took a closer look at Fusobacterium, a group of bacteria repeatedly linked to colorectal cancer. By analysing hundreds of bacterial genomes from this group, the researchers found important differences between Fusobacterium subspecies.
Some subspecies carried different disease-related genes, including virulence factors, and some were more commonly enriched in colorectal cancer samples from particular geographic regions. In particular, Fusobacterium nucleatum subsp. animalis showed consistent colorectal cancer enrichment across continents, while other Fusobacterium species and subspecies showed more geographically heterogeneous patterns, with several of them almost exclusively found in cancer patients from Asia.
This level of resolution is important because bacteria grouped under the same genus can differ substantially in their biology and potential effects on human health.
Open science enables large-scale microbiome discovery
The work provides a comprehensive resource for understanding the microbiome's role in colorectal cancer and lays the groundwork for future studies into microbiome-based detection, risk assessment, and prevention strategies.
The findings could also provide an important foundation for future machine-learning-based tools. By defining a reproducible colorectal cancer-associated microbial signature, the study creates a reference that could be used to train and validate future models for risk assessment, early detection, or personalised prevention research.
However, the researchers emphasise that this is not yet a diagnostic test, but a step towards understanding how microbiome data could eventually support clinical research and decision-making. In comparisons with existing non-invasive screening approaches, microbiome-based classifiers did not yet match the performance of faecal immunochemical tests, and larger studies will be needed to assess whether microbiome data could complement or be combined with current clinical tests.
The study particularly highlights the power of open data and large-scale evidence synthesis in microbiome research. By combining thousands of publicly available microbiome profiles, the researchers were able to identify robust patterns that would have been difficult to detect in individual studies alone.