About a dozen studies in the past five years have made claims linking nearly every type of human cancer with the presence of microbiomes, "communities" of bacteria, viruses and fungi that live in or on people's bodies. Now, scientists at Johns Hopkins Medicine say a study that sequenced human cancers found far fewer microbial DNA sequences than earlier studies reported in the same cancer tissue samples.
"It's the nature of science to validate, confirm and reproduce findings," says Steven Salzberg, Ph.D., Bloomberg Distinguished Professor of Biomedical Engineering, Computer Science, and Biostatistics at The Johns Hopkins University. "Over time, we see a more complete picture of new research, and in this case, we did not find any associations between microbiomes and many types of cancer."
Salzberg says the new study, published Sept. 3 in Science Translational Medicine, surveyed the whole genome sequences generated from 5,734 tissue samples collected from 25 cancer types and stored in a large National Cancer Institute-funded database, The Cancer Genome Atlas (TCGA). About half of the samples are from normal tissues and blood, and the other half are from solid tumors and blood-based cancers.
The TCGA's whole genome sequencing data contains millions of chopped-up pieces of DNA molecules, known as reads, from each tissue sample. The original goal of the TCGA studies was to identify mutations in the DNA sequence of genes that might be associated with various cancer types. Sometimes, though, the original tumors might have microbes in them, and the reads could be used to identify those microbes.
Samples can pick up stray DNA left behind in sequencing machinery or carried in from the air or surfaces, so reads often include DNA from those contaminants as well as from the original tumor tissues. Salzberg says extraordinary efforts were made to identify such contaminants and keep them from producing false results in the study.
To rule out contaminants, Salzberg and his team relied on extensive experience with genomic sequencing and careful analysis of control samples to identify reads belonging to sequences known or highly likely to have contaminated samples.
For the current study, a continuation of one that the Johns Hopkins team published in 2023, Salzberg and first author Yuchen "Peter" Ge, a graduate student in biomedical engineering at Johns Hopkins, removed human DNA sequences from the TCGA data sets by mapping each read against two human reference genomes — one from the Telomere-to-Telomere (T2T) project and another from the Genome Reference Consortium.
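The human-read filter described above can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the function name `remove_human_reads` and the toy read IDs are hypothetical, and the two sets simply stand in for whatever alignment output reports which reads mapped to each human reference. The point it shows is that a read is discarded if it maps to either reference, so only reads that match neither one move on.

```python
def remove_human_reads(read_ids, mapped_to_t2t, mapped_to_grc):
    """Keep only reads that mapped to neither human reference genome.

    read_ids      -- all read identifiers for one sample (hypothetical input)
    mapped_to_t2t -- IDs of reads that aligned to the T2T reference
    mapped_to_grc -- IDs of reads that aligned to the Genome Reference
                     Consortium reference
    """
    # A read counts as human if it hit either reference.
    human = set(mapped_to_t2t) | set(mapped_to_grc)
    # Preserve the original read order for the survivors.
    return [rid for rid in read_ids if rid not in human]


# Toy example with made-up read IDs.
reads = ["r1", "r2", "r3", "r4", "r5"]
t2t_hits = {"r1", "r2"}
grc_hits = {"r2", "r4"}
print(remove_human_reads(reads, t2t_hits, grc_hits))  # ['r3', 'r5']
```

Using two references rather than one means a read that one assembly fails to capture can still be caught by the other, which is why the filter takes the union of the two hit sets.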
After removing human DNA, the research team was left with, on average, 2.4 million reads per sample, or about 0.35% of the total 6.5 billion tumor sample reads. Of these, the research team found 323 million human DNA reads that weren't eliminated in the first pass and 986 million reads they classified as contaminants.
They next compared the remaining sequencing reads against a database containing 50,651 genomes representing 30,355 species of bacteria, viruses, fungi and archaea (single-celled organisms that aren't bacteria or viruses).
After removing human DNA sequences and contaminants, the average proportion of microbial DNA reads was 0.57% in solid tumor samples and 0.73% in blood cancers.
The Johns Hopkins researchers then compared their new results to a study published five years ago in the journal Nature [since retracted, because of concerns about contaminants in the microbial data], and found the previous study identified 56 times as many microbial reads as the new study for half of the total microbial reads. And 5% of the time, the previous study found 9,000 times as many microbial reads as the current Johns Hopkins study. Salzberg says the microbial reads in the retracted study were highly likely to be contaminants.
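One way to picture this kind of comparison is as a per-sample ratio of read counts between two studies, summarized at the midpoint and at the top 5%. The Python sketch below rests on that assumption, since the article does not spell out how the comparison was computed. The function names, the nearest-rank percentile method and the toy counts are all hypothetical and are not taken from either study.

```python
import math


def read_ratios(previous_counts, current_counts):
    """Sorted per-sample ratios of previous-study reads to current-study reads."""
    ratios = []
    for sample_id, prev in previous_counts.items():
        cur = current_counts.get(sample_id, 0)
        if cur > 0:  # skip samples with no current reads to avoid dividing by zero
            ratios.append(prev / cur)
    return sorted(ratios)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


# Made-up microbial read counts for four samples (illustration only).
previous = {"s1": 300, "s2": 5000, "s3": 120, "s4": 40}
current = {"s1": 100, "s2": 10, "s3": 60, "s4": 40}

ratios = read_ratios(previous, current)
print(percentile(ratios, 50))  # 2.0   (half of the samples are at or below this ratio)
print(percentile(ratios, 95))  # 500.0 (the top 5% of samples reach this ratio)
```

Under this reading, a statement such as "56 times as many reads for half" corresponds to the 50th-percentile ratio, and "9,000 times, 5% of the time" corresponds to the 95th-percentile ratio.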
"This disparity in the number of microbial reads didn't occur in just a few samples," says Salzberg. "Over the whole study, the previous researchers found far more microbial reads than we did."
In another comparison of a study published in Cell in 2022 and the current Johns Hopkins work, the 2022 study reported fungal DNA amounts that were hundreds of times more than what was found in the current Johns Hopkins study, largely due to contaminants.
In the samples where the current Johns Hopkins study did find microbial DNA, the researchers identified microbes that have long been linked with human cancer, such as HPV (linked with cervical and some head and neck cancers), Helicobacter pylori (linked to stomach cancer), and Fusobacterium nucleatum and Bacteroides fragilis (linked with GI cancers).
The current Johns Hopkins study and the previous ones published in Cell and Nature reported microbiomes of Saccharomyces cerevisiae, commonly known as baker's yeast. "It's one of the most common contaminants in sequencing labs," says Salzberg. They also found a virus that infects plant fungi, Rosellinia necatrix partitivirus 8, which has no known link to human disease.
Salzberg said the need to carefully document claims about the links between cancer and microbiomes is "especially important" as efforts ramp up to diagnose cancers early using microbiome information.
The Johns Hopkins researchers have made their sequencing analysis data available to other scientists in the supplementary materials in Science Translational Medicine and online.
Other scientists who contributed to the work are Jennifer Lu, Daniela Puiu and Mahler Revsine from Johns Hopkins.