In recent years, a growing number of studies have proposed a connection between human cancers and the microbiome, the diverse communities of bacteria, viruses, and fungi that inhabit tissues of the human body. These studies often reported microbial DNA within cancerous tissues, suggesting potential roles in cancer development or progression. However, a comprehensive new analysis led by researchers at Johns Hopkins Medicine challenges these claims, finding far fewer microbial DNA sequences in tumor samples than previously reported. The study, published in Science Translational Medicine, examined thousands of cancer tissue samples using whole genome sequencing data and contamination filtering, and it paints a more cautious and nuanced picture of the cancer-microbiome relationship.
Biomedical engineer and computational biologist Dr. Steven Salzberg of Johns Hopkins University emphasizes the importance of scientific rigor and replication, noting, “It’s the nature of science to validate, confirm and reproduce findings.” Using whole genome sequencing data from The Cancer Genome Atlas (TCGA), his team analyzed 5,734 samples spanning 25 cancer types, covering solid tumors and hematological malignancies in nearly equal proportions, together with matched normal tissue and blood controls. The size of this dataset allowed a detailed, methodical reassessment of how much microbial sequence is actually present in human cancers.
The TCGA data consist of billions of short DNA fragments, known as ‘reads,’ which are the sequences output by next-generation sequencing instruments. The data were originally gathered to identify mutations in human cancer genomes, but they also contain fragments of microbial DNA if any was present in the tumor tissue. Detecting such microbial sequences accurately is complicated by the pervasive problem of contamination: artifacts introduced from laboratory reagents, environmental exposure, and sequencing platform residues. Dr. Salzberg’s team used bioinformatics strategies designed to separate true microbial reads from contaminant noise, a step fundamental to the new study’s reliability.
To accomplish this, the researchers first filtered out human DNA sequences by mapping every read against two comprehensive human reference genomes: one assembled by the Telomere-to-Telomere (T2T) project, which provides an unprecedented level of completeness, and another from the Genome Reference Consortium. In the course of this filtering, millions of reads that might otherwise have been taken as microbial were identified as residual human DNA or contaminating sequences, underscoring how pervasive a challenge contaminants pose in microbial mapping efforts.
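The two-reference filtering idea can be illustrated with a toy sketch. This is not the team's pipeline: the article does not name the alignment tools used, so the example substitutes a simple shared k-mer test for real read mapping, and the function names, the `k` value, the `min_fraction` threshold, and the short sequences are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Pool the k-mers of every reference sequence into one lookup set."""
    index = set()
    for ref in references:
        index |= kmers(ref, k)
    return index


def is_human(read, index, k, min_fraction=0.5):
    """Treat a read as human if enough of its k-mers occur in the index.

    Assumption: a stand-in for real alignment, chosen only for brevity.
    Reads shorter than k have no k-mers and are not called human.
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return False
    shared = sum(1 for km in read_kmers if km in index)
    return shared / len(read_kmers) >= min_fraction


def filter_non_human(reads, ref_a, ref_b, k=8):
    """Keep only reads that match neither of the two references."""
    index = build_index([ref_a, ref_b], k)
    return [r for r in reads if not is_human(r, index, k)]


# Hypothetical miniature "references" and reads, for illustration only.
ref_a = "ACGTTGCAAGGCTTAACCGGTTAGC"
ref_b = "TTTTGGGGCCCCAAAATTGGCCAAT"
reads = [
    "TGCAAGGCTTAACC",  # occurs in ref_a
    "GGGGCCCCAAAATT",  # occurs only in ref_b
    "CATCATCATCATCA",  # occurs in neither reference
]
remaining = filter_non_human(reads, ref_a, ref_b)
```

The second read shows why a second reference can matter in this sketch: it is absent from the first reference, so filtering against that one alone would leave it among the candidate non-human reads.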
Following this extensive cleaning process, approximately 0.35 percent of the total reads per sample remained as non-human, non-contaminant sequences. These were then compared against a vast database encompassing over 50,000 genomes spanning bacteria, viruses, fungi, and archaea. The final analysis found that microbial DNA constituted an average of merely 0.57 percent of total reads in solid tumor samples and 0.73 percent in blood cancers, percentages drastically lower than those reported by earlier investigations.
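The percentages above are averages of per-sample read fractions. A minimal sketch of that arithmetic follows, assuming each sample is summarized by a total read count and a microbial read count; the field names and all counts are invented for illustration and are not data from the study.

```python
from statistics import mean


def microbial_percent(microbial_reads, total_reads):
    """Percentage of a sample's reads classified as microbial."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * microbial_reads / total_reads


def mean_percent_by_group(samples):
    """Average the per-sample percentages within each sample group."""
    by_group = {}
    for s in samples:
        pct = microbial_percent(s["microbial_reads"], s["total_reads"])
        by_group.setdefault(s["group"], []).append(pct)
    return {group: mean(values) for group, values in by_group.items()}


# Invented counts, chosen only to make the arithmetic easy to follow.
samples = [
    {"group": "solid", "total_reads": 1_000_000, "microbial_reads": 5_000},
    {"group": "solid", "total_reads": 1_000_000, "microbial_reads": 7_000},
    {"group": "blood", "total_reads": 1_000_000, "microbial_reads": 8_000},
]
averages = mean_percent_by_group(samples)
```

Averaging per-sample percentages, as done here, weights every sample equally regardless of sequencing depth. Pooling all reads before dividing would weight deeper samples more heavily, and the article does not say which convention the study used.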
The study includes a direct comparison with a widely cited paper published in Nature five years earlier, which has since been retracted over concerns about contamination. That paper reported far higher microbial read counts, with some samples containing up to 9,000 times more microbial sequences than the Johns Hopkins team observed. Discrepancies of this size show how badly contamination can distort sequencing analyses and underline the need for meticulous analytical standards.
Similarly, a 2022 study published in Cell reported large quantities of fungal DNA across cancer samples, at levels hundreds of times greater than those found by the Johns Hopkins group. Dr. Salzberg attributes these inflated estimates largely to contaminants: the yeast Saccharomyces cerevisiae, a known laboratory contaminant, appeared frequently in both the earlier studies and the current analysis. Sequences from viruses that infect plant-pathogenic fungi, such as Rosellinia necatrix partitivirus 8, also appeared, pointing to contamination from environmental sources unrelated to human disease.
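One simple way to act on such observations is to set aside reads assigned to taxa already suspected of being contaminants before interpreting the rest. The sketch below assumes a plain blocklist approach, which the article does not confirm as the team's method. The blocklist holds only the two contaminants named above, and the function name and read counts are hypothetical.

```python
# Taxa named in the article as likely contaminants; a real list would be longer.
KNOWN_CONTAMINANTS = {
    "Saccharomyces cerevisiae",
    "Rosellinia necatrix partitivirus 8",
}


def split_contaminants(taxon_counts, contaminants=KNOWN_CONTAMINANTS):
    """Separate per-taxon read counts into retained and flagged groups."""
    kept, flagged = {}, {}
    for taxon, count in taxon_counts.items():
        target = flagged if taxon in contaminants else kept
        target[taxon] = count
    return kept, flagged


# Invented read counts for a single hypothetical sample.
counts = {
    "Saccharomyces cerevisiae": 1200,
    "Helicobacter pylori": 40,
    "Rosellinia necatrix partitivirus 8": 15,
}
kept, flagged = split_contaminants(counts)
```

A blocklist only catches contaminants that are already known, so it complements controls such as matched normal samples rather than replacing them.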
Although they detected microbial DNA at much lower levels, the Johns Hopkins researchers did confirm the presence of microbes with established links to human cancers: human papillomavirus (HPV), known for its role in cervical and certain head and neck cancers; Helicobacter pylori, implicated in stomach cancer; and bacterial species such as Fusobacterium nucleatum and Bacteroides fragilis, which are associated with gastrointestinal malignancies. This indicates that while the microbial burden may be smaller than previously suggested, genuine associations between microbes and cancer remain scientifically valid and warrant further study.
Dr. Salzberg cautions that as the medical and scientific communities seek new diagnostic tools leveraging microbiome information to detect cancers at earlier stages, it is especially critical to ensure that findings attributing microbial presence to cancer are robust and reproducible. “Carefully documenting these associations with strict controls against contamination is essential,” he notes, highlighting the broader implications of the research for translational medicine and cancer diagnostics.
The Johns Hopkins team has made its data and analytical pipelines openly accessible through supplementary materials in Science Translational Medicine and on public repositories, fostering transparency and enabling other investigators to build on or challenge the findings. Collaborators on the project include Jennifer Lu, Daniela Puiu, and Mahler Revsine, who contributed to the bioinformatics and computational analyses behind the data validation.
Funded by the National Institutes of Health, the study recalibrates current understanding of the microbial composition of cancer tissues and underscores the vital role of methodological rigor in next-generation sequencing studies. As the field moves forward, the findings invite a more cautious interpretation of the microbiome’s role in oncogenesis and highlight the persistent challenge that contamination poses in high-throughput genomic data.
Subject of Research: Microbial DNA presence and its association with various human cancers.
Article Title: Not explicitly provided; inferred as “Reassessing Microbiome Presence in Human Cancers via Whole Genome Sequencing.”
News Publication Date: September 3, 2025.
Web References:
- Johns Hopkins Medicine news release
- Science Translational Medicine article DOI: 10.1126/scitranslmed.ads6335
- Supplemental data repository: https://zenodo.org/records/16544698
References:
- Previous Nature microbiome-cancer study (retracted)
- 2022 Cell fungal DNA study
- Johns Hopkins prior publication in mBio (2023)
Keywords: Cancer, microbiome, microbial DNA, contamination, sequencing, whole genome sequencing, next-generation sequencing, HPV, Helicobacter pylori, Fusobacterium nucleatum, Bacteroides fragilis, Saccharomyces cerevisiae