Gihawi, Abraham ORCID: https://orcid.org/0000-0002-3676-5561, Ge, Yuchen, Lu, Jennifer, Puiu, Daniela, Xu, Amanda, Cooper, Colin S. ORCID: https://orcid.org/0000-0003-2013-8042, Brewer, Daniel S. ORCID: https://orcid.org/0000-0003-4753-9794, Pertea, Mihaela and Salzberg, Steven L. (2023) Major data analysis errors invalidate cancer microbiome findings. mBio, 14 (5). ISSN 2150-7511
PDF (gihawi-et-al-2023-major-data-analysis-errors-invalidate-cancer-microbiome-findings)
- Published Version
Available under License Creative Commons Attribution.
Abstract
We re-analyzed the data from a recent large-scale study that reported strong correlations between DNA signatures of microbial organisms and 33 different cancer types and that created machine-learning predictors with near-perfect accuracy at distinguishing among cancers. We found at least two fundamental flaws in the reported data and in the methods: (i) errors in the genome database and the associated computational methods led to millions of false-positive findings of bacterial reads across all samples, largely because most of the sequences identified as bacteria were instead human; and (ii) errors in the transformation of the raw data created an artificial signature, even for microbes with no reads detected, tagging each tumor type with a distinct signal that the machine-learning programs then used to create an apparently accurate classifier. Each of these problems invalidates the results, leading to the conclusion that the microbiome-based classifiers for identifying cancer presented in the study are entirely wrong. These flaws have subsequently affected more than a dozen additional published studies that used the same data and whose results are likely invalid as well.
Item Type: | Article |
---|---|
Additional Information: | Funding Information: S.L.S., Y.G., J.L., D.P., and M.P. acknowledge the support from the U.S. NIH under grants R01 HG006677 and R35-GM130151. A.G., C.S.C., and D.S.B. acknowledge the support from Prostate Cancer UK (MA-ETNA19-003), Big C Cancer Charity (ref 16-09R), The Bob Champion Cancer Trust, and Cancer Research UK. |
Uncontrolled Keywords: | bioinformatics, cancer, computational biology, microbiome, metagenomics, virology, microbiology, SDG 3 - good health and well-being |
Faculty \ School: | Faculty of Medicine and Health Sciences > Norwich Medical School |
UEA Research Groups: | Faculty of Medicine and Health Sciences > Research Groups > Cancer Studies; Faculty of Medicine and Health Sciences > Research Centres > Metabolic Health |
Depositing User: | LivePure Connector |
Date Deposited: | 03 Nov 2023 03:22 |
Last Modified: | 14 Nov 2023 11:16 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/93547 |
DOI: | 10.1128/mbio.01607-23 |