Improving data archiving practices in ancient genomics

Bergström, Anders (2023) Improving data archiving practices in ancient genomics.

The sequencing of ancient DNA from preserved biological remains is producing a rich record of past genetic diversity in humans and other species. However, unless the primary data is made available in public archives in an appropriate fashion, its long-term value will not be fully realised. I surveyed publicly archived data from 28 recent ancient genomics studies. I found that half of the studies archived incomplete subsets of the generated genomic data, preventing accurate replication and representing a loss of data of potential use for future research. None of the studies met all archiving criteria that could be considered best practice. Based on these results, I make six recommendations for data producers: 1) archive all sequencing reads, not just those that can be aligned to a reference genome, 2) archive read alignments as well, but as secondary analysis files linked to the underlying raw read files, 3) provide correct experiment metadata on how samples, libraries and sequencing runs relate to each other, 4) provide informative sample metadata in the public archives, 5) publish and archive data from screening, low-coverage, poorly performing and negative experiments, and 6) review data archiving as part of peer review processes. Given the reliance on destructive sampling of finite material, I argue that ancient genomics studies have a particularly strong responsibility to ensure the longevity and reusability of generated data.
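
Recommendations 1 and 3 can be illustrated with a minimal sketch of a consistency check over archive metadata. It is not taken from the study: the record types, field names and the check_archive function are hypothetical, and real archives use their own submission schemas. It assumes only that each sequencing run should link to a known library, each library to a known sample, and each run should include all reads rather than aligned reads only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Library:
    library_id: str
    sample_id: str  # sample the library was built from


@dataclass(frozen=True)
class Run:
    run_id: str
    library_id: str  # library that was sequenced in this run
    includes_unaligned_reads: bool  # False if only aligned reads were archived


def check_archive(sample_ids, libraries, runs):
    """Return a list of problems found in the (hypothetical) archive metadata."""
    problems = []
    known_samples = set(sample_ids)
    known_libraries = {lib.library_id for lib in libraries}

    # Recommendation 3: every library must trace back to an archived sample.
    for lib in libraries:
        if lib.sample_id not in known_samples:
            problems.append(
                f"library {lib.library_id} refers to unknown sample {lib.sample_id}"
            )

    for run in runs:
        # Recommendation 3: every run must trace back to an archived library.
        if run.library_id not in known_libraries:
            problems.append(
                f"run {run.run_id} refers to unknown library {run.library_id}"
            )
        # Recommendation 1: archive all reads, not just the aligned subset.
        if not run.includes_unaligned_reads:
            problems.append(f"run {run.run_id} archives aligned reads only")

    return problems


if __name__ == "__main__":
    issues = check_archive(
        sample_ids=["S1"],
        libraries=[Library("L1", "S1"), Library("L2", "S9")],
        runs=[Run("R1", "L1", True), Run("R2", "L3", False)],
    )
    for issue in issues:
        print(issue)
```

A check of this kind could be run before submission or during peer review (recommendation 6). It covers only the linkage and completeness points above, not the quality of sample metadata.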

Item Type: Article
Faculty \ School: Faculty of Science > School of Biological Sciences
Date Deposited: 24 Oct 2023 00:37
Last Modified: 24 Oct 2023 00:37
DOI: 10.1101/2023.05.15.540553
