Improving data archiving practices in ancient genomics

Bergström, Anders ORCID: https://orcid.org/0000-0002-4096-9268 (2024) Improving data archiving practices in ancient genomics. Scientific Data, 11. ISSN 2052-4463

PDF (Bergstrom_2024_SciData) - Published Version
Available under License Creative Commons Attribution.


Abstract

Ancient DNA is producing a rich record of past genetic diversity in humans and other species. However, unless the primary data is appropriately archived, its long-term value will not be fully realised. I surveyed publicly archived data from 42 recent ancient genomics studies. Half of the studies archived incomplete datasets, preventing accurate replication and representing a loss of data of potential future use. No studies met all criteria that could be considered best practice. Based on these results, I make six recommendations for data producers: (1) archive all sequencing reads, not just those that aligned to a reference genome, (2) archive read alignments too, but as secondary analysis files, (3) provide correct experiment metadata on samples, libraries and sequencing runs, (4) provide informative sample metadata, (5) archive data from low-coverage and negative experiments, and (6) document archiving choices in papers, and peer review these. Given the reliance on destructive sampling of finite material, ancient genomics studies have a particularly strong responsibility to ensure the longevity and reusability of generated data.

Item Type: Article
Additional Information: Data availability statement: The data analysed in this study was obtained from the following archive study accessions: PRJEB54831 (ENA) [8,94], PRJEB51180 (ENA) [9,95], HRA001777 (GSA) [10,96], PRJEB52849 (ENA) [11,97], PRJEB42656 (ENA) [12,98], PRJEB52230 (ENA) [13,99], SRP352154 (SRA) [14,100], SRP455000 (SRA) [15,101], PRJEB51440 (ENA) [16,102], PRJEB56773 (ENA) [17,103], PRJEB49291 (ENA) [18,104], PRJEB42781 (ENA) [19,105], PRJEB46734 (ENA) [20,106], SRP356017 (SRA) [21,107], PRJEB42269 (ENA) [22,108], PRJEB47891 (ENA) [23,109], PRJEB54899 (ENA) [24,110], PRJEB43715 (ENA) [25,111], PRJEB39134 (ENA) [26,112], PRJEB46162 (ENA) [27,113], PRJEB46875 (ENA) [28,114], PRJEB44430 (ENA) [29,115], PRJEB42199 (ENA) [30,116], PRJEB38555 (ENA) [31,117], PRJEB55327 (ENA) [32,118], PRJEB56213 (ENA) [33,119], PRJEB51862 (ENA) [34,120], PRJEB58698 (ENA) [35,121], PRJEB62503 (ENA) [36,122], PRJEB66319 (ENA) [37,123], PRJEB59008 (ENA) [38,124], PRJEB61818 (ENA) [39,125], PRJEB50368 (ENA) [40,126], PRJEB50857 (ENA) [41,127], HRA000451 (GSA) [42,128], HRA000411 (GSA) [43,129], PRJEB53475 (ENA) [44,130], PRJEB37782 (ENA) [45,131], SRP299553 (SRA) [46,132], PRJEB42372 (ENA) [47,133], PRJEB66422 (ENA) [48,134], PRJEB57364 (ENA) [49,135]. Code availability: No custom code was written for this paper.
Faculty \ School: Faculty of Science > School of Biological Sciences
UEA Research Groups: Faculty of Science > Research Centres > Centre for Ecology, Evolution and Conservation
Depositing User: LivePure Connector
Date Deposited: 24 Oct 2023 00:37
Last Modified: 01 Nov 2024 09:30
URI: https://ueaeprints.uea.ac.uk/id/eprint/93419
DOI: 10.1101/2023.05.15.540553
