Linked-read sequencing enables haplotype-resolved resequencing at population scale

Lutgen, Dave, Ritter, Raphael, Olsen, Remi-André, Schielzeth, Holger, Gruselius, Joel, Ewels, Phil, García, Jesús T, Shirihai, Hadoram, Schweizer, Manuel, Suh, Alexander and Burri, Reto (2020) Linked-read sequencing enables haplotype-resolved resequencing at population scale. Molecular Ecology Resources, 20 (5). pp. 1311-1322. ISSN 1755-098X

PDF (Accepted_Manuscript) - Submitted Version (12MB)
Restricted to Repository staff only until 18 May 2021.

PDF (Published_Version) - Published Version (1MB)
Available under License Creative Commons Attribution Non-commercial.

Abstract

The ability to sequence entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences - including the quantification and dating of admixture, introgression and demographic events, and inference of selective sweeps - are still limited by the lack of high-quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype-resolved genome resequencing at population scale, we investigated properties of linked-read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results, based on the comparison of downsampled (25x, 20x, 15x, 10x, 7x, and 5x) with high-coverage data (46-68x) of seven bird genomes mapped to a reference, suggest that phasing contiguities and accuracies adequate for most population genomic analyses can be reached with moderate sequencing effort. At 15x coverage, phased haplotypes span about 90% of the genome assembly, with 50% and 90% of phased sequences located in phase blocks longer than 1.25-4.6 Mb (N50) and 0.27-0.72 Mb (N90), respectively. Phasing accuracy exceeds 99% from 15x coverage onward. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb (N50/N90) at 25x coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher-quality DNA may help keep sequencing costs at bay. In conclusion, even for organisms with gigabase-sized genomes like birds, linked-read sequencing at moderate depth opens an affordable avenue towards haplotype-resolved genome resequencing at population scale.
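
The N50 and N90 figures quoted in the abstract summarise phase block contiguity. As a minimal sketch, assuming the usual definition (the block length L such that blocks of length L or longer together contain at least 50% or 90% of all phased sequence), the statistic could be computed as below. The function name `phase_block_nxx` and the example block lengths are hypothetical illustrations, not values or code from the article.

```python
def phase_block_nxx(block_lengths, fraction):
    """Return the Nxx statistic for a collection of phase block lengths.

    Assumed definition: the largest length L such that blocks of
    length >= L together hold at least `fraction` of the total
    phased sequence (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not block_lengths:
        raise ValueError("block_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")

    total = sum(block_lengths)
    running = 0
    # Walk from the longest block to the shortest, accumulating length
    # until the requested share of the phased sequence is covered.
    for length in sorted(block_lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Hypothetical phase block lengths in base pairs (not data from the study).
example_blocks = [5_000_000, 4_000_000, 3_000_000, 2_000_000, 1_000_000]
n50 = phase_block_nxx(example_blocks, 0.5)
n90 = phase_block_nxx(example_blocks, 0.9)
print(f"N50 = {n50 / 1e6:.2f} Mb, N90 = {n90 / 1e6:.2f} Mb")
```

Because N90 requires covering a larger share of the phased sequence, it is always less than or equal to N50 for the same set of blocks, which matches the ordering of the ranges reported in the abstract.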

Item Type: Article
Additional Information: This article is protected by copyright. All rights reserved.
Uncontrolled Keywords: admixture; demography; introgression; phasing; population genomics; selective sweeps; biotechnology; ecology, evolution, behavior and systematics; genetics
Faculty \ School: Faculty of Science > School of Biological Sciences
Depositing User: LivePure Connector
Date Deposited: 26 Jun 2020 00:02
Last Modified: 28 Oct 2020 01:11
URI: https://ueaeprints.uea.ac.uk/id/eprint/75778
DOI: 10.1111/1755-0998.13192
