Diana, Alex, Matechou, Eleni, Griffin, Jim, Yu, Douglas W., Luo, Mingjie, Tosa, Marie, Bush, Alex and Griffiths, Richard (2024) eDNAPlus: A unifying modelling framework for DNA-based biodiversity monitoring. Journal of the American Statistical Association. ISSN 0162-1459
PDF (2024 Diana et al eDNAPlus J Amer Stat Assoc)
- Accepted Version
Available under License Creative Commons Attribution.
Abstract
DNA-based biodiversity surveys involve collecting physical samples from survey sites and assaying the contents in the laboratory to detect species via their diagnostic DNA sequences. DNA-based surveys are increasingly being adopted for biodiversity monitoring. The most commonly employed method is metabarcoding, which combines PCR with high-throughput DNA sequencing to amplify and then read ‘DNA barcode’ sequences. This process generates count data indicating the number of times each DNA barcode was read. However, DNA-based data are noisy and error-prone, with several sources of variation. In this paper, we present a unifying modelling framework for DNA-based data allowing for all key sources of variation and error in the data-generating process. The model can estimate within-species biomass changes across sites and link those changes to environmental covariates, while accounting for correlation among species and among sites. Inference is performed using MCMC, where we employ Gibbs or Metropolis-Hastings updates with Laplace approximations. We also implement a re-parameterisation scheme, appropriate for crossed-effects models, leading to improved mixing, and an adaptive approach for updating latent variables, reducing computation time. We discuss study design and present theoretical and simulation results to guide decisions on replication at different stages and on the use of quality control methods. We demonstrate the new framework on a dataset of Malaise-trap samples. We quantify the effects of elevation and distance-to-road on each species, infer species correlations, and produce maps identifying areas of high biodiversity, which can be used to rank areas by conservation value. We estimate the level of noise between sites and within sample replicates, and the probabilities of error at the PCR stage, which are close to zero for most species considered, validating the employed laboratory processing.
Item Type: | Article |
---|---|
Additional Information: | Data Availability Statement: The sequence data, bioinformatic scripts, and the three sample by species tables and environmental covariates are archived on DataDryad at doi.org/10.5061/dryad.4f4qrfjjb. Acknowledgments: The work was funded by NERC project NE/T010045/1 “Integrating new statistical frameworks into eDNA survey and analysis at the landscape scale” and benefited from the sCom Working Group at iDiv.de. DWY and MJL were supported by the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDA20050202, the Key Research Program of Frontier Sciences, CAS (QYZDY-SSW-SMC024), the State Key Laboratory of Genetic Resources and Evolution (GREKF19-01, GREKF20-01, GREKF21-01) at the Kunming Institute of Zoology, and the University of Chinese Academy of Sciences. |
Faculty \ School: | Faculty of Science > School of Biological Sciences |
UEA Research Groups: | Faculty of Science > Research Centres > Centre for Ecology, Evolution and Conservation; Faculty of Science > Research Groups > Organisms and the Environment |
Depositing User: | LivePure Connector |
Date Deposited: | 30 Sep 2024 15:30 |
Last Modified: | 12 Nov 2024 12:30 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/96834 |
DOI: | 10.1080/01621459.2024.2412362 |