Novel distances for Dollo data

Woodhams, Michael, Steane, Dorothy A., Jones, Rebecca C., Nicolle, Dean, Moulton, V. ORCID: https://orcid.org/0000-0001-9371-6435 and Holland, Barbara R. (2013) Novel distances for Dollo data. Systematic Biology, 62 (1). pp. 62-77. ISSN 1063-5157

Abstract

We investigate distances on binary (presence/absence) data in the context of a Dollo process, where a trait can only arise once on a phylogenetic tree but may be lost many times. We introduce a novel distance, the Additive Dollo Distance (ADD), that applies to data generated under a Dollo model and show that it has some useful theoretical properties including an intriguing link to the LogDet/paralinear distance. Simulations of Dollo data are used to compare a number of binary distances including ADD, LogDet, a restriction-site-based distance, and some simple, but to our knowledge previously unstudied, variations on common binary distances. The simulations suggest that ADD outperforms other distances on Dollo data. Interestingly, we found that the LogDet distance performs poorly in the context of a Dollo process; this may have implications for its use in connection with conditioned genome reconstruction. We apply the ADD to two Diversity Arrays Technology data sets, one that broadly covers Eucalyptus species and one that focuses on the Eucalyptus series Adnataria. We also reanalyze gene family presence/absence data from bacterial genomes obtained from the COG database and compare the results with previous phylogenies estimated using the conditioned genome reconstruction approach. The results for these case studies are largely congruent with previous studies, in some cases giving more phylogenetic resolution.
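The Dollo constraint described in the abstract (a trait is gained once and may then be lost independently on several lineages) can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions: the four-leaf tree, the node names, the per-edge loss probability `loss_prob`, and the function names are all hypothetical, and this is neither the simulation procedure used in the paper nor an implementation of the ADD, whose formula is not given in this record.

```python
import random

# Hypothetical toy tree: each internal node maps to its children.
# Nodes that do not appear as keys are leaves (taxa).
TREE = {
    "root": ["n1", "n2"],
    "n1": ["A", "B"],
    "n2": ["C", "D"],
}


def leaves_below(tree, node):
    """Return the list of leaves in the subtree rooted at `node`."""
    children = tree.get(node)
    if not children:
        return [node]
    found = []
    for child in children:
        found.extend(leaves_below(tree, child))
    return found


def simulate_dollo_trait(tree, root, origin, loss_prob, rng):
    """Simulate one presence/absence trait under a simple Dollo-style rule.

    The trait is gained exactly once, at node `origin`. On every edge below
    `origin` it is lost with probability `loss_prob` (an assumed, simplified
    loss model). Once lost it is never regained, so the whole subtree below
    a loss stays at 0.
    """
    states = {leaf: 0 for leaf in leaves_below(tree, root)}

    def descend(node):
        children = tree.get(node)
        if not children:
            states[node] = 1  # trait survived all the way to this leaf
            return
        for child in children:
            if rng.random() >= loss_prob:  # trait survives this edge
                descend(child)

    descend(origin)
    return states


if __name__ == "__main__":
    rng = random.Random(0)
    # Each row is one trait; columns are the taxa A, B, C, D.
    for _ in range(5):
        pattern = simulate_dollo_trait(TREE, "root", "root", 0.3, rng)
        print([pattern[leaf] for leaf in ["A", "B", "C", "D"]])
```

Repeating the simulation over many traits, with different origin nodes, gives a binary taxa-by-trait matrix of the kind on which presence/absence distances are compared. Because a trait can never reappear after a loss, every taxon carrying it lies below its single point of origin.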

Item Type: Article
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Computational Biology > Computational biology of RNA (former - to 2018)
Faculty of Science > Research Groups > Computational Biology > Phylogenetics (former - to 2018)
Faculty of Science > Research Groups > Computational Biology
Faculty of Science > Research Groups > Norwich Epidemiology Centre
Faculty of Medicine and Health Sciences > Research Groups > Norwich Epidemiology Centre
Depositing User: Pure Connector
Date Deposited: 03 Feb 2014 11:18
Last Modified: 27 Oct 2023 01:15
URI: https://ueaeprints.uea.ac.uk/id/eprint/47441
DOI: 10.1093/sysbio/sys071
