The complexity of comparing multiply-labelled trees by extending phylogenetic-tree metrics

Lafond, Manuel, El-Mabrouk, Nadia, Huber, Katharina T. and Moulton, Vincent ORCID: https://orcid.org/0000-0001-9371-6435 (2019) The complexity of comparing multiply-labelled trees by extending phylogenetic-tree metrics. Theoretical Computer Science, 760. pp. 15-34. ISSN 0304-3975

[thumbnail of Accepted manuscript]
Preview
PDF (Accepted manuscript) - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Download (554kB) | Preview

Abstract

A multilabeled tree (or MUL-tree) is a rooted tree in which every leaf is labelled by an element from some set, but in which more than one leaf may be labelled by the same element of that set. In phylogenetics, such trees are used in biogeographical studies, to study the evolution of gene families, and also within approaches to construct phylogenetic networks. A multilabelled tree in which no leaf-labels are repeated is called a phylogenetic tree, and one in which every label is the same is also known as a tree-shape. In this paper, we consider the complexity of computing metrics on MUL-trees that are obtained by extending metrics on phylogenetic trees. In particular, by restricting our attention to tree shapes, we show that computing the metric extension on MUL-trees is NP-complete for two well-known metrics on phylogenetic trees, namely, the path-difference and Robinson Foulds distances. We also show that the extension of the Robinson Foulds distance is fixed parameter tractable with respect to the distance parameter. The path distance complexity result allows us to also answer an open problem concerning the complexity of solving the quadratic assignment problem for two matrices that are a Robinson similarity and a Robinson dissimilarity, which we show to be NP-complete. We conclude by considering the maximum agreement subtree (MAST) distance on phylogenetic trees to MUL-trees. Although its extension to MUL-trees can be computed in polynomial time, we show that computing its natural generalization to more than two MUL-trees is NP-complete, although fixed-parameter tractable in the maximum degree when the number of given trees is bounded.

Item Type: Article
Uncontrolled Keywords: tree shape,multilabelled tree,phylogenetic tree,np-hardness,fixed-parameter tractability,multilabelled tree,tree shape,phylogenetic tree,fixed-parameter tractability,theoretical computer science,computer science(all) ,/dk/atira/pure/subjectarea/asjc/2600/2614
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Computational Biology
Faculty of Science > Research Groups > Computational Biology > Phylogenetics (former - to 2018)
Faculty of Science > Research Groups > Norwich Epidemiology Centre
Faculty of Medicine and Health Sciences > Research Groups > Norwich Epidemiology Centre
Related URLs:
Depositing User: LivePure Connector
Date Deposited: 07 Aug 2018 10:30
Last Modified: 14 Jun 2023 13:29
URI: https://ueaeprints.uea.ac.uk/id/eprint/67964
DOI: 10.1016/j.tcs.2018.08.006

Downloads

Downloads per month over past year

Actions (login required)

View Item View Item