Martin, Samuel, Holtgrefe, Niels, Moulton, Vincent and Leggett, Richard M. (2025) Algebraic invariants for inferring 4-leaf semi-directed phylogenetic networks. Systematic Biology. ISSN 1063-5157
PDF (Algebraic-Invariants), Accepted Version
Abstract
A core goal of phylogenomics is to determine the evolutionary history of a set of species from biological sequence data. Phylogenetic networks are able to describe more complex evolutionary phenomena than phylogenetic trees but are more difficult to accurately reconstruct. Recently, there has been growing interest in developing methods to infer semi-directed phylogenetic networks. As computing such networks can be computationally intensive, one approach to building such networks is to puzzle together smaller networks. Thus, it is essential to have robust methods for inferring semi-directed phylogenetic networks on small numbers of taxa. In this paper, we investigate an algebraic method for performing phylogenetic network inference from nucleotide sequence data on 4-leaf semi-directed phylogenetic networks by analysing the distribution of leaf-pattern probabilities. On simulated data, we found that we can correctly identify with high accuracy the undirected phylogenetic network for sequences of length at least 10kbp. We found that identifying the semi-directed network is more challenging and requires sequences of length approaching 10Mbp. We are also able to use our approach to identify tree-like evolution and determine the underlying tree. Finally, we employ our method on a real dataset from Xiphophorus species and use the results to build a phylogenetic network.
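As an illustration of the input that such a method works from, the sketch below estimates leaf-pattern probabilities empirically by counting site patterns in a four-sequence nucleotide alignment. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `site_pattern_frequencies` and the rule of skipping columns that contain gaps or ambiguity codes are illustrative choices, and the invariants themselves (computed by the authors' Macaulay2 scripts) are not reproduced here.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGT"


def site_pattern_frequencies(alignment):
    """Estimate the 4^4 = 256 leaf-pattern probabilities from an alignment.

    `alignment` is a list of four equal-length nucleotide strings, one per
    leaf. Columns containing anything other than A, C, G, T are skipped
    (an assumption made for this sketch).
    """
    if len(alignment) != 4:
        raise ValueError("expected exactly four aligned sequences")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")

    counts = Counter()
    for column in zip(*alignment):
        pattern = "".join(column).upper()
        if all(base in NUCLEOTIDES for base in pattern):
            counts[pattern] += 1

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous columns in the alignment")

    # Return a frequency for every possible pattern, including unseen ones.
    return {
        "".join(p): counts["".join(p)] / total
        for p in product(NUCLEOTIDES, repeat=4)
    }


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "ACGT", "AC-T"]
    freqs = site_pattern_frequencies(toy)
    observed = {p: f for p, f in freqs.items() if f > 0}
    print(observed)
```

With short alignments most of the 256 patterns are never observed, which is consistent with the abstract's report that accuracy depends strongly on sequence length.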
| Item Type: | Article |
|---|---|
| Additional Information: | Data availability statement: All scripts used in this project, including scripts for simulating data and Macaulay2 scripts for calculating invariants, are available at the GitHub repository https://github.com/SR-Martin/4cycle_invariants. All simulated data and results presented are available at https://doi.org/10.5061/dryad.44j0zpcrk. Funding information: This work was supported by EPSRC Mathematical Sciences Small Grant award EP/W007134/1. SM and RL acknowledge the support of the Biotechnology and Biological Sciences Research Council (BBSRC), part of UK Research and Innovation; parts of this research were funded by the BBSRC Core Strategic Programme Grant (Genomes to Food Security) BB/CSP1720/1 and its constituent work package BBS/E/T/000PR9817 (WP3 Computational Developments). SM is grateful for further funding from BBSRC (grant number BB/X005186/1). NH was supported by grant OCENW.M.21.306 of the Dutch Research Council (NWO). |
| Uncontrolled Keywords: | phylogenetic network,semi-directed network,phylogenetic invariants |
| Faculty \ School: | Faculty of Science > School of Computing Sciences; Faculty of Science > School of Biological Sciences |
| UEA Research Groups: | Faculty of Science > Research Groups > Norwich Epidemiology Centre; Faculty of Medicine and Health Sciences > Research Groups > Norwich Epidemiology Centre; Faculty of Science > Research Groups > Computational Biology |
| Depositing User: | LivePure Connector |
| Date Deposited: | 20 Oct 2025 11:30 |
| Last Modified: | 21 Oct 2025 00:05 |
| URI: | https://ueaeprints.uea.ac.uk/id/eprint/100725 |
| DOI: | 10.1093/sysbio/syaf071 |