A Resolution of the Static Formulation Question for the Problem of Computing the History Bound

Julia Matsieva, Steven Kelk, Celine Scornavacca, Chris Whidden, Dan Gusfield

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Evolutionary data has been traditionally modeled via phylogenetic trees; however, branching alone cannot model conflicting phylogenetic signals, so networks are used instead. Ancestral recombination graphs (ARGs) are used to model the evolution of incompatible sets of SNP data, allowing each site to mutate only once. The model often aims to minimize the number of recombinations. Similarly, incompatible cluster data can be represented by a reticulation network that minimizes reticulation events. The ARG literature has traditionally been disjoint from the reticulation network literature. By building on results from the reticulation network literature, we resolve an open question of interest to the ARG community. We explicitly prove that the History Bound, a lower bound on the number of recombinations in an ARG for a binary matrix, which was previously only defined procedurally, is equal to the minimum number of reticulation nodes in a network for the corresponding cluster data. To facilitate the proof, we give an algorithm that constructs this network using intermediate values from the procedural History Bound definition. We then develop a top-down algorithm for computing the History Bound, which has the same worst-case runtime as the known dynamic program, and show that it is likely to run faster in typical cases.
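
As a small illustration of what "incompatible" means for binary SNP data when each site may mutate only once, the sketch below applies the standard four-gamete test: two sites cannot both fit a single tree without recombination if all four combinations 00, 01, 10 and 11 occur across the sequences. This is only a minimal sketch under stated assumptions. The function name `incompatible_site_pairs`, the row-per-sequence matrix layout, and the example data are hypothetical and not taken from the paper, and the sketch does not compute the History Bound itself.

```python
from itertools import combinations


def incompatible_site_pairs(matrix):
    """Return pairs of column indices that fail the four-gamete test.

    Assumption: `matrix` is a list of equal-length rows (one per sequence)
    whose entries are 0 or 1 (one column per SNP site).
    """
    if not matrix:
        return []
    n_sites = len(matrix[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (site i, site j) state pairs seen in the rows.
        gametes = {(row[i], row[j]) for row in matrix}
        # With binary entries, four distinct pairs means 00, 01, 10, 11 all occur.
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


# Hypothetical example: sites 0 and 1 show all four gametes.
example = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
]
print(incompatible_site_pairs(example))
```

A matrix with no incompatible pairs admits a tree without recombination; any incompatible pair means at least one recombination (or reticulation) event is needed, which is the quantity that lower bounds such as the History Bound aim to estimate.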

Original language: English
Pages (from-to): 404-417
Number of pages: 14
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 14
Issue number: 2
DOIs
Publication status: Published - 1 Mar 2017

Keywords

  • Rooted phylogenetic networks
  • clusters
  • reticulate evolution
  • parsimony
  • computational complexity
  • algorithms
  • MINIMUM NUMBER
  • RECOMBINATION EVENTS
  • INFERENCE
