Finding a most parsimonious or likely tree in a network with respect to an alignment

Steven Kelk, Fabio Pardi, Celine Scornavacca, Leo Van Iersel

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A), even when the level of N (the maximum number of reticulation nodes within a biconnected component) is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed, the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony, in which it is shown that if A consists of a single column the problem is fixed-parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and likelihood problems can likely be solved well in practice.
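
To make the parsimony problem in the abstract concrete, the following is a minimal brute-force sketch, not the authors' method. It assumes a binary network stored as a hypothetical parent-to-children dictionary, a gap-free alignment, and Fitch's algorithm for scoring. It enumerates every tree displayed by the network by keeping exactly one incoming edge per reticulation node, so its running time grows exponentially with the number of reticulations. The toy network, taxon names and sequences are invented for illustration.

```python
from itertools import product


def parents_of(network):
    """Map each node to the list of its parents."""
    parents = {}
    for u, children in network.items():
        for c in children:
            parents.setdefault(c, []).append(u)
    return parents


def displayed_trees(network):
    """Yield (tree, kept) pairs, one per choice of a single parent edge
    for every reticulation node (a node with more than one parent)."""
    parents = parents_of(network)
    reticulations = sorted(n for n, ps in parents.items() if len(ps) > 1)
    for choice in product(*(parents[r] for r in reticulations)):
        kept = dict(zip(reticulations, choice))
        tree = {u: [c for c in cs if kept.get(c, u) == u]
                for u, cs in network.items()}
        yield tree, kept


def fitch(tree, node, column):
    """Return (state_set, cost) for the subtree below node, or None if the
    subtree contains no taxon. Exact for nodes with at most two children."""
    if node in column:  # leaf labelled by a taxon
        return {column[node]}, 0
    results = [r for r in (fitch(tree, c, column) for c in tree.get(node, []))
               if r is not None]
    if not results:
        return None
    states, cost = results[0]
    for child_states, child_cost in results[1:]:
        cost += child_cost
        if states & child_states:
            states = states & child_states
        else:
            states = states | child_states
            cost += 1  # one state change is needed at this node
    return states, cost


def parsimony_score(tree, root, alignment):
    """Sum the Fitch cost over all alignment columns."""
    n_cols = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_cols):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch(tree, root, column)[1]
    return total


def most_parsimonious_displayed_tree(network, root, alignment):
    """Return (score, kept) for a displayed tree of minimum parsimony score."""
    scored = ((parsimony_score(tree, root, alignment), kept)
              for tree, kept in displayed_trees(network))
    return min(scored, key=lambda pair: pair[0])


# Toy level-1 network: leaf b hangs below the single reticulation node r.
network = {"root": ["u", "v"], "u": ["a", "r"], "v": ["w", "d"],
           "w": ["r", "c"], "r": ["b"]}
# Two-state alignment with three columns (illustrative data).
alignment = {"a": "010", "b": "100", "c": "101", "d": "011"}

best = most_parsimonious_displayed_tree(network, "root", alignment)
print(best)  # (4, {'r': 'w'})
```

Here the two displayed trees score 5 (keeping edge u-r) and 4 (keeping edge w-r), so the search returns the second one. Exhaustive enumeration is only feasible when the network has few reticulation nodes.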

Original language: English
Pages (from-to): 527-547
Number of pages: 21
Journal: Journal of Mathematical Biology
Volume: 78
Issue number: 1-2
DOI: 10.1007/s00285-018-1282-2
Publication status: Published - 1 Jan 2019

Keywords

  • Phylogenetic tree
  • Phylogenetic network
  • Maximum parsimony
  • Maximum likelihood
  • NP-hardness
  • APX-hardness
  • MAXIMUM-LIKELIHOOD
  • BAYESIAN-INFERENCE
  • RECOMBINATION
  • EVOLUTION
  • MODEL

Cite this

Kelk, Steven; Pardi, Fabio; Scornavacca, Celine; Van Iersel, Leo. / Finding a most parsimonious or likely tree in a network with respect to an alignment. In: Journal of Mathematical Biology. 2019; Vol. 78, No. 1-2. pp. 527-547.

@article{0cebea1f8ecc487cba6d5a746e664c54,
title = "Finding a most parsimonious or likely tree in a network with respect to an alignment",
keywords = "Phylogenetic tree, Phylogenetic network, Maximum parsimony, Maximum likelihood, NP-hardness, APX-hardness, MAXIMUM-LIKELIHOOD, BAYESIAN-INFERENCE, RECOMBINATION, EVOLUTION, MODEL",
author = "Steven Kelk and Fabio Pardi and Celine Scornavacca and {Van Iersel}, Leo",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s00285-018-1282-2",
language = "English",
volume = "78",
pages = "527--547",
journal = "Journal of Mathematical Biology",
issn = "0303-6812",
publisher = "Springer Verlag",
number = "1-2",

}

TY - JOUR

T1 - Finding a most parsimonious or likely tree in a network with respect to an alignment

AU - Kelk, Steven

AU - Pardi, Fabio

AU - Scornavacca, Celine

AU - Van Iersel, Leo

PY - 2019/1/1

Y1 - 2019/1/1

KW - Phylogenetic tree

KW - Phylogenetic network

KW - Maximum parsimony

KW - Maximum likelihood

KW - NP-hardness

KW - APX-hardness

KW - MAXIMUM-LIKELIHOOD

KW - BAYESIAN-INFERENCE

KW - RECOMBINATION

KW - EVOLUTION

KW - MODEL

U2 - 10.1007/s00285-018-1282-2

DO - 10.1007/s00285-018-1282-2

M3 - Article

VL - 78

SP - 527

EP - 547

JO - Journal of Mathematical Biology

JF - Journal of Mathematical Biology

SN - 0303-6812

IS - 1-2

ER -