On computing the maximum parsimony score of a phylogenetic network

Mareike Fischer; Leo van Iersel; Steven Kelk; Celine Scornavacca

doi:10.1137/140959948

Corresponding author: Mareike Fischer

BioMathematics and BioInformatics

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Phylogenetic networks are used to display the relationships among different species whose evolution is not treelike, which is the case, for instance, in the presence of hybridization events or horizontal gene transfers. Tree inference methods such as maximum parsimony need to be modified in order to be applicable to networks. In this paper, we discuss two different definitions of maximum parsimony on networks, "hardwired" and "softwired," and examine the complexity of computing them given a network topology and a character. By exploiting a link with the problem MULTITERMINAL CUT, we show that computing the hardwired parsimony score for 2-state characters is polynomial-time solvable, while for characters with more states this problem becomes NP-hard but is still approximable and fixed-parameter tractable in the parsimony score. On the other hand, we show that, for the softwired definition, obtaining even weak approximation guarantees is already difficult for binary characters and restricted network topologies, and fixed-parameter tractable algorithms in the parsimony score are unlikely. On the positive side, we show that computing the softwired parsimony score is fixed-parameter tractable in the level of the network, a natural parameter describing how tangled reticulate activity is in the network. Finally, we show that both the hardwired and the softwired parsimony scores can be computed efficiently using integer linear programming. The software has been made freely available.
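To make the two definitions in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the authors' software and not any of the algorithms from the paper; it simply enumerates all labelings, so it is exponential and only suitable for toy inputs. The network, the leaf states, and the function names `hardwired_score` and `softwired_score` are illustrative assumptions. It takes the hardwired score to be the minimum number of edges whose endpoints differ over all labelings of the internal nodes, and the softwired score to be the minimum tree parsimony score over all trees obtained by keeping exactly one incoming edge per reticulation.

```python
from itertools import product


def hardwired_score(edges, leaf_states, states):
    """Minimum number of edges with differing endpoint states,
    over all state assignments to the unlabeled nodes (brute force)."""
    nodes = sorted({v for edge in edges for v in edge})
    internal = [v for v in nodes if v not in leaf_states]
    best = None
    for assignment in product(states, repeat=len(internal)):
        labeling = dict(leaf_states)
        labeling.update(zip(internal, assignment))
        changes = sum(1 for u, v in edges if labeling[u] != labeling[v])
        if best is None or changes < best:
            best = changes
    return best


def softwired_score(edges, leaf_states, states):
    """Minimum parsimony score over all trees obtained by keeping
    exactly one incoming edge for every reticulation (brute force)."""
    incoming = {}
    for u, v in edges:
        incoming.setdefault(v, []).append((u, v))
    reticulations = [v for v, ins in incoming.items() if len(ins) > 1]
    # Edges whose head has a single parent are present in every switching.
    fixed_edges = [e for e in edges if len(incoming[e[1]]) == 1]
    best = None
    for choice in product(*(incoming[r] for r in reticulations)):
        switched = fixed_edges + list(choice)
        # On a tree, the edge-change count above is the usual parsimony score.
        score = hardwired_score(switched, leaf_states, states)
        if best is None or score < best:
            best = score
    return best


# Hypothetical toy network: node "h" is a reticulation with parents "a" and "b".
EDGES = [
    ("r", "a"), ("r", "b"),
    ("a", "x1"), ("a", "x2"), ("a", "h"),
    ("b", "y1"), ("b", "y2"), ("b", "h"),
    ("h", "z"),
]
# A binary character on the leaves (made-up values).
LEAF_STATES = {"x1": 0, "x2": 0, "z": 0, "y1": 1, "y2": 1}

if __name__ == "__main__":
    print(hardwired_score(EDGES, LEAF_STATES, (0, 1)))  # 2
    print(softwired_score(EDGES, LEAF_STATES, (0, 1)))  # 1
```

On this toy input the hardwired score is 2 because both edges into the reticulation are counted, while the softwired score is 1 because the edge from "b" to "h" can be switched off. For binary characters the hardwired case can instead be solved in polynomial time, as the abstract states; the enumeration here is only meant to illustrate the definitions.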

Original language	English
Pages (from-to)	559-585
Number of pages	27
Journal	SIAM Journal on Discrete Mathematics
Volume	29
Issue number	1
DOIs	https://doi.org/10.1137/140959948
Publication status	Published - 2015

Cite this

@article{e7bf2e964eb74c67b1613eacc80d82bf,
  title = "On computing the maximum parsimony score of a phylogenetic network",
  abstract = "Phylogenetic networks are used to display the relationships among different species whose evolution is not treelike, which is the case, for instance, in the presence of hybridization events or horizontal gene transfers. Tree inference methods such as maximum parsimony need to be modified in order to be applicable to networks. In this paper, we discuss two different definitions of maximum parsimony on networks, {"}hardwired{"} and {"}softwired,{"} and examine the complexity of computing them given a network topology and a character. By exploiting a link with the problem MULTITERMINAL CUT, we show that computing the hardwired parsimony score for 2-state characters is polynomial-time solvable, while for characters with more states this problem becomes NP-hard but is still approximable and fixed-parameter tractable in the parsimony score. On the other hand, we show that, for the softwired definition, obtaining even weak approximation guarantees is already difficult for binary characters and restricted network topologies, and fixed-parameter tractable algorithms in the parsimony score are unlikely. On the positive side, we show that computing the softwired parsimony score is fixed-parameter tractable in the level of the network, a natural parameter describing how tangled reticulate activity is in the network. Finally, we show that both the hardwired and the softwired parsimony scores can be computed efficiently using integer linear programming. The software has been made freely available.",
  author = "Mareike Fischer and {van Iersel}, Leo and Steven Kelk and Celine Scornavacca",
  year = "2015",
  doi = "10.1137/140959948",
  language = "English",
  volume = "29",
  pages = "559--585",
  journal = "SIAM Journal on Discrete Mathematics",
  issn = "0895-4801",
  publisher = "SIAM Publications",
  number = "1",
}

TY  - JOUR
T1  - On computing the maximum parsimony score of a phylogenetic network
AU  - Fischer, Mareike
AU  - van Iersel, Leo
AU  - Kelk, Steven
AU  - Scornavacca, Celine
PY  - 2015
Y1  - 2015
AB  - Phylogenetic networks are used to display the relationships among different species whose evolution is not treelike, which is the case, for instance, in the presence of hybridization events or horizontal gene transfers. Tree inference methods such as maximum parsimony need to be modified in order to be applicable to networks. In this paper, we discuss two different definitions of maximum parsimony on networks, "hardwired" and "softwired," and examine the complexity of computing them given a network topology and a character. By exploiting a link with the problem MULTITERMINAL CUT, we show that computing the hardwired parsimony score for 2-state characters is polynomial-time solvable, while for characters with more states this problem becomes NP-hard but is still approximable and fixed-parameter tractable in the parsimony score. On the other hand, we show that, for the softwired definition, obtaining even weak approximation guarantees is already difficult for binary characters and restricted network topologies, and fixed-parameter tractable algorithms in the parsimony score are unlikely. On the positive side, we show that computing the softwired parsimony score is fixed-parameter tractable in the level of the network, a natural parameter describing how tangled reticulate activity is in the network. Finally, we show that both the hardwired and the softwired parsimony scores can be computed efficiently using integer linear programming. The software has been made freely available.
U2  - 10.1137/140959948
DO  - 10.1137/140959948
M3  - Article
SN  - 0895-4801
VL  - 29
SP  - 559
EP  - 585
JO  - SIAM Journal on Discrete Mathematics
JF  - SIAM Journal on Discrete Mathematics
IS  - 1
ER  -