Maximum parsimony distance on phylogenetic trees: A linear kernel and constant factor approximation algorithm

Mark Jones; Steven Kelk; Leen Stougie

doi:10.1016/j.jcss.2020.10.003

Mark Jones^*, Steven Kelk, Leen Stougie

^*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Maximum parsimony distance is a measure used to quantify the dissimilarity of two unrooted phylogenetic trees. It is NP-hard to compute, and very few positive algorithmic results are known due to its complex combinatorial structure. Here we address this shortcoming by showing that the problem is fixed-parameter tractable. We do this by establishing a linear kernel, i.e., that after applying certain reduction rules the resulting instance has size bounded by a linear function of the distance. As powerful corollaries to this result we prove that the problem permits a polynomial-time constant-factor approximation algorithm; that the treewidth of a natural auxiliary graph structure encountered in phylogenetics is bounded by a function of the distance; and that the distance is within a constant factor of the size of a maximum agreement forest of the two trees, a well-studied object in phylogenetics. (C) 2020 The Author(s). Published by Elsevier Inc.

Original language	English
Pages (from-to)	165-181
Number of pages	17
Journal	Journal of Computer and System Sciences
Volume	117
DOIs	https://doi.org/10.1016/j.jcss.2020.10.003
Publication status	Published - May 2021

Keywords

Phylogenetics
Maximum parsimony
Fixed parameter tractability
Maximum agreement forest
AGREEMENT FOREST
COMPATIBILITY
COMPLEXITY

Access to Document

10.1016/j.jcss.2020.10.003 (Licence: CC BY)

Cite this

@article{d9294f358b1742cea4db36cabf9c7d63,

title = "Maximum parsimony distance on phylogenetic trees: A linear kernel and constant factor approximation algorithm",

abstract = "Maximum parsimony distance is a measure used to quantify the dissimilarity of two unrooted phylogenetic trees. It is NP-hard to compute, and very few positive algorithmic results are known due to its complex combinatorial structure. Here we address this shortcoming by showing that the problem is fixed parameter tractable. We do this by establishing a linear kernel i.e., that after applying certain reduction rules the resulting instance has size that is bounded by a linear function of the distance. As powerful corollaries to this result we prove that the problem permits a polynomial-time constant factor approximation algorithm; that the treewidth of a natural auxiliary graph structure encountered in phylogenetics is bounded by a function of the distance; and that the distance is within a constant factor of the size of a maximum agreement forest of the two trees, a well studied object in phylogenetics. (C) 2020 The Author(s). Published by Elsevier Inc.",

keywords = "Phylogenetics, Maximum parsimony, Fixed parameter tractability, Maximum agreement forest, AGREEMENT FOREST, COMPATIBILITY, COMPLEXITY",

author = "Mark Jones and Steven Kelk and Leen Stougie",

year = "2021",

month = may,

doi = "10.1016/j.jcss.2020.10.003",

language = "English",

volume = "117",

pages = "165--181",

journal = "Journal of Computer and System Sciences",

issn = "0022-0000",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Maximum parsimony distance on phylogenetic trees

T2 - A linear kernel and constant factor approximation algorithm

AU - Jones, Mark

AU - Kelk, Steven

AU - Stougie, Leen

PY - 2021/5

Y1 - 2021/5

N2 - Maximum parsimony distance is a measure used to quantify the dissimilarity of two unrooted phylogenetic trees. It is NP-hard to compute, and very few positive algorithmic results are known due to its complex combinatorial structure. Here we address this shortcoming by showing that the problem is fixed parameter tractable. We do this by establishing a linear kernel i.e., that after applying certain reduction rules the resulting instance has size that is bounded by a linear function of the distance. As powerful corollaries to this result we prove that the problem permits a polynomial-time constant factor approximation algorithm; that the treewidth of a natural auxiliary graph structure encountered in phylogenetics is bounded by a function of the distance; and that the distance is within a constant factor of the size of a maximum agreement forest of the two trees, a well studied object in phylogenetics. (C) 2020 The Author(s). Published by Elsevier Inc.

AB - Maximum parsimony distance is a measure used to quantify the dissimilarity of two unrooted phylogenetic trees. It is NP-hard to compute, and very few positive algorithmic results are known due to its complex combinatorial structure. Here we address this shortcoming by showing that the problem is fixed parameter tractable. We do this by establishing a linear kernel i.e., that after applying certain reduction rules the resulting instance has size that is bounded by a linear function of the distance. As powerful corollaries to this result we prove that the problem permits a polynomial-time constant factor approximation algorithm; that the treewidth of a natural auxiliary graph structure encountered in phylogenetics is bounded by a function of the distance; and that the distance is within a constant factor of the size of a maximum agreement forest of the two trees, a well studied object in phylogenetics. (C) 2020 The Author(s). Published by Elsevier Inc.

KW - Phylogenetics

KW - Maximum parsimony

KW - Fixed parameter tractability

KW - Maximum agreement forest

KW - AGREEMENT FOREST

KW - COMPATIBILITY

KW - COMPLEXITY

U2 - 10.1016/j.jcss.2020.10.003

DO - 10.1016/j.jcss.2020.10.003

M3 - Article

SN - 0022-0000

VL - 117

SP - 165

EP - 181

JO - Journal of Computer and System Sciences

JF - Journal of Computer and System Sciences

ER -