### Abstract

Within the field of phylogenetics there is great interest in distance measures to quantify the dissimilarity of two trees. Recently, a new distance measure has been proposed: the Maximum Parsimony (MP) distance. This is based on the difference of the parsimony scores of a single character on both trees under consideration, and the goal is to find the character which maximizes this difference. Here we show that computation of MP distance on two binary phylogenetic trees is NP-hard. This is a highly nontrivial extension of an earlier NP-hardness proof for two multifurcating phylogenetic trees, and it is particularly relevant given the prominence of binary trees in the phylogenetics literature. As a corollary to the main hardness result we show that computation of MP distance is also hard on binary trees if the number of states available is bounded. In fact, via a different reduction we show that it is hard even if only two states are available. Finally, as a first response to this hardness we give a simple Integer Linear Program (ILP) formulation which is capable of computing the MP distance exactly for small trees (and for larger trees when only a small number of character states are available) and which is used to computationally verify several auxiliary results required by the hardness proofs.
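The definition in the abstract can be made concrete with a small brute-force sketch. It is not the ILP formulation from the paper: it simply enumerates every character and scores each one with Fitch's algorithm, so it is only feasible for a handful of taxa. The nested-tuple tree encoding, the function names `fitch_score`, `leaves` and `mp_distance`, and the use of the absolute difference of the two scores are assumptions made for illustration.

```python
from itertools import product


def fitch_score(tree, character):
    """Parsimony score of `character` (dict: leaf label -> state) on `tree`.

    Assumed encoding: a leaf is a string label, an internal node is a
    pair (left, right) of subtrees.
    """
    def visit(node):
        if isinstance(node, str):
            return {character[node]}, 0
        left_states, left_cost = visit(node[0])
        right_states, right_cost = visit(node[1])
        common = left_states & right_states
        if common:
            # The children can agree on a state: no extra change needed.
            return common, left_cost + right_cost
        # Otherwise one change is unavoidable at this node.
        return left_states | right_states, left_cost + right_cost + 1

    return visit(tree)[1]


def leaves(tree):
    """List the leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def mp_distance(t1, t2, num_states=None):
    """Brute-force MP distance: maximize the score difference over characters.

    `num_states` bounds the number of available states; by default one state
    per taxon is allowed, which covers every possible character.
    """
    taxa = sorted(leaves(t1))
    if taxa != sorted(leaves(t2)):
        raise ValueError("both trees must have the same leaf set")
    k = len(taxa) if num_states is None else num_states
    best = 0
    for states in product(range(k), repeat=len(taxa)):
        character = dict(zip(taxa, states))
        diff = abs(fitch_score(t1, character) - fitch_score(t2, character))
        best = max(best, diff)
    return best


# Two different binary trees on the taxa a, b, c, d.
t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
print(mp_distance(t1, t2))                 # 1
print(mp_distance(t1, t2, num_states=2))   # 1
```

The loop visits k^n characters for n taxa and k states, which is why exact computation quickly becomes impractical and why the hardness results and the ILP formulation described above matter.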

Original language | English |
---|---|
Pages (from-to) | 573-604 |
Number of pages | 32 |
Journal | Annals of Combinatorics |
Volume | 21 |
Issue number | 4 |
DOIs | https://doi.org/10.1007/s00026-017-0361-1 |
Publication status | Published - 1 Dec 2017 |

### Keywords

- Maximum Parsimony
- phylogenetics
- tree metrics
- NP-hard
- binary trees
- COMPLETENESS
- EVOLUTION

### Cite this

Kelk, S., & Fischer, M. (2017). On the Complexity of Computing MP Distance Between Binary Phylogenetic Trees. *Annals of Combinatorics*, *21*(4), 573-604. https://doi.org/10.1007/s00026-017-0361-1

Research output: Contribution to journal › Article › Academic › peer-review

TY - JOUR

T1 - On the Complexity of Computing MP Distance Between Binary Phylogenetic Trees

AU - Kelk, Steven

AU - Fischer, Mareike

PY - 2017/12/1

Y1 - 2017/12/1

KW - Maximum Parsimony

KW - phylogenetics

KW - tree metrics

KW - NP-hard

KW - binary trees

KW - COMPLETENESS

KW - EVOLUTION

U2 - 10.1007/s00026-017-0361-1

DO - 10.1007/s00026-017-0361-1

M3 - Article

VL - 21

SP - 573

EP - 604

JO - Annals of Combinatorics

JF - Annals of Combinatorics

SN - 0218-0006

IS - 4

ER -