Convex Characters, Algorithms, and Matchings

Steven Kelk; Ruben Meuwese; Stephan Wagner

doi:10.1137/21M1463999

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Phylogenetic trees are used to model evolution: leaves are labeled to represent contemporary species (“taxa”), and interior vertices represent extinct ancestors. Informally, convex characters are measurements on the contemporary species in which the subset of species (both contemporary and extinct) that share a given state form a connected subtree. Kelk and Stamoulis [Adv. Appl. Math., 84 (2017), pp. 34–46] showed how to efficiently count, list, and sample certain restricted subfamilies of convex characters, and algorithmic applications were given. We continue this work in a number of directions. First, we show how combining the enumeration of convex characters with existing parameterized algorithms can be used to speed up exponential-time algorithms for the maximum agreement forest problem in phylogenetics. Second, we revisit the quantity $g_2(T)$, defined as the number of convex characters on $T$ in which each state appears on at least 2 taxa. We use this to give an algorithm with running time $O(\phi^n \cdot \mathrm{poly}(n))$, where $\phi \approx 1.6181$ is the golden ratio and $n$ is the number of taxa in the input trees, for computation of maximum parsimony distance on two state characters. By further restricting the characters counted by $g_2(T)$ we open an interesting bridge to the literature on enumeration of matchings. By crossing this bridge we improve the running time of the aforementioned parsimony distance algorithm to $O(1.5895^n \cdot \mathrm{poly}(n))$ and obtain a number of new results in themselves relevant to enumeration of matchings on at most binary trees.
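To make the definition of $g_2(T)$ concrete, the following is a minimal brute-force sketch, not the algorithm from the article. It assumes the usual partition view of a character: each state is a block of taxa, and the character is convex when the minimal subtrees spanning the blocks are pairwise vertex-disjoint. The function names, the adjacency-dictionary encoding, and the example quartet tree are all illustrative assumptions, and the enumeration is exponential, so it is only suitable for very small trees.

```python
from itertools import combinations


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms a new block on its own ...
        yield [[first]] + partition
        # ... or it joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def path(adj, src, dst):
    """Return the vertices on the unique src-dst path of a tree."""
    stack = [(src, [src])]
    seen = {src}
    while stack:
        vertex, so_far = stack.pop()
        if vertex == dst:
            return so_far
        for neighbour in adj[vertex]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append((neighbour, so_far + [neighbour]))
    raise ValueError("vertices are not connected")


def spanning_subtree(adj, block):
    """Vertex set of the minimal subtree connecting the taxa in `block`."""
    vertices = {block[0]}
    for taxon in block[1:]:
        vertices.update(path(adj, block[0], taxon))
    return vertices


def is_convex(adj, partition):
    """A character is convex if its blocks span vertex-disjoint subtrees."""
    subtrees = [spanning_subtree(adj, block) for block in partition]
    return all(s.isdisjoint(t) for s, t in combinations(subtrees, 2))


def count_convex(adj, taxa, min_block=1):
    """Count convex characters whose states each appear on >= min_block taxa."""
    return sum(
        1
        for partition in set_partitions(taxa)
        if all(len(block) >= min_block for block in partition)
        and is_convex(adj, partition)
    )


# Hypothetical example: the quartet tree ab|cd with internal vertices u and v.
QUARTET = {
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["v"],
    "u": ["a", "b", "v"], "v": ["c", "d", "u"],
}
TAXA = ["a", "b", "c", "d"]

print(count_convex(QUARTET, TAXA, min_block=2))  # g_2 of the quartet: 2
print(count_convex(QUARTET, TAXA))               # all convex characters: 13
```

On the quartet, the two characters counted with `min_block=2` are the single-state character on all four taxa and the split ab|cd; the splits ac|bd and ad|bc are rejected because their spanning subtrees share the central edge.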

Original language	English
Pages (from-to)	380-411
Number of pages	32
Journal	SIAM Journal on Discrete Mathematics
Volume	38
Issue number	1
DOIs	https://doi.org/10.1137/21M1463999
Publication status	Published - 2024

Embargoed Document

Full Text
Final published version, 671 KB
Licence: Taverne
Embargo ends: 1/07/24

Cite this

@article{0bd157fac7c64f0f9b474d5eb946abf9,

title = "Convex Characters, Algorithms, and Matchings",

abstract = "Phylogenetic trees are used to model evolution: leaves are labeled to represent contemporary species (“taxa”), and interior vertices represent extinct ancestors. Informally, convex characters are measurements on the contemporary species in which the subset of species (both contemporary and extinct) that share a given state form a connected subtree. Kelk and Stamoulis [Adv. Appl. Math., 84 (2017), pp. 34–46] showed how to efficiently count, list, and sample certain restricted subfamilies of convex characters, and algorithmic applications were given. We continue this work in a number of directions. First, we show how combining the enumeration of convex characters with existing parameterized algorithms can be used to speed up exponential-time algorithms for the maximum agreement forest problem in phylogenetics. Second, we revisit the quantity $g_2(T)$, defined as the number of convex characters on $T$ in which each state appears on at least 2 taxa. We use this to give an algorithm with running time $O(\phi^n \cdot \mathrm{poly}(n))$, where $\phi \approx 1.6181$ is the golden ratio and $n$ is the number of taxa in the input trees, for computation of maximum parsimony distance on two state characters. By further restricting the characters counted by $g_2(T)$ we open an interesting bridge to the literature on enumeration of matchings. By crossing this bridge we improve the running time of the aforementioned parsimony distance algorithm to $O(1.5895^n \cdot \mathrm{poly}(n))$ and obtain a number of new results in themselves relevant to enumeration of matchings on at most binary trees.",

author = "Steven Kelk and Ruben Meuwese and Stephan Wagner",

year = "2024",

doi = "10.1137/21M1463999",

language = "English",

volume = "38",

pages = "380--411",

journal = "{SIAM} Journal on Discrete Mathematics",

issn = "0895-4801",

publisher = "SIAM Publications",

number = "1",

}

TY - JOUR

T1 - Convex Characters, Algorithms, and Matchings

AU - Kelk, Steven

AU - Meuwese, Ruben

AU - Wagner, Stephan

PY - 2024

Y1 - 2024

N2 - Phylogenetic trees are used to model evolution: leaves are labeled to represent contemporary species (“taxa”), and interior vertices represent extinct ancestors. Informally, convex characters are measurements on the contemporary species in which the subset of species (both contemporary and extinct) that share a given state form a connected subtree. Kelk and Stamoulis [Adv. Appl. Math., 84 (2017), pp. 34–46] showed how to efficiently count, list, and sample certain restricted subfamilies of convex characters, and algorithmic applications were given. We continue this work in a number of directions. First, we show how combining the enumeration of convex characters with existing parameterized algorithms can be used to speed up exponential-time algorithms for the maximum agreement forest problem in phylogenetics. Second, we revisit the quantity $g_2(T)$, defined as the number of convex characters on $T$ in which each state appears on at least 2 taxa. We use this to give an algorithm with running time $O(\phi^n \cdot \mathrm{poly}(n))$, where $\phi \approx 1.6181$ is the golden ratio and $n$ is the number of taxa in the input trees, for computation of maximum parsimony distance on two state characters. By further restricting the characters counted by $g_2(T)$ we open an interesting bridge to the literature on enumeration of matchings. By crossing this bridge we improve the running time of the aforementioned parsimony distance algorithm to $O(1.5895^n \cdot \mathrm{poly}(n))$ and obtain a number of new results in themselves relevant to enumeration of matchings on at most binary trees.

AB - Phylogenetic trees are used to model evolution: leaves are labeled to represent contemporary species (“taxa”), and interior vertices represent extinct ancestors. Informally, convex characters are measurements on the contemporary species in which the subset of species (both contemporary and extinct) that share a given state form a connected subtree. Kelk and Stamoulis [Adv. Appl. Math., 84 (2017), pp. 34–46] showed how to efficiently count, list, and sample certain restricted subfamilies of convex characters, and algorithmic applications were given. We continue this work in a number of directions. First, we show how combining the enumeration of convex characters with existing parameterized algorithms can be used to speed up exponential-time algorithms for the maximum agreement forest problem in phylogenetics. Second, we revisit the quantity $g_2(T)$, defined as the number of convex characters on $T$ in which each state appears on at least 2 taxa. We use this to give an algorithm with running time $O(\phi^n \cdot \mathrm{poly}(n))$, where $\phi \approx 1.6181$ is the golden ratio and $n$ is the number of taxa in the input trees, for computation of maximum parsimony distance on two state characters. By further restricting the characters counted by $g_2(T)$ we open an interesting bridge to the literature on enumeration of matchings. By crossing this bridge we improve the running time of the aforementioned parsimony distance algorithm to $O(1.5895^n \cdot \mathrm{poly}(n))$ and obtain a number of new results in themselves relevant to enumeration of matchings on at most binary trees.

U2 - 10.1137/21M1463999

DO - 10.1137/21M1463999

M3 - Article

SN - 0895-4801

VL - 38

SP - 380

EP - 411

JO - SIAM Journal on Discrete Mathematics

JF - SIAM Journal on Discrete Mathematics

IS - 1

ER -