Tractable cases of (*,2)-bounded parsimony haplotyping

J. Keijsper; T.S. Oosterwijk

doi:10.1109/TCBB.2014.2352031

Corresponding author: J. Keijsper

Quantitative Economics

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

Parsimony haplotyping is the problem of finding a set of haplotypes of minimum cardinality that explains a given set of genotypes, where a genotype is explained by two haplotypes if it can be obtained as a combination of the two. This problem is NP-complete in the general case, but polynomially solvable for (k, l)-bounded instances for certain k and l. Here, k denotes the maximum number of ambiguous sites in any genotype, and l is the maximum number of genotypes that are ambiguous at the same site. Only the complexity of the (*, 2)-bounded problem is still unknown, where * denotes no restriction. It has been proved that (*, 2)-bounded instances have compatibility graphs that can be constructed from cliques and circuits by pasting along an edge. In this paper, we give a constructive proof of the fact that (*, 2)-bounded instances are polynomially solvable if the compatibility graph is constructed by pasting cliques, trees and circuits along a bounded number of edges. We obtain this proof by solving a slightly generalized problem on circuits, trees and cliques respectively, and arguing that all possible combinations of optimal solutions for these graphs that are pasted along a bounded number of edges can be enumerated efficiently.
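The definitions in the abstract can be made concrete with a minimal sketch. It assumes a common encoding that is not stated in this record: haplotypes are strings over `0`/`1`, genotypes are strings over `0`/`1`/`2`, and `2` marks an ambiguous site where the two explaining haplotypes differ. The function names `explains`, `bounds` and `explained_by_set` are hypothetical and illustrate only the problem statement, not the algorithm of the paper.

```python
from itertools import combinations_with_replacement


def explains(h1, h2, g):
    """True if haplotypes h1 and h2 combine to genotype g (assumed 0/1/2 encoding)."""
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, c in zip(h1, h2, g):
        # Equal alleles give that allele; different alleles give an ambiguous site '2'.
        expected = a if a == b else "2"
        if c != expected:
            return False
    return True


def bounds(genotypes):
    """Return (k, l) for an instance given as equal-length genotype strings.

    k: maximum number of ambiguous sites in any genotype.
    l: maximum number of genotypes that are ambiguous at the same site.
    """
    k = max(g.count("2") for g in genotypes)
    sites = range(len(genotypes[0]))
    l = max(sum(g[i] == "2" for g in genotypes) for i in sites)
    return k, l


def explained_by_set(haplotypes, genotypes):
    """True if every genotype is explained by some pair (possibly equal) from haplotypes."""
    pairs = list(combinations_with_replacement(haplotypes, 2))
    return all(any(explains(h1, h2, g) for h1, h2 in pairs) for g in genotypes)


# Hypothetical toy instance, chosen only for illustration.
genotypes = ["0212", "2210", "0010"]
haplotypes = ["0010", "1110", "0111"]

print(bounds(genotypes))
print(explained_by_set(haplotypes, genotypes))
```

In this toy instance every site is ambiguous in at most two genotypes, so it falls in the (*, 2)-bounded class discussed above. The sketch only checks a candidate haplotype set; it does not search for one of minimum cardinality.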

Original language	English
Pages (from-to)	234-247
Journal	IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume	12
Issue number	1
DOIs	https://doi.org/10.1109/TCBB.2014.2352031
Publication status	Published - 1 Jan 2015

Access to Document

10.1109/TCBB.2014.2352031

Full text: Final published version, 267 KB. Licence: Taverne

Cite this

@article{9c1c0747ce354f4faa5620a4f91e9a92,

title = "Tractable cases of (*,2)-bounded parsimony haplotyping",

abstract = "Parsimony haplotyping is the problem of finding a set of haplotypes of minimum cardinality that explains a given set of genotypes, where a genotype is explained by two haplotypes if it can be obtained as a combination of the two. This problem is NP-complete in the general case, but polynomially solvable for (k, l)-bounded instances for certain k and l. Here, k denotes the maximum number of ambiguous sites in any genotype, and l is the maximum number of genotypes that are ambiguous at the same site. Only the complexity of the (*, 2)-bounded problem is still unknown, where * denotes no restriction. It has been proved that (*, 2)-bounded instances have compatibility graphs that can be constructed from cliques and circuits by pasting along an edge. In this paper, we give a constructive proof of the fact that (*, 2)-bounded instances are polynomially solvable if the compatibility graph is constructed by pasting cliques, trees and circuits along a bounded number of edges. We obtain this proof by solving a slightly generalized problem on circuits, trees and cliques respectively, and arguing that all possible combinations of optimal solutions for these graphs that are pasted along a bounded number of edges can be enumerated efficiently.",

author = "J. Keijsper and T.S. Oosterwijk",

year = "2015",

month = jan,

day = "1",

doi = "10.1109/TCBB.2014.2352031",

language = "English",

volume = "12",

pages = "234--247",

journal = "IEEE/ACM Transactions on Computational Biology and Bioinformatics",

issn = "1545-5963",

publisher = "IEEE",

number = "1",

}

TY - JOUR

T1 - Tractable cases of (*,2)-bounded parsimony haplotyping

AU - Keijsper, J.

AU - Oosterwijk, T.S.

PY - 2015/1/1

Y1 - 2015/1/1

N2 - Parsimony haplotyping is the problem of finding a set of haplotypes of minimum cardinality that explains a given set of genotypes, where a genotype is explained by two haplotypes if it can be obtained as a combination of the two. This problem is NP-complete in the general case, but polynomially solvable for (k, l)-bounded instances for certain k and l. Here, k denotes the maximum number of ambiguous sites in any genotype, and l is the maximum number of genotypes that are ambiguous at the same site. Only the complexity of the (*, 2)-bounded problem is still unknown, where * denotes no restriction. It has been proved that (*, 2)-bounded instances have compatibility graphs that can be constructed from cliques and circuits by pasting along an edge. In this paper, we give a constructive proof of the fact that (*, 2)-bounded instances are polynomially solvable if the compatibility graph is constructed by pasting cliques, trees and circuits along a bounded number of edges. We obtain this proof by solving a slightly generalized problem on circuits, trees and cliques respectively, and arguing that all possible combinations of optimal solutions for these graphs that are pasted along a bounded number of edges can be enumerated efficiently.

AB - Parsimony haplotyping is the problem of finding a set of haplotypes of minimum cardinality that explains a given set of genotypes, where a genotype is explained by two haplotypes if it can be obtained as a combination of the two. This problem is NP-complete in the general case, but polynomially solvable for (k, l)-bounded instances for certain k and l. Here, k denotes the maximum number of ambiguous sites in any genotype, and l is the maximum number of genotypes that are ambiguous at the same site. Only the complexity of the (*, 2)-bounded problem is still unknown, where * denotes no restriction. It has been proved that (*, 2)-bounded instances have compatibility graphs that can be constructed from cliques and circuits by pasting along an edge. In this paper, we give a constructive proof of the fact that (*, 2)-bounded instances are polynomially solvable if the compatibility graph is constructed by pasting cliques, trees and circuits along a bounded number of edges. We obtain this proof by solving a slightly generalized problem on circuits, trees and cliques respectively, and arguing that all possible combinations of optimal solutions for these graphs that are pasted along a bounded number of edges can be enumerated efficiently.

U2 - 10.1109/TCBB.2014.2352031

DO - 10.1109/TCBB.2014.2352031

M3 - Article

C2 - 26357092

SN - 1545-5963

VL - 12

SP - 234

EP - 247

JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics

JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics

IS - 1

ER -