Integer programming approaches to haplotype inference by pure parsimony

被引:46
作者
Brown, DG [1 ]
Harrower, IM [1 ]
机构
[1] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
基金
加拿大自然科学与工程研究理事会;
关键词
computations on discrete structures; integer programming; biology and genetics; haplotype inference;
D O I
10.1109/TCBB.2006.24
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
In 2003, Gusfield introduced the Haplotype Inference by Pure Parsimony (HIPP) problem and presented an integer program (IP) that quickly solved many simulated instances of the problem [1]. Although it solved well on small instances, Gusfield's IP can be of exponential size in the worst case. Several authors [2], [3] have presented polynomial-sized IPs for the problem. In this paper, we further the work on IP approaches to HIPP. We extend the existing polynomial-sized IPs by introducing several classes of valid cuts for the IP. We also present a new polynomial-sized IP formulation that is a hybrid between two existing IP formulations and inherits many of the strengths of both. Many problems that are too complex for the exponential-sized formulations can still be solved in our new formulation in a reasonable amount of time. We provide a detailed empirical comparison of these IP formulations on both simulated and real genotype sequences. Our formulation can also be extended in a variety of ways to allow errors in the input or model the structure of the population under consideration.
引用
收藏
页码:141 / 154
页数:14
相关论文
共 27 条
  • [1] [Anonymous], BRANCH CUT ALGORITHM
  • [2] A note on efficient computation of haplotypes via perfect phylogeny
    Bafna, V
    Gusfield, D
    Hannenhalli, S
    Yooseph, S
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2004, 11 (05) : 858 - 866
  • [3] Haplotyping as perfect phylogeny: A direct approach
    Bafna, V
    Gusfield, D
    Lancia, G
    Yooseph, S
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2003, 10 (3-4) : 323 - 340
  • [4] Barzuza T, 2004, LECT NOTES COMPUT SC, V3109, P14
  • [5] FINDING ALL CLIQUES OF AN UNDIRECTED GRAPH [H]
    BRON, C
    KERBOSCH, J
    [J]. COMMUNICATIONS OF THE ACM, 1973, 16 (09) : 575 - 577
  • [6] Brown DG, 2004, LECT NOTES COMPUT SC, V3240, P254
  • [7] OPTIMALITY AND DEGENERACY IN LINEAR PROGRAMMING
    Charnes, A.
    [J]. ECONOMETRICA, 1952, 20 (02) : 160 - 170
  • [8] CLARK AG, 1990, MOL BIOL EVOL, V7, P111
  • [9] Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase
    Clark, AG
    Weiss, KM
    Nickerson, DA
    Taylor, SL
    Buchanan, A
    Stengård, J
    Salomaa, V
    Vartiainen, E
    Perola, M
    Boerwinkle, E
    Sing, CF
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 1998, 63 (02) : 595 - 612
  • [10] High-resolution haplotype structure in the human genome
    Daly, MJ
    Rioux, JD
    Schaffner, SE
    Hudson, TJ
    Lander, ES
    [J]. NATURE GENETICS, 2001, 29 (02) : 229 - 232