Computational problems in perfect phylogeny haplotyping: Typing without calling the allele

被引:5
作者
Barzuza, Tamar [1 ]
Beckmann, Jacques S. [2 ]
Shamir, Ron [3 ]
Pe'er, Itsik [4 ]
机构
[1] Weizmann Inst Sci, Dept Comp Sci & Appl Math, Rehovot, Israel
[2] Lausanne CHUV, Dept Med Genet, CH-1011 Lausanne, Switzerland
[3] Tel Aviv Univ, Sch Comp Sci, IL-69978 Tel Aviv, Israel
[4] Columbia Univ, Fu Fdn Sch Engn & Appl Sci, Dept Comp Sci, New York, NY 10027 USA
关键词
XOR-genotypes; haplotypes; perfect phylogeny; graph realization;
D O I
10.1109/TCBB.2007.1063
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
A haplotype is an m-long binary vector. The XOR-genotype of two haplotypes is the m-vector of their coordinate-wise XOR. We study the following problem: Given a set of XOR-genotypes, reconstruct their haplotypes so that the set of resulting haplotypes can be mapped onto a perfect phylogeny (PP) tree. The question is motivated by studying population evolution in human genetics and is a variant of the PP haplotyping problem that has received intensive attention recently. Unlike the latter problem, in which the input is '' full '' genotypes, here, we assume less informative input and so may be more economical to obtain experimentally. Building on ideas of Gusfield, we show how to solve the problem in polynomial time by a reduction to the graph realization problem. The actual haplotypes are not uniquely determined by the tree they map onto and the tree itself may or may not be unique. We show that tree uniqueness implies uniquely determined haplotypes, up to inherent degrees of freedom, and give a sufficient condition for the uniqueness. To actually determine the haplotypes given the tree, additional information is necessary. We show that two or three full genotypes suffice to reconstruct all the haplotypes and present a linear algorithm for identifying those genotypes.
引用
收藏
页码:101 / 109
页数:9
相关论文
共 26 条
[1]  
ARKIN EM, 1992, UNPUB MULTIPLE CHOIC
[2]  
BAFNA V, 2002, CSE200221 UC DAV
[3]   Typing without calling the allele: a strategy for inferring SNP haplotypes [J].
Barzuza, T ;
Beckmann, JS ;
Shamir, R ;
Pe'er, I .
EUROPEAN JOURNAL OF HUMAN GENETICS, 2005, 13 (08) :898-901
[4]  
Barzuza T, 2004, LECT NOTES COMPUT SC, V3109, P14
[5]   AN ALMOST LINEAR-TIME ALGORITHM FOR GRAPH REALIZATION [J].
BIXBY, RE ;
WAGNER, DK .
MATHEMATICS OF OPERATIONS RESEARCH, 1988, 13 (01) :99-123
[6]  
CLARK AG, 1990, MOL BIOL EVOL, V7, P111
[7]   High-resolution haplotype structure in the human genome [J].
Daly, MJ ;
Rioux, JD ;
Schaffner, SE ;
Hudson, TJ ;
Lander, ES .
NATURE GENETICS, 2001, 29 (02) :229-232
[8]  
DING Z, 2005, P 9 ANN INT C RES CO
[9]  
EASTBROOK G, 1975, MATH BIOSCI, V23, P263
[10]  
Eskin Eleazar, 2003, J Bioinform Comput Biol, V1, P1, DOI 10.1142/S0219720003000174