Finding haplotype block boundaries by using the minimum-description-length principle

Cited by: 63
Authors
Anderson, EC [1]
Novembre, J [1]
Affiliations
[1] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
DOI
10.1086/377106
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
We present a method for detecting haplotype blocks that simultaneously uses information about linkage-disequilibrium decay between the blocks and the diversity of haplotypes within the blocks. By use of phased single-nucleotide polymorphism data, our method partitions a chromosome into a series of adjacent, nonoverlapping blocks. The partition is made by choosing among a family of Markov models for block structure in a chromosomal region. Specifically, in the model, the occurrence of haplotypes within blocks follows a time-inhomogeneous Markov process along the chromosome, and we choose among possible partitions by using the two-stage minimum-description-length criterion. When applied to data simulated from the coalescent with recombination hotspots, our method reliably situates block boundaries at the hotspots and infrequently places block boundaries at sites with background levels of recombination. We apply three previously published block-finding methods to the same data, showing that they either are relatively insensitive to recombination hotspots or fail to discriminate between background sites of recombination and hotspots. When applied to the 5q31 data of Daly et al., our method identifies more block boundaries in agreement with those found by Daly et al. than do other methods. These results suggest that our method may be useful for designing association-based mapping studies that exploit haplotype blocks.
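The idea of choosing a block partition by description length can be illustrated with a minimal sketch. This is not the authors' model: it treats blocks as independent (no Markov dependence between adjacent blocks), assumes biallelic SNPs coded as `0`/`1` strings, and uses a simple two-stage code (model bits plus data bits). The function names `block_cost` and `mdl_partition` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def block_cost(haplotypes, start, end):
    """Two-stage description length (in bits) of SNP columns [start, end).

    Simplifying assumption: each block is coded independently with a
    multinomial over its distinct haplotypes (not the paper's Markov model).
    """
    n = len(haplotypes)
    counts = Counter(h[start:end] for h in haplotypes)
    k = len(counts)
    # Stage 1: describe the model. One bit per SNP for each of the k distinct
    # haplotypes, plus k - 1 free frequencies at (1/2) * log2(n) bits each.
    model_bits = k * (end - start) + 0.5 * (k - 1) * log2(n)
    # Stage 2: describe the data under the maximum-likelihood frequencies.
    data_bits = -sum(c * log2(c / n) for c in counts.values())
    return model_bits + data_bits


def mdl_partition(haplotypes):
    """Return (internal block boundaries, total bits) minimizing the cost.

    Dynamic programming over all partitions into adjacent,
    nonoverlapping blocks; a boundary b means a new block starts at SNP b.
    """
    n_snps = len(haplotypes[0])
    best = [0.0] + [float("inf")] * n_snps   # best[j]: cheapest coding of SNPs [0, j)
    prev = [0] * (n_snps + 1)                # prev[j]: start of the last block in that coding
    for end in range(1, n_snps + 1):
        for start in range(end):
            total = best[start] + block_cost(haplotypes, start, end)
            if total < best[end]:
                best[end], prev[end] = total, start
    # Walk back through the stored block starts to recover the boundaries.
    boundaries = []
    pos = n_snps
    while prev[pos] > 0:
        pos = prev[pos]
        boundaries.append(pos)
    return sorted(boundaries), best[n_snps]
```

On toy data in which SNPs 0-2 and SNPs 3-5 each carry two haplotypes that combine freely, the sketch places a single boundary between the two groups; on data with only two complementary haplotypes across all SNPs, it places none. The quadratic dynamic program is fine for short regions but would need a maximum block length to scale.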
Pages: 336-354
Number of pages: 19
Related papers
30 in total
  • [1] Cover T. M., 2005, ELEM INF THEORY, DOI 10.1002/047174882X
  • [2] High-resolution patterns of meiotic recombination across the human major histocompatibility complex
    Cullen, M
    Perfetto, SP
    Klitz, W
    Nelson, G
    Carrington, M
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2002, 71 (04) : 759 - 776
  • [3] High-resolution haplotype structure in the human genome
    Daly, MJ
    Rioux, JD
    Schaffner, SF
    Hudson, TJ
    Lander, ES
    [J]. NATURE GENETICS, 2001, 29 (02) : 229 - 232
  • [4] The structure of haplotype blocks in the human genome
    Gabriel, SB
    Schaffner, SF
    Nguyen, H
    Moore, JM
    Roy, J
    Blumenstiel, B
    Higgins, J
    DeFelice, M
    Lochner, A
    Faggart, M
    Liu-Cordero, SN
    Rotimi, C
    Adeyemo, A
    Cooper, R
    Ward, R
    Lander, ES
    Daly, MJ
    Altshuler, D
    [J]. SCIENCE, 2002, 296 (5576) : 2225 - 2229
  • [5] Islands of linkage disequilibrium
    Goldstein, DB
    [J]. NATURE GENETICS, 2001, 29 (02) : 109 - 111
  • [6] GREENSPAN G, 2003, 7 ANN INT C RES COMP
  • [7] Model selection and the principle of minimum description length
    Hansen, MH
    Yu, B
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (454) : 746 - 774
  • [8] Generating samples under a Wright-Fisher neutral model of genetic variation
    Hudson, RR
    [J]. BIOINFORMATICS, 2002, 18 (02) : 337 - 338
  • [9] HUDSON RR, 1985, GENETICS, V111, P147
  • [10] Mitochondrial genome variation and the origin of modern humans
    Ingman, M
    Kaessmann, H
    Pääbo, S
    Gyllensten, U
    [J]. NATURE, 2000, 408 (6813) : 708 - 713