Genome-Wide Inference of Ancestral Recombination Graphs

Cited by: 187
Authors
Rasmussen, Matthew D. [1]
Hubisz, Melissa J. [1]
Gronau, Ilan [1]
Siepel, Adam [1, 2]
Affiliations
[1] Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY USA
[2] European Bioinformat Inst, European Mol Biol Lab, Hinxton, Cambs, England
Source
PLOS GENETICS | 2014 / Volume 10 / Issue 05
Keywords
CONDITIONAL SAMPLING DISTRIBUTION; MAXIMUM-LIKELIHOOD; LINKAGE-DISEQUILIBRIUM; DELETERIOUS MUTATIONS; POPULATION-GENETICS; NATURAL-SELECTION; EVOLUTIONARY TREES; SEQUENCES SUBJECT; DNA-SEQUENCES; GENOTYPE DATA;
DOI
10.1371/journal.pgen.1004342
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
The complex correlation structure of a collection of orthologous DNA sequences is uniquely captured by the "ancestral recombination graph" (ARG), a complete record of coalescence and recombination events in the history of the sample. However, existing methods for ARG inference are computationally intensive, highly approximate, or limited to small numbers of sequences, and, as a consequence, explicit ARG inference is rarely used in applied population genomics. Here, we introduce a new algorithm for ARG inference that is efficient enough to apply to dozens of complete mammalian genomes. The key idea of our approach is to sample an ARG of n chromosomes conditional on an ARG of n-1 chromosomes, an operation we call "threading." Using techniques based on hidden Markov models, we can perform this threading operation exactly, up to the assumptions of the sequentially Markov coalescent and a discretization of time. An extension allows for threading of subtrees instead of individual sequences. Repeated application of these threading operations results in highly efficient Markov chain Monte Carlo samplers for ARGs. We have implemented these methods in a computer program called ARGweaver. Experiments with simulated data indicate that ARGweaver converges rapidly to the posterior distribution over ARGs and is effective in recovering various features of the ARG for dozens of sequences generated under realistic parameters for human populations. In applications of ARGweaver to 54 human genome sequences from Complete Genomics, we find clear signatures of natural selection, including regions of unusually ancient ancestry associated with balancing selection and reductions in allele age in sites under directional selection. The patterns we observe near protein-coding genes are consistent with a primary influence from background selection rather than hitchhiking, although we cannot rule out a contribution from recurrent selective sweeps.
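The abstract describes threading as sampling from a hidden Markov model. As a rough illustration of that general technique only, the sketch below runs the forward algorithm and then samples a hidden-state path by stochastic traceback. It is not ARGweaver's implementation: the two-state model, the function names `forward` and `sample_path`, and all parameter values are hypothetical. In the paper's setting the hidden states would correspond to discretized coalescence points, which this toy model does not represent.

```python
import random

def forward(init, trans, emit, obs):
    """Forward probabilities, normalized per column to avoid underflow."""
    n_states = len(init)
    col = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    total = sum(col)
    cols = [[v / total for v in col]]
    for o in obs[1:]:
        prev = cols[-1]
        col = [emit[s][o] * sum(prev[r] * trans[r][s] for r in range(n_states))
               for s in range(n_states)]
        total = sum(col)
        cols.append([v / total for v in col])
    return cols

def sample_path(init, trans, emit, obs, rng):
    """Draw one hidden path from the posterior given all observations."""
    cols = forward(init, trans, emit, obs)
    states = list(range(len(init)))
    # Sample the last state from the final forward column.
    path = [rng.choices(states, weights=cols[-1])[0]]
    # Walk backward: P(x_i | x_{i+1}, obs) is proportional to f_i(x_i) * trans[x_i][x_{i+1}].
    for i in range(len(obs) - 2, -1, -1):
        nxt = path[-1]
        weights = [cols[i][r] * trans[r][nxt] for r in states]
        path.append(rng.choices(states, weights=weights)[0])
    path.reverse()
    return path

# Hypothetical toy model: two hidden states, binary observations.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.3, 0.7]]
obs = [0, 0, 1, 1, 0]
print(sample_path(init, trans, emit, obs, random.Random(0)))
```

Calling `sample_path` repeatedly yields independent draws from the posterior over paths, which is the property a Gibbs-style sampler built from repeated conditional sampling steps relies on.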
Pages: 27
Related papers
50 in total
  • [1] A general and efficient representation of ancestral recombination graphs
    Wong, Yan
    Ignatieva, Anastasia
    Koskela, Jere
    Gorjanc, Gregor
    Wohns, Anthony W.
    Kelleher, Jerome
    GENETICS, 2024, 228 (01)
  • [2] Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits
    Zhang, Brian C.
    Biddanda, Arjun
    Gunnarsson, Arni Freyr
    Cooper, Fergus
    Palamara, Pier Francesco
    NATURE GENETICS, 2023, 55 (05) : 768 - +
  • [3] Genome-wide variation in recombination rate in Eucalyptus
    Gion, Jean-Marc
    Hudson, Corey J.
    Lesur, Isabelle
    Vaillancourt, Rene E.
    Potts, Brad M.
    Freeman, Jules S.
    BMC GENOMICS, 2016, 17
  • [4] Evolution and Plasticity of Genome-Wide Meiotic Recombination Rates
    Henderson, Ian R.
    Bomblies, Kirsten
    ANNUAL REVIEW OF GENETICS, VOL 55, 2021, 55 : 23 - 43
  • [5] KwARG: parsimonious reconstruction of ancestral recombination graphs with recurrent mutation
    Ignatieva, Anastasia
    Lyngso, Rune B.
    Jenkins, Paul A.
    Hein, Jotun
    BIOINFORMATICS, 2021, 37 (19) : 3277 - 3284
  • [6] Building minimum recombination ancestral recombination graphs for whole genomes
    Nguyen Thi Phuong Thao
    Le Sy Vinh
    2017 4TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS), 2017, : 248 - 253
  • [7] Assessing Differences Between Ancestral Recombination Graphs
    Kuhner, Mary K.
    Yamato, Jon
    JOURNAL OF MOLECULAR EVOLUTION, 2015, 80 (5-6) : 258 - 264
  • [8] The distribution of waiting distances in ancestral recombination graphs
    Deng, Yun
    Song, Yun S.
    Nielsen, Rasmus
    THEORETICAL POPULATION BIOLOGY, 2021, 141 : 34 - 43
  • [9] Exploring Population Admixture Dynamics via Empirical and Simulated Genome-wide Distribution of Ancestral Chromosomal Segments
    Jin, Wenfei
    Wang, Sijia
    Wang, Haifeng
    Jin, Li
    Xu, Shuhua
    AMERICAN JOURNAL OF HUMAN GENETICS, 2012, 91 (05) : 849 - 862
  • [10] Ancestral inference from samples of DNA sequences with recombination
    Griffiths, RC
    Marjoram, P
    JOURNAL OF COMPUTATIONAL BIOLOGY, 1996, 3 (04) : 479 - 502