A Family-Based Probabilistic Method for Capturing De Novo Mutations from High-Throughput Short-Read Sequencing Data

Cited by: 12
Authors
Cartwright, Reed A. [2 ]
Hussin, Julie [3 ]
Keebler, Jonathan E. M. [1 ]
Stone, Eric A. [1 ]
Awadalla, Philip [3 ]
Affiliations
[1] N Carolina State Univ, Raleigh, NC 27695 USA
[2] Arizona State Univ, Tempe, AZ 85287 USA
[3] Univ Montreal, Montreal, PQ H3C 3J7, Canada
Source
STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY | 2012 / Vol. 11 / Issue 02
Funding
National Institutes of Health (USA);
Keywords
de novo mutations; pedigree; short-read data; mutation rates; trio model; MAXIMUM-LIKELIHOOD; EM ALGORITHM; RATES; DNA; GENOTYPE;
DOI
10.2202/1544-6115.1713
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations, and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.
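To make the idea of a per-site posterior probability of de novo mutation concrete, the following is a minimal illustrative sketch for a single biallelic site in a mother-father-child trio. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the binomial read model with a single error rate `error`, the Hardy-Weinberg founder prior with alternate-allele frequency `alt_freq`, the per-transmission germline mutation rate `mu`, and the default parameter values. Somatic mutation, which the abstract also mentions, is left out.

```python
from itertools import product
from math import comb


def read_likelihood(alt_reads, depth, genotype, error):
    """Binomial probability of the observed alt-read count.

    genotype is the number of alt alleles (0, 1 or 2); error is an
    assumed symmetric per-read miscall rate.
    """
    p_alt = (genotype / 2) * (1 - error) + (1 - genotype / 2) * error
    return comb(depth, alt_reads) * p_alt**alt_reads * (1 - p_alt) ** (depth - alt_reads)


def founder_prior(genotype, alt_freq):
    """Hardy-Weinberg prior on a founder genotype (an assumption)."""
    return comb(2, genotype) * alt_freq**genotype * (1 - alt_freq) ** (2 - genotype)


def denovo_posterior(reads, alt_freq=0.001, mu=1e-8, error=0.01):
    """Posterior probability that at least one transmitted allele mutated.

    reads maps "mother", "father" and "child" to (alt_reads, depth).
    """
    total = 0.0
    with_mutation = 0.0
    for m, f in product(range(3), repeat=2):  # parental genotypes
        founders = (
            founder_prior(m, alt_freq)
            * founder_prior(f, alt_freq)
            * read_likelihood(*reads["mother"], m, error)
            * read_likelihood(*reads["father"], f, error)
        )
        # a_*: allele transmitted (1 = alt); u_*: whether it mutated in transit
        for a_m, a_f, u_m, u_f in product((0, 1), repeat=4):
            p_transmit = (m / 2 if a_m else 1 - m / 2) * (f / 2 if a_f else 1 - f / 2)
            p_mutate = (mu if u_m else 1 - mu) * (mu if u_f else 1 - mu)
            child = (a_m ^ u_m) + (a_f ^ u_f)  # child genotype after any mutation
            joint = (
                founders
                * p_transmit
                * p_mutate
                * read_likelihood(*reads["child"], child, error)
            )
            total += joint
            if u_m or u_f:
                with_mutation += joint
    return with_mutation / total


# Parents look homozygous reference, child looks heterozygous.
violation = {"mother": (0, 30), "father": (0, 30), "child": (15, 30)}
# All three look homozygous reference.
consistent = {"mother": (0, 30), "father": (0, 30), "child": (0, 30)}
```

Calling `denovo_posterior(violation)` gives a posterior close to 1, because a mutation is far more probable than a heterozygous parent showing no alternate reads at depth 30. `denovo_posterior(consistent)` gives a posterior close to 0. Summing over all parental genotypes, rather than fixing hard genotype calls first, is what lets sequencing error and founder variation compete with mutation as explanations for an apparent Mendelian violation.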
Pages: 32