A genomic mutational constraint map using variation in 76,156 human genomes

Citations: 513
Authors
Chen, Siwei [1, 2]
Francioli, Laurent C. [1, 2]
Goodrich, Julia K. [1]
Collins, Ryan L. [1, 3, 4]
Kanai, Masahiro [1, 2]
Wang, Qingbo [1, 5]
Alfoldi, Jessica [1, 2]
Watts, Nicholas A. [1, 2]
Vittal, Christopher [1, 2]
Gauthier, Laura D. [6]
Poterba, Timothy [1, 2, 7]
Wilson, Michael W. [1, 2]
Tarasova, Yekaterina [1]
Phu, William [1, 8]
Grant, Riley [1]
Yohannes, Mary T. [1]
Koenig, Zan [2, 7]
Farjoun, Yossi [9]
Banks, Eric [6]
Donnelly, Stacey [10]
Gabriel, Stacey [11]
Gupta, Namrata [1, 11]
Ferriera, Steven [11]
Tolonen, Charlotte [6]
Novod, Sam [6]
Bergelson, Louis [6]
Roazen, David [6]
Ruano-Rubio, Valentin [6]
Covarrubias, Miguel [6]
Llanwarne, Christopher [6]
Petrillo, Nikelle [6]
Wade, Gordon [6]
Jeandet, Thibault [6]
Munshi, Ruchi [6]
Tibbetts, Kathleen [6]
O'Donnell-Luria, Anne [1, 3, 8]
Solomonson, Matthew [1, 2]
Seed, Cotton [2, 7]
Martin, Alicia R. [1, 2, 7]
Talkowski, Michael E. [1, 3, 7]
Rehm, Heidi L. [1, 3]
Daly, Mark J. [1, 2, 12]
Tiao, Grace [1, 2]
Neale, Benjamin M. [1, 2]
MacArthur, Daniel G. [1, 13, 14, 15]
Karczewski, Konrad J. [1, 2, 7]
Abreu, Maria [16]
Aguilar Salinas, Carlos A. [17]
Ahmad, Tariq [18]
Albert, Christine M. [19, 20, 21]
Affiliations
[1] Broad Inst MIT & Harvard, Program Med & Populat Genet, Cambridge, MA 02142 USA
[2] Massachusetts Gen Hosp, Analyt & Translat Genet Unit, Boston, MA 02114 USA
[3] Massachusetts Gen Hosp, Ctr Genom Med, Boston, MA 02114 USA
[4] Harvard Med Sch, Div Med Sci, Boston, MA 02115 USA
[5] Osaka Univ, Grad Sch Med, Dept Stat Genet, Suita, Osaka, Japan
[6] Broad Inst MIT & Harvard, Data Sci Platform, Cambridge, MA 02142 USA
[7] Broad Inst MIT & Harvard, Stanley Ctr Psychiat Res, Cambridge, MA 02142 USA
[8] Boston Childrens Hosp, Div Genet & Genom, Boston, MA USA
[9] Lady Davis Inst, Richards Lab, Montreal, PQ, Canada
[10] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[11] Broad Inst MIT & Harvard, Broad Genom, Cambridge, MA 02142 USA
[12] Inst Mol Med Finland FIMM, Helsinki, Finland
[13] Garvan Inst Med Res, Ctr Populat Genom, Sydney, NSW, Australia
[14] UNSW Sydney, Sydney, NSW, Australia
[15] Murdoch Childrens Res Inst, Ctr Populat Genom, Melbourne, Vic, Australia
[16] Univ Miami, Miller Sch Med, Gastroenterol, Miami, FL 33136 USA
[17] Inst Nacl Ciencias Med & Nutr Salvador Zubiran, Unidad Invest Enfermedades Metabol, Mexico City, DF, Mexico
[18] Peninsula Coll Med & Dent, Exeter, Devon, England
[19] Brigham & Womens Hosp, Div Prevent Med, 75 Francis St, Boston, MA 02115 USA
[20] Brigham & Womens Hosp, Div Cardiovasc Med, 75 Francis St, Boston, MA 02115 USA
[21] Harvard Med Sch, Boston, MA 02115 USA
[22] Univ Hosp, Dept Cardiol, Parma, Italy
[23] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
[24] Univ Haifa, Dept Biol, Fac Nat Sci, Haifa, Israel
[25] Albert Einstein Coll Med, Dept Med, Bronx, NY 10467 USA
[26] Albert Einstein Coll Med, Dept Genet, Bronx, NY 10467 USA
[27] Cleveland Clin, Dept Quantitat Hlth Sci, Lerner Res Inst, Cleveland, OH 44106 USA
[28] Sorbonne Univ, St Antoine Hosp, AP HP, Gastroenterol Dept, Paris, France
[29] NHLBI, Framingham Heart Study, Framingham, MA USA
[30] Boston Univ, Framingham, MA USA
[31] Boston Univ, Sch Med, Chobanian & Avedisian Sch Med, Boston, MA 02118 USA
[32] Boston Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA USA
[33] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[34] Univ Michigan, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[35] NHGRI, NIH, Bethesda, MD 20892 USA
[36] Icahn Sch Med Mt Sinai, Charles Bronfman Inst Personalized Med, New York, NY 10029 USA
[37] Wake Forest Sch Med, Dept Biochem, Winston Salem, NC 27101 USA
[38] Wake Forest Sch Med, Ctr Genom & Personalized Med Res, Winston Salem, NC 27101 USA
[39] Wake Forest Sch Med, Ctr Diabet Res, Winston Salem, NC 27101 USA
[40] Univ Leicester, Dept Cardiovasc Sci, Leicester, Leics, England
[41] Univ Leicester, NIHR Leicester Biomed Res Ctr, Leicester, Leics, England
[42] Glenfield Hosp, NIHR Leicester Biomed Res Ctr, Leicester, Leics, England
[43] Massachusetts Gen Hosp, Dept Neurol, Boston, MA USA
[44] Rutgers State Univ, Rutgers Robert Wood Johnson Med Sch, Dept Med, New Brunswick, NJ USA
[45] Rutgers State Univ, Sch Arts & Sci, Dept Genet, Piscataway, NJ USA
[46] Rutgers State Univ, Human Genet Inst New Jersey, Piscataway, NJ USA
[47] Johns Hopkins Univ, Sch Med, Meyerhoff Inflammatory Bowel Dis Ctr, Baltimore, MD USA
[48] Fulcrum Genom, Boulder, CO USA
[49] Harvard Sch Publ Hlth, Boston, MA USA
[50] Cent Amer Populat Ctr, San Pedro, Costa Rica
Funding
Wellcome Trust (UK); National Institutes of Health (USA); Medical Research Council (UK);
Keywords
REGULATORY ELEMENTS; DOMINANCE ANALYSIS; VARIANTS; GENES; PLASMINOGEN; EXPRESSION; ASCL2; IDENTIFICATION; ANNOTATION; PREDICTORS;
DOI
10.1038/s41586-023-06045-0
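Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07; 0710; 09;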
Abstract
The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders (refs. 1-4 in the article), but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome, termed Gnocchi (genomic non-coding constraint of haploinsufficient variation). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.

A genomic constraint map for the human genome constructed using data from 76,156 human genomes from the Genome Aggregation Database shows that non-coding constrained regions are enriched for regulatory elements and variants associated with complex diseases and traits.
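The abstract describes detecting "depletions of variation" by comparing what is seen in a region with what a mutational model predicts. The sketch below is one generic way to express that comparison as a signed score; it is an illustration only, not the authors' implementation. The function name `depletion_z`, the window labels and all counts are hypothetical, and the expected counts are simply assumed to come from some mutational model.

```python
import math


def depletion_z(observed: int, expected: float) -> float:
    """Signed score: positive when fewer variants are observed than expected.

    Hypothetical helper for illustration; `expected` is assumed to be the
    output of a mutational model and must be positive.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    # Chi-squared-style deviation of the observed count from the expectation.
    chi_sq = (observed - expected) ** 2 / expected
    # The sign follows the direction of the deviation: depletion is positive.
    return math.copysign(math.sqrt(chi_sq), expected - observed)


# Made-up windows: (label, observed variants, expected variants).
windows = [
    ("window_a", 120, 150.0),  # fewer variants than expected (depleted)
    ("window_b", 150, 150.0),  # matches the expectation
    ("window_c", 170, 150.0),  # more variants than expected
]

scores = {name: depletion_z(obs, exp) for name, obs, exp in windows}
for name, score in scores.items():
    print(f"{name}: {score:+.2f}")
```

Under this sketch, a larger positive score marks a window with a stronger depletion of variation relative to its expectation, which is the sense in which the abstract speaks of a region being more constrained.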
Pages: 92-100
Page count: 27
Related papers
86 in total
[1]   A global reference for human genetic variation [J].
Altshuler, David M. ;
Durbin, Richard M. ;
Abecasis, Goncalo R. ;
Bentley, David R. ;
Chakravarti, Aravinda ;
Clark, Andrew G. ;
Donnelly, Peter ;
Eichler, Evan E. ;
Flicek, Paul ;
Gabriel, Stacey B. ;
Gibbs, Richard A. ;
Green, Eric D. ;
Hurles, Matthew E. ;
Knoppers, Bartha M. ;
Korbel, Jan O. ;
Lander, Eric S. ;
Lee, Charles ;
Lehrach, Hans ;
Mardis, Elaine R. ;
Marth, Gabor T. ;
McVean, Gil A. ;
Nickerson, Deborah A. ;
Wang, Jun ;
Wilson, Richard K. ;
Boerwinkle, Eric ;
Doddapaneni, Harsha ;
Han, Yi ;
Korchina, Viktoriya ;
Kovar, Christie ;
Lee, Sandra ;
Muzny, Donna ;
Reid, Jeffrey G. ;
Zhu, Yiming ;
Chang, Yuqi ;
Feng, Qiang ;
Fang, Xiaodong ;
Guo, Xiaosen ;
Jian, Min ;
Jiang, Hui ;
Jin, Xin ;
Lan, Tianming ;
Li, Guoqing ;
Li, Jingxiang ;
Li, Yingrui ;
Liu, Shengmao ;
Liu, Xiao ;
Lu, Yao ;
Ma, Xuedi ;
Tang, Meifang ;
Wang, Bo .
NATURE, 2015, 526 (7571) :68-+
[2]   Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder [J].
An, Joon-Yong ;
Lin, Kevin ;
Zhu, Lingxue ;
Werling, Donna M. ;
Dong, Shan ;
Brand, Harrison ;
Wang, Harold Z. ;
Zhao, Xuefang ;
Schwartz, Grace B. ;
Collins, Ryan L. ;
Currall, Benjamin B. ;
Dastmalchi, Claudia ;
Dea, Jeanselle ;
Duhn, Clif ;
Gilson, Michael C. ;
Klei, Lambertus ;
Liang, Lindsay ;
Markenscoff-Papadimitriou, Eirene ;
Pochareddy, Sirisha ;
Ahituv, Nadav ;
Buxbaum, Joseph D. ;
Coon, Hilary ;
Daly, Mark J. ;
Kim, Young Shin ;
Marth, Gabor T. ;
Neale, Benjamin M. ;
Quinlan, Aaron R. ;
Rubenstein, John L. ;
Sestan, Nenad ;
State, Matthew W. ;
Willsey, A. Jeremy ;
Talkowski, Michael E. ;
Devlin, Bernie ;
Roeder, Kathryn ;
Sanders, Stephan J. .
SCIENCE, 2018, 362 (6420) :1270-+
[3]   An atlas of active enhancers across human cell types and tissues [J].
Andersson, Robin ;
Gebhard, Claudia ;
Miguel-Escalada, Irene ;
Hoof, Ilka ;
Bornholdt, Jette ;
Boyd, Mette ;
Chen, Yun ;
Zhao, Xiaobei ;
Schmidl, Christian ;
Suzuki, Takahiro ;
Ntini, Evgenia ;
Arner, Erik ;
Valen, Eivind ;
Li, Kang ;
Schwarzfischer, Lucia ;
Glatz, Dagmar ;
Raithel, Johanna ;
Lilje, Berit ;
Rapin, Nicolas ;
Bagger, Frederik Otzen ;
Jorgensen, Mette ;
Andersen, Peter Refsing ;
Bertin, Nicolas ;
Rackham, Owen ;
Burroughs, A. Maxwell ;
Baillie, J. Kenneth ;
Ishizu, Yuri ;
Shimizu, Yuri ;
Furuhata, Erina ;
Maeda, Shiori ;
Negishi, Yutaka ;
Mungall, Christopher J. ;
Meehan, Terrence F. ;
Lassmann, Timo ;
Itoh, Masayoshi ;
Kawaji, Hideya ;
Kondo, Naoto ;
Kawai, Jun ;
Lennartsson, Andreas ;
Daub, Carsten O. ;
Heutink, Peter ;
Hume, David A. ;
Jensen, Torben Heick ;
Suzuki, Harukazu ;
Hayashizaki, Yoshihide ;
Mueller, Ferenc ;
Forrest, Alistair R. R. ;
Carninci, Piero ;
Rehli, Michael ;
Sandelin, Albin .
NATURE, 2014, 507 (7493) :455-+
[4]   The dominance analysis approach for comparing predictors in multiple regression [J].
Azen, R ;
Budescu, DV .
PSYCHOLOGICAL METHODS, 2003, 8 (02) :129-148
[5]   Identification of the Fourth Duplication of Upstream IHH Regulatory Elements, in a Family with Craniosynostosis Philadelphia Type, Helps to Define the Phenotypic Characterization of These Regulatory Elements [J].
Barroso, Eva ;
Berges-Soria, Julia ;
Benito-Sanz, Sara ;
Ivan Rivera-Pedroza, Carlos ;
Juliana Ballesta-Martinez, Maria ;
Lopez-Gonzalez, Vanesa ;
Guillen-Navarro, Encarna ;
Heath, Karen E. .
AMERICAN JOURNAL OF MEDICAL GENETICS PART A, 2015, 167 (04) :902-906
[6]   Metazoan MicroRNAs [J].
Bartel, David P. .
CELL, 2018, 173 (01) :20-51
[7]   Insights into human genetic variation and population history from 929 diverse genomes [J].
Bergstrom, Anders ;
McCarthy, Shane A. ;
Hui, Ruoyun ;
Almarri, Mohamed A. ;
Ayub, Qasim ;
Danecek, Petr ;
Chen, Yuan ;
Felkel, Sabine ;
Hallast, Pille ;
Kamm, Jack ;
Blanche, Helene ;
Deleuze, Jean-Francois ;
Cann, Howard ;
Mallick, Swapan ;
Reich, David ;
Sandhu, Manjinder S. ;
Skoglund, Pontus ;
Scally, Aylwyn ;
Xue, Yali ;
Durbin, Richard ;
Tyler-Smith, Chris .
SCIENCE, 2020, 367 (6484) :1339-+
[8]   The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics [J].
Blake, Judith A. ;
Bult, Carol J. ;
Kadin, James A. ;
Richardson, Joel E. ;
Eppig, Janan T. .
NUCLEIC ACIDS RESEARCH, 2011, 39 :D842-D848
[9]   DOMINANCE ANALYSIS - A NEW APPROACH TO THE PROBLEM OF RELATIVE IMPORTANCE OF PREDICTORS IN MULTIPLE-REGRESSION [J].
BUDESCU, DV .
PSYCHOLOGICAL BULLETIN, 1993, 114 (03) :542-551
[10]   Refining analyses of copy number variation identifies specific genes associated with developmental delay [J].
Coe, Bradley P. ;
Witherspoon, Kali ;
Rosenfeld, Jill A. ;
van Bon, Bregje W. M. ;
Vulto-van Silfhout, Anneke T. ;
Bosco, Paolo ;
Friend, Kathryn L. ;
Baker, Carl ;
Buono, Serafino ;
Vissers, Lisenka E. L. M. ;
Schuurs-Hoeijmakers, Janneke H. ;
Hoischen, Alex ;
Pfundt, Rolph ;
Krumm, Nik ;
Carvill, Gemma L. ;
Li, Deana ;
Amaral, David ;
Brown, Natasha ;
Lockhart, Paul J. ;
Scheffer, Ingrid E. ;
Alberti, Antonino ;
Shaw, Marie ;
Pettinato, Rosa ;
Tervo, Raymond ;
de Leeuw, Nicole ;
Reijnders, Margot R. F. ;
Torchia, Beth S. ;
Peeters, Hilde ;
O'Roak, Brian J. ;
Fichera, Marco ;
Hehir-Kwa, Jayne Y. ;
Shendure, Jay ;
Mefford, Heather C. ;
Haan, Eric ;
Gecz, Jozef ;
de Vries, Bert B. A. ;
Romano, Corrado ;
Eichler, Evan E. .
NATURE GENETICS, 2014, 46 (10) :1063-1071