The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

Cited by: 100
Authors
Petrovski, Slave [1, 2, 3]
Gussow, Ayal B. [1, 2, 4]
Wang, Quanli [1, 2]
Halvorsen, Matt [1, 2]
Han, Yujun [2]
Weir, William H. [2]
Allen, Andrew S. [2, 5]
Goldstein, David B. [1, 2]
Affiliations
[1] Columbia Univ, Inst Genom Med, New York, NY 10027 USA
[2] Duke Univ, Sch Med, Ctr Human Genome Variat, Durham, NC USA
[3] Univ Melbourne, Dept Med, Austin Hlth & Royal Melbourne Hosp, Melbourne, Vic, Australia
[4] Duke Univ, Program Computat Biol & Bioinformat, Durham, NC USA
[5] Duke Univ, Dept Biostat & Bioinformat, Durham, NC USA
Source
PLOS GENETICS | 2015 / Volume 11 / Issue 09
Funding
National Institutes of Health (USA);
Keywords
DE-NOVO MUTATIONS; FRAMEWORK; SCHIZOPHRENIA; ANNOTATION; DISCOVERY; VARIANTS; PATTERNS; DATABASE; DISEASE; HUMANS;
DOI
10.1371/journal.pgen.1005492
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. 
In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, ncCADD and ncGWAVA, and find both scores are significantly predictive of human dosage sensitive genes and appear to carry information beyond conservation, as assessed by ncGERP. These results highlight that the intolerance of noncoding sequence stretches in the human genome can provide a critical complementary tool to other genome annotation approaches to help identify the parts of the human genome increasingly likely to harbor mutations that influence risk of disease.
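The abstract describes ncRVIS as a comparison of observed and predicted levels of standing variation in a gene's regulatory sequence. The following is a minimal sketch of that residual idea, not the authors' implementation. It assumes a simple least-squares fit of common-variant counts on total variant counts, which the abstract does not specify, and it standardizes the residuals by their population standard deviation. The function name, gene names, and counts are all hypothetical.

```python
from statistics import mean, pstdev


def residual_intolerance_scores(total_counts, common_counts):
    """Return standardized residuals from a least-squares fit of
    common-variant counts (observed) on total variant counts.

    A negative score means fewer common variants than the fit predicts,
    which is read here as greater intolerance to variation.
    Assumption: simple linear regression; the abstract does not give
    the exact model or regressors.
    """
    x_bar = mean(total_counts)
    y_bar = mean(common_counts)
    sxx = sum((x - x_bar) ** 2 for x in total_counts)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(total_counts, common_counts))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed minus predicted count for each gene.
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(total_counts, common_counts)]
    scale = pstdev(residuals)  # zero if every gene lies exactly on the line
    return [r / scale for r in residuals]


# Hypothetical toy data: per-gene variant counts in regulatory sequence.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
total = [10, 20, 30, 40, 50]
common = [2, 4, 3, 8, 10]

scores = dict(zip(genes, residual_intolerance_scores(total, common)))
most_intolerant = min(scores, key=scores.get)
print(most_intolerant, round(scores[most_intolerant], 3))
```

In this toy data GENE_C has fewer common variants than the fitted line predicts, so it receives the lowest score. A real analysis would need genome-wide counts and the model actually described in the paper's methods.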
Pages: 25