DOMINO: Using Machine Learning to Predict Genes Associated with Dominant Disorders

Cited by: 84
Authors
Quinodoz, Mathieu [1]
Royer-Bertrand, Beryl [1,2]
Cisarova, Katarina [1]
Di Gioia, Silvio Alessandro [1]
Superti-Furga, Andrea [2]
Rivolta, Carlo [1,3]
Affiliations
[1] Univ Lausanne, Unit Med Genet, Dept Computat Biol, CH-1011 Lausanne, Switzerland
[2] Lausanne Univ Hosp CHUV, Div Med Genet, CH-1011 Lausanne, Switzerland
[3] Univ Leicester, Dept Genet & Genome Biol, Leicester LE1 9HN, Leics, England
Funding
Swiss National Science Foundation
Keywords
HUMAN-DISEASE; MUTATIONS; EVOLUTION; SELECTION; VARIANTS; LIFE;
DOI
10.1016/j.ajhg.2017.09.001
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification code
071007; 090102
Abstract
In contrast to recessive conditions with biallelic inheritance, identification of dominant (monoallelic) mutations for Mendelian disorders is more difficult because of the abundance of benign heterozygous variants that act as massive background noise (typically, in a 400:1 excess ratio). To reduce this overflow of false positives in next-generation sequencing (NGS) screens, we developed DOMINO, a tool assessing the likelihood for a gene to harbor dominant changes. Unlike commonly used predictors of pathogenicity, DOMINO takes into consideration features that are properties of genes, rather than of variants. It uses a machine-learning approach to extract discriminant information from a broad array of features (N = 432), including genomic data, intra- and interspecies conservation, gene expression, protein-protein interactions, protein structure, etc. DOMINO's iterative architecture includes a training process on 985 genes with well-established inheritance patterns for Mendelian conditions, and repeated cross-validation that optimizes its discriminant power. When validated on 99 newly discovered genes with pathogenic mutations, the algorithm displays an excellent final performance, with an area under the curve (AUC) of 0.92. Furthermore, unsupervised analysis by DOMINO of real sets of NGS data from individuals with intellectual disability or epilepsy correctly recognizes known genes and predicts 9 new candidates with very high confidence. In summary, DOMINO is a robust and reliable tool that can infer dominance of candidate genes with high sensitivity and specificity, making it a useful complement to any NGS pipeline dealing with the analysis of the morbid human genome.
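For readers unfamiliar with the AUC metric quoted in the abstract, the following is a minimal illustrative sketch of how an area under the ROC curve can be computed from gene-level scores. It is not DOMINO's code: the function name `roc_auc`, the toy scores, and the labels (1 = dominant gene, 0 = non-dominant gene) are all assumptions made up for illustration. It uses the rank-based definition of AUC, namely the probability that a randomly chosen positive gene receives a higher score than a randomly chosen negative one.

```python
def roc_auc(scores, labels):
    """Return the ROC AUC of `scores` against binary `labels` (1 or 0).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; tied pairs count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical gene-level scores and labels, invented for this example only.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(f"AUC on toy data: {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation; the pairwise loop is quadratic and is meant only for small illustrative inputs.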
Pages: 623-629
Page count: 7