Prediction of Functional Regulatory SNPs in Monogenic and Complex Disease

Cited by: 19
Authors
Zhao, Yiqiang [1, 2]
Clark, Wyatt T. [3]
Mort, Matthew [4]
Cooper, David N. [4]
Radivojac, Predrag [3]
Mooney, Sean D. [1, 2]
Affiliations
[1] Buck Inst Res Aging, Novato, CA 94945 USA
[2] Indiana Univ Sch Med, Dept Med & Mol Genet, Indianapolis, IN USA
[3] Indiana Univ, Sch Informat & Comp, Bloomington, IN USA
[4] Cardiff Univ, Sch Med, Inst Med Genet, Cardiff, S Glam, Wales
Keywords
regulatory mutations; machine learning; monogenic disease; complex disease; single nucleotide polymorphisms; SNP; SINGLE-NUCLEOTIDE POLYMORPHISMS; INHERITED DISEASE; PROMOTER REGIONS; HUMAN GENOME; GENE DOSAGE; SEQUENCE; DATABASE; IDENTIFICATION; MUTATIONS; BIOINFORMATICS
DOI
10.1002/humu.21559
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Next-generation sequencing (NGS) technologies are yielding ever higher volumes of human genome sequence data. Given this large amount of data, it has become both a possibility and a priority to determine how disease-causing single nucleotide polymorphisms (SNPs) detected within gene regulatory regions (rSNPs) exert their effects on gene expression. Recently, several studies have explored whether disease-causing polymorphisms have attributes that can distinguish them from those that are neutral, attaining moderate success at discriminating between functional and putatively neutral regulatory SNPs. Here, we have extended this work by assessing the utility of both SNP-based features (those associated only with the polymorphism site and the surrounding DNA) and gene-based features (those derived from the associated gene in whose regulatory region the SNP lies) in the identification of functional regulatory polymorphisms involved in either monogenic or complex disease. Gene-based features were found to be capable of both augmenting and enhancing the utility of SNP-based features in the prediction of known regulatory mutations. Adopting this approach, we achieved an AUC of 0.903 for predicting regulatory SNPs. Finally, our tool predicted 225 new regulatory SNPs with a high degree of confidence, with 105 of the 225 falling into linkage disequilibrium blocks of reported disease-associated genome-wide association studies SNPs. Hum Mutat 32:1183-1190, 2011. (C) 2011 Wiley-Liss, Inc.
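The abstract reports classifier performance as an area under the ROC curve (AUC) and states that gene-based features improve on SNP-based features alone. As a minimal sketch of how such a comparison could be scored, the Python below computes AUC as the probability that a randomly chosen functional SNP receives a higher score than a randomly chosen neutral one. The function name `roc_auc`, the labels, and both score lists are illustrative assumptions and toy values, not the authors' code, data, or results.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores.

    A positive scored above a negative counts 1, a tie counts 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = functional regulatory SNP, 0 = putatively neutral.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical classifier scores from SNP-based features only.
snp_only_scores = [0.9, 0.4, 0.6, 0.5, 0.3, 0.7]
# Hypothetical scores after adding gene-based features.
combined_scores = [0.9, 0.6, 0.8, 0.5, 0.3, 0.7]

auc_snp_only = roc_auc(labels, snp_only_scores)
auc_combined = roc_auc(labels, combined_scores)
print(f"SNP-based only: {auc_snp_only:.3f}")
print(f"SNP + gene-based: {auc_combined:.3f}")
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based computation would be the usual choice for large SNP sets.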
Pages: 1183-1190
Page count: 8