Automated inference of molecular mechanisms of disease from amino acid substitutions

Cited by: 630
Authors
Li, Biao [1]
Krishnan, Vidhya G. [2, 3, 4]
Mort, Matthew E. [3]
Xin, Fuxiao [1]
Kamati, Kishore K. [2, 3]
Cooper, David N. [4]
Mooney, Sean D. [2, 3]
Radivojac, Predrag [1]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, Bloomington, IN 47408 USA
[2] Buck Inst Age Res, Novato, CA 94945 USA
[3] Indiana Univ, Sch Med, Dept Med & Mol Genet, Indianapolis, IN 46202 USA
[4] Cardiff Univ, Sch Med, Inst Med Genet, Cardiff CF14 4XN, S Glam, Wales
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
SINGLE-NUCLEOTIDE POLYMORPHISMS; NON-SYNONYMOUS SNPS; PROTEIN-STRUCTURE; HUMAN CANCER; 6-PYRUVOYLTETRAHYDROPTERIN SYNTHASE; MISSENSE MUTATIONS; PREDICTION; PTEN; ANNOTATION; HYPERPHENYLALANINEMIA;
DOI
10.1093/bioinformatics/btp528
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Advances in high-throughput genotyping and next generation sequencing have generated a vast amount of human genetic variation data. Single nucleotide substitutions within protein coding regions are of particular importance owing to their potential to give rise to amino acid substitutions that affect protein structure and function, which may ultimately lead to a disease state. Over the last decade, a number of computational methods have been developed to predict whether such amino acid substitutions result in an altered phenotype. Although these methods are useful in practice, and accurate for their intended purpose, they are not well suited for providing probabilistic estimates of the underlying disease mechanism.
Results: We have developed a new computational model, MutPred, that is based upon protein sequence, and which models changes of structural features and functional sites between wild-type and mutant sequences. These changes, expressed as probabilities of gain or loss of structure and function, can provide insight into the specific molecular mechanism responsible for the disease state. MutPred also builds on the established SIFT method but offers improved classification accuracy with respect to human disease mutations. Given conservative thresholds on the predicted disruption of molecular function, we propose that MutPred can generate accurate and reliable hypotheses on the molecular basis of disease for ~11% of known inherited disease-causing mutations. We also note that the proportion of changes of functionally relevant residues in the sets of cancer-associated somatic mutations is higher than for the inherited lesions in the Human Gene Mutation Database, which are instead predicted to be characterized by disruptions of protein structure.
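The idea of comparing predicted properties between wild-type and mutant sequences, and keeping only large changes as mechanistic hypotheses, can be illustrated with a toy sketch. This is not MutPred's actual model: the class `PropertyChange`, the function `rank_hypotheses`, the cutoff `min_abs_delta`, and the use of a simple score difference are all assumptions made here for illustration, and the property scores would have to come from separate predictors.

```python
from dataclasses import dataclass


@dataclass
class PropertyChange:
    """Hypothetical record: one predicted property at the substituted position."""
    name: str          # e.g. "helix" or "phosphorylation site"
    wild_type: float   # assumed predictor score in [0, 1] for the wild-type sequence
    mutant: float      # assumed predictor score in [0, 1] for the mutant sequence

    @property
    def delta(self) -> float:
        # Positive values suggest a gain of the property, negative values a loss.
        return self.mutant - self.wild_type

    @property
    def direction(self) -> str:
        if self.delta > 0:
            return "gain"
        if self.delta < 0:
            return "loss"
        return "no change"


def rank_hypotheses(changes, min_abs_delta=0.25):
    """Keep changes at least as large as a conservative cutoff, largest first.

    The cutoff value is an illustrative assumption, not a threshold from the paper.
    """
    kept = [c for c in changes if abs(c.delta) >= min_abs_delta]
    return sorted(kept, key=lambda c: abs(c.delta), reverse=True)
```

With a stricter cutoff, fewer substitutions receive any hypothesis at all, which mirrors the trade-off the abstract describes: conservative thresholds yield hypotheses for only a fraction of known mutations.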
Pages: 2744-2750
Page count: 7