A phylogenetic mixture model for the identification of functionally divergent protein residues

Cited by: 21
Authors
Gaston, Daniel [1 ,2 ]
Susko, Edward [1 ,3 ]
Roger, Andrew J. [1 ,2 ]
Affiliations
[1] Dalhousie Univ, Ctr Comparat Genom & Evolutionary Bioinformat, Halifax, NS B3H 1X5, Canada
[2] Dalhousie Univ, Dept Biochem & Mol Biol, Halifax, NS B3H 1X5, Canada
[3] Dalhousie Univ, Dept Math & Stat, Halifax, NS B3H 3J5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
EVOLUTIONARY TRACE; SEQUENCE HARMONY; SPECIFICITY; FAMILY; RATES; PREDICTION; INFERENCE; SIMULATOR; TREES
DOI
10.1093/bioinformatics/btr470
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: To understand the evolution of molecular function within protein families, it is important to identify the amino acid residues responsible for functional divergence, i.e. those sites in a protein family that affect cofactor, protein or substrate binding preferences; affinity; catalysis; flexibility; or folding. Type I functional divergence (FD) results from changes in conservation (evolutionary rate) at a site between protein subfamilies, whereas type II FD occurs when there has been a shift in preferences for different amino acid chemical properties. A variety of methods have been developed for identifying both site types in protein subfamilies, from both phylogenetic and information-theoretic angles. However, evaluation of the performance of these methods has typically relied upon a handful of reasonably well-characterized biological datasets or analyses of a single biological example. While experimental validation of many truly functionally divergent sites (true positives) can be relatively straightforward, determining that particular sites do not contribute to functional divergence (i.e. false positives and true negatives) is much more difficult, resulting in noisy 'gold standard' examples.
Results: We describe a novel, phylogeny-based functional divergence classifier, FunDi. Unlike previous approaches, FunDi uses a unified mixture model-based approach to detect type I and type II FD. To assess FunDi's overall classification performance relative to other methods, we introduce two methods for simulating functionally divergent datasets. We find that the FunDi method performs better than several other predictors over a wide variety of simulation conditions.
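The distinction between type I and type II FD drawn in the abstract can be illustrated with a toy sketch. This is not the FunDi mixture model and it ignores the phylogeny entirely. It only compares per-column Shannon entropy between two subfamilies. The function names, the entropy thresholds `low` and `high`, and the use of disjoint residue sets as a stand-in for a shift in chemical property preferences are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def label_site(col_a, col_b, low=0.5, high=1.0):
    """Toy label for one site, given its column in subfamily A and subfamily B.

    Hypothetical heuristic, not the published method:
    - "type I-like": conserved in one subfamily, variable in the other
    - "type II-like": conserved in both, but with no residues in common
    """
    h_a, h_b = column_entropy(col_a), column_entropy(col_b)
    if min(h_a, h_b) <= low and max(h_a, h_b) >= high:
        return "type I-like"
    if h_a <= low and h_b <= low and not (set(col_a) & set(col_b)):
        return "type II-like"
    return "no signal"


if __name__ == "__main__":
    print(label_site("LLLLLL", "AKDEGS"))  # conservation differs between subfamilies
    print(label_site("DDDDDD", "KKKKKK"))  # both conserved, different residues
    print(label_site("LLLLLL", "LLLLLL"))  # identical and conserved
```

A phylogeny-based classifier such as the one described above would instead weigh these patterns against the tree relating the sequences, which this sketch does not attempt.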
Pages: 2655-2663
Page count: 9