Identification of subfamily-specific sites based on active sites modeling and clustering

Cited by: 30
Authors
de Melo-Minardi, Raquel C. [1]
Bastard, Karine [2, 3, 4]
Artiguenave, Francois [2, 3, 4]
Affiliations
[1] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
[2] Commissariat Energie Atom & Energies Alternat, Inst Genom, Evry, France
[3] CNRS, UMR 8030, Evry, France
[4] Univ Evry Val Essonne, F-91057 Evry, France
Keywords
FUNCTIONAL SPECIFICITY; LIGAND-BINDING; PROTEIN; EVOLUTIONARY; PREDICTION; RESIDUES; SEQUENCE; PHYLOGENY; ALIGNMENT; DATABASE;
DOI
10.1093/bioinformatics/btq595
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Current computational approaches to function prediction are mostly based on protein sequence classification and on transferring annotation from known proteins to their closest homologs, relying on the orthology-based assumption that function is conserved. This approach has a major weakness: annotation reliability depends on global sequence similarity to known proteins, so it performs poorly for enzyme superfamilies whose members catalyze different reactions. Structural biology offers a different strategy to overcome this annotation problem by adding information about protein 3D structures. This information can be used to identify amino acids located in active sites and to detect functionally polymorphic residues within an enzyme superfamily. Structural genomics programs are delivering novel protein structures at a high-throughput rate, yet a huge gap remains between the number of sequences and the number of available structures. Computational methods such as homology modeling provide reliable ways to bridge this gap and could become a precise new tool for annotating protein function.
Results: Here, we present the Active Sites Modeling and Clustering (ASMC) method, a novel unsupervised method that classifies sequences using structural information from protein pockets. ASMC combines homology modeling of family members, structural alignment of the modeled active sites and a subsequent hierarchical conceptual classification. Comparing the profiles obtained from the computed clusters identifies residues correlated with functional divergence between subfamilies, called specificity determining positions. The ASMC method was validated on a benchmark of 42 Pfam families for which previously resolved holo-structures were available. ASMC was also applied to several families with known protein structures and comprehensive functional annotations. We discuss how ASMC improves the annotation and understanding of protein family functions, with illustrative examples from nucleotidyl cyclases, protein kinases and serine proteases.
Pages: 3075-3082
Number of pages: 8