CoeViz: a web-based tool for coevolution analysis of protein residues

Cited by: 36
Authors
Baker, Frazier N. [1, 2]
Porollo, Aleksey [2, 3]
Affiliations
[1] Univ Cincinnati, Dept Elect Engn & Comp Syst, 2901 Woodside Dr, Cincinnati, OH 45221 USA
[2] Cincinnati Childrens Hosp Med Ctr, Ctr Autoimmune Genom & Etiol, 3333 Burnet Ave, Cincinnati, OH 45229 USA
[3] Cincinnati Childrens Hosp Med Ctr, Div Biomed Informat, 3333 Burnet Ave, Cincinnati, OH 45229 USA
Source
BMC BIOINFORMATICS | 2016 / Vol. 17
Funding
National Institutes of Health (USA);
Keywords
Coevolution; Coevolution analysis; Coevolving residues; Co-occurring residues; Covariation of residues; Protein structure; Protein function; Protein annotation; Web-server; CORRELATED MUTATIONS; CONTACT PREDICTION; SEQUENCE; COVARIATION; INFORMATION; IDENTIFICATION; VISUALIZATION; LIKELIHOOD;
DOI
10.1186/s12859-016-0975-z
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Proteins generally perform their function in a folded state. Residues forming an active site, whether it is a catalytic center or interaction interface, are frequently distant in a protein sequence. Hence, traditional sequence-based prediction methods focusing on a single residue (or a short window of residues) at a time may have difficulties in identifying and clustering the residues constituting a functional site, especially when a protein has multiple functions. Evolutionary information encoded in multiple sequence alignments is known to greatly improve sequence-based predictions. Identification of coevolving residues further advances the protein structure and function annotation by revealing cooperative pairs and higher order groupings of residues.
Results: We present a new web-based tool (CoeViz) that provides a versatile analysis and visualization of pairwise coevolution of amino acid residues. The tool computes three covariance metrics (mutual information, chi-square statistic, and Pearson correlation) and one conservation metric (joint Shannon entropy). Implemented adjustments of covariance scores include phylogeny correction, corrections for sequence dissimilarity and alignment gaps, and the average product correction. Visualization of residue relationships is enhanced by hierarchical cluster trees, heat maps, circular diagrams, and residue highlighting in protein sequence and 3D structure. Unlike other existing tools, CoeViz is not limited to analyzing conserved domains or protein families and can process long, unstructured and multi-domain proteins thousands of residues long. Three examples are provided to illustrate the use of the tool for identification of residues (1) involved in enzymatic function, (2) forming short linear functional motifs, and (3) constituting a structural domain.
Conclusions: CoeViz represents a practical resource for a quick sequence-based protein annotation for molecular biologists, e.g., for identifying putative functional clusters of residues and structural domains. CoeViz also can serve computational biologists as a resource of coevolution matrices, e.g., for developing machine learning-based prediction models. The presented tool is integrated in the POLYVIEW-2D server (http://polyview.cchmc.org/) and available from resulting pages of POLYVIEW-2D.
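Two of the quantities named in the abstract, joint Shannon entropy and mutual information, together with the average product correction, can be illustrated with a minimal Python sketch. This is not CoeViz's implementation: the function names and the toy alignment are hypothetical, gaps are treated as ordinary symbols, and no phylogeny or sequence-dissimilarity correction is applied.

```python
import math
from collections import Counter
from itertools import combinations


def column(msa, i):
    """Return alignment column i as a list of symbols (one per sequence)."""
    return [seq[i] for seq in msa]


def entropy(symbols):
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def joint_entropy(msa, i, j):
    """Joint Shannon entropy of columns i and j, using paired symbols."""
    return entropy(list(zip(column(msa, i), column(msa, j))))


def mutual_information(msa, i, j):
    """MI(i, j) = H(i) + H(j) - H(i, j)."""
    return entropy(column(msa, i)) + entropy(column(msa, j)) - joint_entropy(msa, i, j)


def apc_corrected_mi(msa):
    """Subtract the average product term from every pairwise MI score."""
    length = len(msa[0])
    mi = {(i, j): mutual_information(msa, i, j)
          for i, j in combinations(range(length), 2)}
    # Mean MI of each column against all other columns.
    col_mean = [
        sum(mi[tuple(sorted((i, k)))] for k in range(length) if k != i) / (length - 1)
        for i in range(length)
    ]
    overall = sum(mi.values()) / len(mi)  # mean MI over all column pairs
    return {
        (i, j): value - (col_mean[i] * col_mean[j] / overall if overall > 0 else 0.0)
        for (i, j), value in mi.items()
    }


# Hypothetical toy alignment: columns 0 and 1 covary perfectly,
# column 2 varies independently of them, column 3 is fully conserved.
toy_msa = ["ACDA", "ACEA", "GTDA", "GTEA"]
print(mutual_information(toy_msa, 0, 1))
print(apc_corrected_mi(toy_msa)[(0, 1)])
```

In this toy case the perfectly covarying pair keeps the highest score after correction, while pairs involving the conserved column carry no mutual information at all.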
Pages: 7