Using information theory to search for co-evolving residues in proteins

Cited by: 192
Authors
Martin, LC
Gloor, GB
Dunn, SD
Wahl, LM [1]
Affiliations
[1] Univ Western Ontario, Dept Appl Math, London, ON N6A 5B7, Canada
[2] Univ Western Ontario, Dept Biochem, London, ON N6A 5B7, Canada
Funding
Canadian Institutes of Health Research; Natural Sciences and Engineering Research Council of Canada
DOI
10.1093/bioinformatics/bti671
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Some functionally important protein residues are easily detected since they correspond to conserved columns in a multiple sequence alignment (MSA). However, important residues may also mutate, with compensatory mutations occurring elsewhere in the protein that serve to preserve or restore functionality. It is difficult to distinguish these co-evolving sites from other non-conserved sites. Results: We used Mutual Information (MI) to identify co-evolving positions. Using in silico evolved MSAs, we examined the effects of the number of sequences, the size of the amino acid alphabet and the mutation rate on two sources of background MI: finite sample size effects and phylogenetic influence. We then assessed the performance of various normalizations of MI in enhancing detection of co-evolving positions and found that normalization by the pair entropy was optimal. Real protein alignments were analyzed, and co-evolving isolated pairs were often found to be in contact with each other.
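The abstract's central quantity can be illustrated with a minimal sketch. It assumes that "pair entropy" means the joint entropy H(X,Y) of two alignment columns, that MI is computed as H(X) + H(Y) - H(X,Y) from observed symbol frequencies, and that gaps are treated as ordinary symbols. The function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of the observed symbol frequencies."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mi_and_pair_entropy(col_a, col_b):
    """Return (MI, joint entropy) for two equally long MSA columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    pairs = list(zip(col_a, col_b))      # one (residue, residue) pair per sequence
    h_ab = entropy(pairs)                # pair (joint) entropy H(X,Y)
    mi = entropy(col_a) + entropy(col_b) - h_ab
    return mi, h_ab


def normalized_mi(col_a, col_b):
    """MI divided by the pair entropy (assumed normalization)."""
    mi, h_ab = mi_and_pair_entropy(col_a, col_b)
    if h_ab == 0:                        # both columns fully conserved
        return 0.0
    return mi / h_ab


# Toy columns: each string holds one residue per aligned sequence.
covarying = normalized_mi("AACC", "GGTT")    # residues change together
independent = normalized_mi("AACC", "GTGT")  # residues vary independently
```

In this toy case the perfectly covarying columns give a normalized value of 1 and the independent columns give 0. Real alignments have few sequences relative to the number of possible residue pairs, which is the finite sample size effect the abstract refers to, so raw values from a sketch like this would need a background estimate before interpretation.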
Pages: 4116-4124
Page count: 9