A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses

Cited by: 2
Authors
Gong, Yu-Nong [2 ]
Chen, Guang-Wu [1 ,3 ]
Suchard, Marc A. [4 ,5 ,6 ]
Affiliations
[1] Chang Gung Univ, Dept Comp Sci & Informat Engn, Tao Yuan 333, Taiwan
[2] Chang Gung Univ, Grad Inst Elect Engn, Tao Yuan 333, Taiwan
[3] Chang Gung Univ, Res Ctr Emerging Viral Infect, Tao Yuan 333, Taiwan
[4] Univ Calif Los Angeles, David Geffen Sch Med, Dept Biomath, Los Angeles, CA 90095 USA
[5] Univ Calif Los Angeles, David Geffen Sch Med, Dept Human Genet, Los Angeles, CA 90095 USA
[6] Univ Calif Los Angeles, Sch Publ Hlth, Dept Biostat, Los Angeles, CA 90024 USA
Keywords
Co-evolution; Amino acid substitution matrix; Mutual information; Bayesian Evolutionary Analysis Sampling; Trees; Influenza virus; STRUCTURE PREDICTION; PROTEIN RESIDUES; HIGH VIRULENCE; HOST-RANGE; IDENTIFICATION; COEVOLUTION; EVOLUTION; ALIGNMENTS; POLYMERASE;
DOI
10.1016/j.compbiolchem.2012.06.004
Chinese Library Classification
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Mutual information (MI) is an approach commonly used to estimate the evolutionary correlation of two amino acid sites. Although several MI methods exist, prior to our contribution no systematic method had been developed to assess their performance or to establish numerical thresholds for detecting co-evolving amino acid sites. The current study ran a Markov chain Monte Carlo (MCMC) algorithm on influenza viral sequences to capture their evolutionary characteristics. A consensus maximum clade credibility (MCC) tree was estimated from the samples, together with their amino acid substitution statistics, from which we generated synthetic sequences of known dependent and independent paired amino acid sites. A pair-to-pair and influenza-specific amino acid substitution matrix (P2PFLU) incorporated into Bayesian Evolutionary Analysis Sampling Trees (BEAST) enumerated these synthetic sequences. The sequences inherited evolutionary features and co-varying characteristics from the real viral sequences, rendering these synthetic data ideal for exploring their co-evolving features. For the MI measure, we proposed a novel metric called the empirical MI (MIEm), which outperformed other MI measures in receiver operating characteristic (ROC) analysis. We applied our approach to 1086 all-time PB2 sequences of influenza A H5N1 viruses, in which we found 97 sites exhibiting co-evolutionary substitution with one or more other amino acid sites. In particular, PB2 451, along with eight other PB2 sites of various MIEm scores, was found to co-evolve with PB2 627, a known species-associated amino acid residue that plays a critical role in influenza virus replication. © 2012 Elsevier Ltd. All rights reserved.
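As a rough illustration of the quantity the abstract builds on, the following Python sketch computes the standard Shannon mutual information between two alignment columns from their observed frequencies. It is not the MIEm metric proposed in the paper, whose definition is not given in this abstract. The function name `column_mutual_information` and the toy columns are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_x, col_y):
    """Shannon mutual information (in bits) between two alignment columns.

    Illustrative only: this is the plain frequency-based MI, not the
    paper's MIEm. Each column is a sequence of residues, one per aligned
    sequence, in the same sequence order.
    """
    if len(col_x) != len(col_y):
        raise ValueError("columns must cover the same sequences")
    n = len(col_x)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_x = Counter(col_x)             # marginal counts at site x
    count_y = Counter(col_y)             # marginal counts at site y
    count_xy = Counter(zip(col_x, col_y))  # joint counts of residue pairs

    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


# Hypothetical toy columns (not data from the paper).
site_i = "EEEEKKKK"
site_j = "TTTTSSSS"  # changes exactly when site_i changes
site_k = "AVAVAVAV"  # varies independently of site_i

mi_dependent = column_mutual_information(site_i, site_j)
mi_independent = column_mutual_information(site_i, site_k)
print(mi_dependent, mi_independent)
```

In this toy case the perfectly co-varying pair scores 1 bit and the independent pair scores 0 bits. Real alignments fall between such extremes and are confounded by shared phylogeny, which is why the study calibrates thresholds against synthetic sequences with known dependent and independent sites.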
Pages: 20-28
Page count: 9