Amino acid sequence autocorrelation vectors and ensembles of Bayesian-regularized genetic neural networks for prediction of conformational stability of human lysozyme mutants

Cited by: 56
Authors
Caballero, Julio
Fernandez, Leyden
Abreu, Jose Ignacio
Fernandez, Michael [1]
Affiliations
[1] Univ Matanzas, Fac Agron, Ctr Biotechnol Studies, Mol Modeling Grp, Matanzas 44740, Cuba
[2] Univ Matanzas, Fac Informat, Artificial Intelligence Lab, Matanzas 44740, Cuba
DOI
10.1021/ci050507z
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Developing novel computational approaches for modeling protein properties from primary structure is a main goal in applied proteomics. In this work, we report the extension of the autocorrelation vector formalism to amino acid sequences in order to encode protein structural information for modeling purposes. Amino Acid Sequence Autocorrelation (AASA) vectors were calculated by measuring, on the protein primary structure, the autocorrelations at sequence lags ranging from 1 to 15 of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors (48 properties × 15 lags) were tested for building predictive models of the thermal unfolding Gibbs free energy change of human lysozyme mutants. Ensembles of Bayesian-Regularized Genetic Neural Networks (BRGNNs) were used to obtain an optimum nonlinear model of conformational stability. The ensemble predictor explained about 88% and 68% of the variance of the data in the training and test sets, respectively. Furthermore, the optimum AASA vector subset not only successfully modeled unfolding thermal stability but also distributed wild-type and mutant lysozymes on a stability self-organizing map (SOM) when used for unsupervised training of competitive neurons.
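The descriptor count in the abstract (48 properties at lags 1 to 15, giving 720 values) can be illustrated with a minimal sketch. The abstract does not state the exact autocorrelation formula or normalization, so the form used here (mean product of property values at positions i and i + lag) is an assumption, and the function names `autocorrelation` and `aasa_vector` as well as the toy property table are hypothetical, not taken from the paper or from AAindex.

```python
from typing import Dict, List, Sequence


def autocorrelation(values: Sequence[float], lag: int) -> float:
    """Mean product of property values separated by `lag` positions.

    Assumed form; the abstract does not give the exact normalization.
    """
    n = len(values)
    if lag < 1 or lag >= n:
        raise ValueError("lag must satisfy 1 <= lag < sequence length")
    pairs = n - lag
    return sum(values[i] * values[i + lag] for i in range(pairs)) / pairs


def aasa_vector(
    sequence: str,
    property_tables: Sequence[Dict[str, float]],
    max_lag: int = 15,
) -> List[float]:
    """Concatenate autocorrelations for every property and every lag 1..max_lag."""
    descriptors: List[float] = []
    for table in property_tables:
        # Map each residue to its numeric property value.
        values = [table[residue] for residue in sequence]
        for lag in range(1, max_lag + 1):
            descriptors.append(autocorrelation(values, lag))
    return descriptors


# Toy two-letter alphabet with made-up values (NOT real AAindex data).
toy_table = {"A": 1.0, "B": -1.0}
toy_sequence = "AB" * 10  # 20 residues, long enough for lags up to 15

# 48 property tables x 15 lags reproduces the 720-descriptor count.
vector = aasa_vector(toy_sequence, [toy_table] * 48, max_lag=15)
```

With real data, each of the 48 tables would map the 20 standard amino acids to one AAindex property, and the sequence would be a wild-type or mutant lysozyme sequence.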
Pages: 1255-1268
Number of pages: 14
Related papers
75 in total
[1]   On the use of neural network ensembles in QSAR and QSPR [J].
Agrafiotis, DK ;
Cedeño, W ;
Lobanov, VS .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2002, 42 (04) :903-911
[2]   NEURAL NETWORKS APPLIED TO STRUCTURE-ACTIVITY-RELATIONSHIPS [J].
AOYAMA, T ;
SUZUKI, Y ;
ICHIKAWA, H .
JOURNAL OF MEDICINAL CHEMISTRY, 1990, 33 (03) :905-908
[3]   Locating biologically active compounds in medium-sized heterogeneous datasets by topological autocorrelation vectors: Dopamine and benzodiazepine agonists [J].
Bauknecht, H ;
Zell, A ;
Bayer, H ;
Levi, P ;
Wagener, M ;
Sadowski, J ;
Gasteiger, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1996, 36 (06) :1205-1213
[4]   Chance correlation in variable subset regression: Influence of the objective function, the selection mechanism, and ensemble averaging [J].
Baumann, K .
QSAR & COMBINATORIAL SCIENCE, 2005, 24 (09) :1033-1046
[5]   ProTherm, version 4.0: thermodynamic database for proteins and mutants [J].
Bava, KA ;
Gromiha, MM ;
Uedaira, H ;
Kitajima, K ;
Sarai, A .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D120-D121
[6]   Prudent modeling of core polar residues in computational protein design [J].
Bolon, DN ;
Marcus, JS ;
Ross, SA ;
Mayo, SL .
JOURNAL OF MOLECULAR BIOLOGY, 2003, 329 (03) :611-622
[7]   Large-scale prediction of protein geometry and stability changes for arbitrary single point mutations [J].
Bordner, AJ ;
Abagyan, RA .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2004, 57 (02) :400-413
[8]   Robust QSAR models using Bayesian regularized neural networks [J].
Burden, FR ;
Winkler, DA .
JOURNAL OF MEDICINAL CHEMISTRY, 1999, 42 (16) :3183-3187
[9]   Linear and nonlinear modeling of antifungal activity of some heterocyclic ring derivatives using multiple linear regression and Bayesian-regularized neural networks [J].
Caballero, J ;
Fernández, M .
JOURNAL OF MOLECULAR MODELING, 2006, 12 (02) :168-181
[10]   Predicting protein stability changes from sequences using support vector machines [J].
Capriotti, E ;
Fariselli, P ;
Calabrese, R ;
Casadio, R .
BIOINFORMATICS, 2005, 21 :54-58