Sequence analysis and rule development of predicting protein stability change upon mutation using decision tree model

Cited by: 23
Authors
Huang, Liang-Tsung
Gromiha, M. Michael
Ho, Shinn-Ying [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Biol Sci & Technol, Hsinchu 300, Taiwan
[2] Natl Chiao Tung Univ, Inst Bioinformat, Hsinchu 300, Taiwan
[3] Natl Inst Adv Ind Sci & Technol, CBRC, Koto Ku, Tokyo 1350064, Japan
[4] Ming Dao Univ, Dept Comp Sci & Informat Engn, Changhua 523, Taiwan
[5] Feng Chia Univ, Inst Comp Sci & Informat Engn, Taichung 407, Taiwan
Keywords
bioinformatics; data mining; decision trees; prediction; protein stability
DOI
10.1007/s00894-007-0197-4
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Understanding the mechanism of protein stability change is one of the most challenging tasks. Recently, the prediction of protein stability changes caused by single point mutations has become a topic of interest in molecular biology. However, it remains desirable to extract further knowledge from large databases to gain new insight into the nature of these changes. This paper presents an interpretable prediction tree method, iPTREE-2, that accurately predicts changes in protein stability upon mutation from sequence-based information and analyzes sequence characteristics in terms of composition and order. Because it is based on a regression tree algorithm, iPTREE-2 can also identify important factors and derive rules for data mining. On a dataset of 1859 different single point mutations from the thermodynamic database ProTherm, iPTREE-2 yields a correlation coefficient of 0.70 between predicted and experimental values. In the data mining task, detailed sequence analysis suggests that residue composition is specific to different ranges of stability change and implies the existence of certain patterns. In building rules, we found that the mutated residue in the wild type and the substituted residue in the mutant protein play an important role. The present study demonstrates that iPTREE-2 can serve to predict protein stability change, especially when more interpretable knowledge is required.
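The general idea in the abstract (a regression tree trained on sequence-derived features, scored by the correlation between predicted and experimental values) can be illustrated with a minimal sketch. This is not the iPTREE-2 implementation: the feature encoding, the `toy_score` scale, the tree parameters, and the data are all assumptions made for illustration, and the data are synthetic rather than taken from ProTherm.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(residue):
    # Arbitrary placeholder scale in [0, 1]; NOT a real physico-chemical property.
    return AMINO_ACIDS.index(residue) / (len(AMINO_ACIDS) - 1)


def encode(wild, mutant, window):
    # Hypothetical features: wild-type score, mutant score, and a simple
    # composition count over the neighbouring residues.
    return [toy_score(wild), toy_score(mutant),
            sum(window.count(a) for a in "AILMFVW")]


def sse(values):
    # Sum of squared errors around the mean (0 for an empty list).
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build_tree(X, y, depth=0, max_depth=4, min_leaf=5):
    # Greedy regression tree: a leaf is a float (mean target),
    # an internal node is (feature_index, threshold, left, right).
    mean = sum(y) / len(y)
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:
        return mean
    _, j, t, left, right = best
    return (j, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       depth + 1, max_depth, min_leaf),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       depth + 1, max_depth, min_leaf))


def predict(tree, row):
    # Follow the split rules until a leaf value is reached.
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


def pearson(a, b):
    # Pearson correlation coefficient between two equal-length sequences.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a)
                  * sum((z - mb) ** 2 for z in b)) ** 0.5


# Synthetic mutations: the target is a made-up function of the toy scores
# plus noise, standing in for an experimental stability change.
random.seed(0)
X, y = [], []
for _ in range(200):
    wild = random.choice(AMINO_ACIDS)
    mutant = random.choice(AMINO_ACIDS.replace(wild, ""))
    window = "".join(random.choice(AMINO_ACIDS) for _ in range(6))
    X.append(encode(wild, mutant, window))
    y.append(toy_score(mutant) - toy_score(wild) + random.gauss(0, 0.1))

tree = build_tree(X[:150], y[:150])           # train on the first 150
predicted = [predict(tree, row) for row in X[150:]]
r_test = pearson(predicted, y[150:])          # evaluate on the held-out 50
print(f"held-out correlation: {r_test:.2f}")
```

Each path from the root of `tree` to a leaf reads as an if-then rule over the features, which is the sense in which a tree model is interpretable; the correlation printed here reflects only the synthetic data and says nothing about the reported value of 0.70.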
Pages: 879-890
Number of pages: 12