SVM-cabins: Prediction of solvent accessibility using accumulation cutoff set and support vector machine

Cited by: 26

Authors:

Wang, Jung-Ying

Lee, Hahn-Ming

Ahmad, Shandar [1]

Affiliations:

[1] Jamia Millia Islamia, Dept Biosci, New Delhi 110025, India

[2] Natl Taiwan Univ Sci & Technol, Dept Comp Sci & Informat Engn, Taipei 106, Taiwan

[3] Lunghwa Univ Sci & Technol, Dept Multimedia & Game Sci, Tao Yuan 333, Taiwan

[4] Acad Sinica, Inst Sci Informat, Taipei 115, Taiwan

[5] Natl Inst Biomed Innovat, Osaka, Japan

Source:

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS | 2007 / Vol. 68 / Issue 01

Keywords:

relative solvent accessibility; protein structure prediction; support vector machine;

DOI:

10.1002/prot.21422

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification code:

071010 ; 081704 ;

Abstract:

A number of methods have been developed for predicting levels of solvent accessibility, or accessible surface area (ASA), of amino acid residues in proteins. These methods predict either regularly spaced states of relative solvent accessibility or an analogue real value indicating relative solvent accessibility. While discrete states of exposure can easily be obtained by post-prediction assignment of thresholds to the predicted or computed real values of ASA, the reverse, that is, obtaining a real value from quantized states of predicted ASA, is not straightforward, because a two-state prediction in such cases would give large real-valued errors. However, predicting ASA in a larger number of states and then finding a corresponding scheme for real-value prediction may help integrate the two approaches to ASA prediction. We report a novel method of obtaining numerical real values of solvent accessibility using an accumulation cutoff set and a support vector machine. This so-called SVM-Cabins method first predicts discrete states of ASA of amino acid residues from their evolutionary profile and then maps the predicted states onto a real-valued linear space by simple algebraic methods. The resulting performance of this rigorous approach using 13-state ASA prediction is at least comparable with the best methods of ASA prediction reported so far. The mean absolute error of this method reaches its best value of 15.1% on the tested data set of 502 proteins, with a correlation coefficient of 0.66. Since the method starts with the prediction of discrete states of ASA and leads to real-value predictions, prediction performance in binary states and in real values is optimized simultaneously.
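The two-step idea in the abstract (predict a discrete ASA state, then map it to a real value) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the cutoff values or the algebraic mapping, so the cutoff list, the "count the cutoffs exceeded" rule, the bin-midpoint mapping, and all function names below are assumptions chosen for illustration. The binary votes stand in for the outputs of per-cutoff classifiers such as SVMs.

```python
import math
from typing import Sequence

# Hypothetical cutoff set (percent relative ASA). The paper's actual
# 13-state cutoffs are not listed in the abstract.
EXAMPLE_CUTOFFS = [10, 20, 30, 40, 50, 60, 70, 80, 90]


def state_from_votes(votes: Sequence[bool]) -> int:
    """Assumed accumulation rule: the state is the number of cutoffs
    that the per-cutoff binary classifiers say the residue exceeds."""
    return sum(1 for v in votes if v)


def to_state(rsa: float, cutoffs: Sequence[float]) -> int:
    """Quantize a real relative ASA value into a state index."""
    return sum(1 for c in cutoffs if rsa > c)


def state_to_real(state: int, cutoffs: Sequence[float],
                  upper: float = 100.0) -> float:
    """Assumed mapping: return the midpoint of the bin for this state."""
    low = 0.0 if state == 0 else float(cutoffs[state - 1])
    high = float(cutoffs[state]) if state < len(cutoffs) else upper
    return (low + high) / 2.0


def mean_absolute_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient, computed directly."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Two classifiers vote "above cutoff", the rest vote "below".
    votes = [True, True] + [False] * 7
    state = state_from_votes(votes)
    print(state, state_to_real(state, EXAMPLE_CUTOFFS))
```

With more states the bins become narrower, so the midpoint is closer to any true value falling in the same bin. That is consistent with the abstract's remark that a two-state prediction would give large real-valued errors.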

Citation

Pages: 82-91

Page count: 10

25 references in total

[1] Accurate prediction of solvent accessibility using neural networks-based regression [J].