Sequence-based Protein-Ca2+ Binding Site Prediction Using SVM Classifier Ensemble with Random Under-Sampling
Cited by: 0
Authors:
Qiao, Liang [1]
Xie, Dongqing [1]
Affiliations:
[1] Guangzhou Univ, Sch Math & Informat Sci, Guangzhou 510006, Guangdong, Peoples R China
Source:
PROCEEDINGS OF 2017 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC 2017) | 2017
Funding:
National Natural Science Foundation of China;
Keywords:
Protein-Ca2+ binding site prediction;
Imbalanced data learning;
Random under sampling;
Support vector machine;
BINDING-SITES;
PROTEIN;
RESIDUES;
DATABASE;
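DOI:
Not available
Chinese Library Classification (CLC) number:
TP [Automation technology, computer technology];
Discipline classification code:
0812;
Abstract: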
Calcium ions (Ca2+) are crucial for protein function. They participate in enzyme catalysis, play regulatory roles, and help maintain protein structure. Accurately recognizing Ca2+-binding sites is of significant importance for protein function analysis. Although much progress has been made, challenges remain, especially in the post-genome era, in which large numbers of proteins without functional annotation are rapidly accumulating. In this study, we design a new ab initio predictor, CaSite, to identify Ca2+-binding residues from protein sequence. CaSite first uses evolutionary information, predicted secondary structure, predicted solvent accessibility, and Jensen-Shannon divergence information to represent the features of each residue sample. A mean ensemble classifier, constructed from support vector machines (SVMs) trained on multiple random under-samplings, is used as the final prediction model; this is effective for relieving the negative influence of the imbalance between positive and negative training samples. Experimental results demonstrate that the proposed CaSite achieves better prediction performance than the existing sequence-based predictor, TargetS.
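The random under-sampling and mean-ensemble step described in the abstract can be illustrated with a minimal sketch. The function names, the 1:1 sampling ratio, and the number of models are assumptions made for illustration and are not taken from the paper. To keep the sketch dependency-free, the base learner is a simple nearest-centroid scorer standing in for an SVM; in practice an SVM that outputs probabilities would be plugged in through the `fit_base` argument.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Scorer = Callable[[Vector], float]  # maps a feature vector to a score in [0, 1]


def random_under_sample(
    X: Sequence[Vector], y: Sequence[int], rng: random.Random
) -> Tuple[List[Vector], List[int]]:
    """Keep every positive sample and an equally sized random subset of negatives."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    keep = pos + rng.sample(neg, min(len(pos), len(neg)))
    return [X[i] for i in keep], [y[i] for i in keep]


def fit_centroid_scorer(X: Sequence[Vector], y: Sequence[int]) -> Scorer:
    """Hypothetical stand-in base learner (NOT an SVM): nearest-centroid scoring."""

    def centroid(label: int) -> List[float]:
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c_pos, c_neg = centroid(1), centroid(0)

    def score(x: Vector) -> float:
        d_pos, d_neg = math.dist(x, c_pos), math.dist(x, c_neg)
        total = d_pos + d_neg
        # Closer to the positive centroid gives a score nearer to 1.
        return 0.5 if total == 0 else d_neg / total

    return score


def fit_mean_ensemble(
    X: Sequence[Vector],
    y: Sequence[int],
    fit_base: Callable[[Sequence[Vector], Sequence[int]], Scorer],
    n_models: int = 10,
    seed: int = 0,
) -> Scorer:
    """Train one base model per balanced subset and average their scores."""
    rng = random.Random(seed)
    scorers = [fit_base(*random_under_sample(X, y, rng)) for _ in range(n_models)]

    def score(x: Vector) -> float:
        return sum(s(x) for s in scorers) / len(scorers)

    return score


# Toy imbalanced data (made up): 5 "binding" residues vs. 100 "non-binding" ones.
data_rng = random.Random(1)
X_toy = [(data_rng.gauss(1, 0.1), data_rng.gauss(1, 0.1)) for _ in range(5)]
X_toy += [(data_rng.gauss(0, 0.1), data_rng.gauss(0, 0.1)) for _ in range(100)]
y_toy = [1] * 5 + [0] * 100

ensemble = fit_mean_ensemble(X_toy, y_toy, fit_centroid_scorer, n_models=5)
print(ensemble((0.9, 1.1)), ensemble((0.1, -0.1)))
```

Each base model sees all positive samples but a different random subset of negatives, so averaging the scores uses more of the negative class than any single balanced subset would.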
Pages: 86-90
Number of pages: 5