Prediction of protein amidation sites by feature selection and analysis

Cited by: 0

Authors:

Weiren Cui

Shen Niu

Lulu Zheng

Lele Hu

Tao Huang

Lei Gu

Kaiyan Feng

Ning Zhang

Yudong Cai

Yixue Li

Affiliations:

[1] Institute of Systems Biology, CAS

[2] Shanghai University, MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences

[3] Chinese Academy of Sciences, Department of Mathematics, College of Science

[4] Shanghai University, Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences

[5] Chinese Academy of Sciences, Hubei Bioinformatics and Molecular Imaging Key Laboratory

[6] Shanghai Center for Bioinformation Technology, Division of Theoretical Bioinformatics (BO80)

[7] Huazhong University of Science and Technology, Tianjin Key Lab of BME Measurement, Department of Biomedical Engineering

[8] German Cancer Research Center

[9] Tianjin University

Source:

Molecular Genetics and Genomics | 2013 / Vol. 288

Keywords:

Amidation; Maximum relevance minimum redundancy; Incremental feature selection; Nearest neighbor algorithm

DOI:

Not available

Abstract:

Carboxy-terminal α-amidation is a post-translational modification of proteins found widely in vertebrates and invertebrates. The α-amide group is required for full biological activity since, by preventing ionization of the C-terminus, it may render a peptide more hydrophobic and thus better able to bind to other proteins. However, C-terminal amidation is difficult to detect because experimental methods are often labor-intensive, time-consuming, and expensive. In silico methods can therefore complement experimental approaches owing to their high efficiency. In this study, a computational method was developed to predict protein amidation sites by combining the maximum relevance minimum redundancy (mRMR) method with incremental feature selection (IFS) based on the nearest neighbor algorithm. From a total of 735 features, 41 optimal features were selected and used to construct the final predictor, which achieved an overall Matthews correlation coefficient (MCC) of 0.8308. Feature analysis showed that PSSM conservation scores and amino acid factors played the most important roles in α-amidation site prediction. Site-specific feature analysis showed that features derived from the amidation site itself and from adjacent sites were the most significant. The method presented here could serve as an efficient tool for predicting amidated peptides computationally. The selected features may also shed light on the mechanisms of the amidation modification and provide guidelines for experimental validation.
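
The selection loop described in the abstract can be illustrated with a minimal sketch. It assumes a feature ranking is already available (for example from mRMR, which is not implemented here) and then grows the feature set one ranked feature at a time, scoring each set with a nearest neighbor classifier and keeping the set with the highest Matthews correlation coefficient. The leave-one-out evaluation, the squared Euclidean distance, the function names, and the toy data are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def nn_predict(train_X, train_y, x, feats):
    """1-nearest-neighbor label, using squared Euclidean distance
    restricted to the feature indices in `feats` (assumed metric)."""
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((train_X[i][f] - x[f]) ** 2 for f in feats),
    )
    return train_y[nearest]

def loo_mcc(X, y, feats):
    """Leave-one-out MCC of the 1-NN classifier on a feature subset
    (leave-one-out is an assumed evaluation protocol)."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        pred = nn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], feats)
        if pred == 1 and y[i] == 1:
            tp += 1
        elif pred == 0 and y[i] == 0:
            tn += 1
        elif pred == 1:
            fp += 1
        else:
            fn += 1
    return mcc(tp, tn, fp, fn)

def incremental_feature_selection(X, y, ranked):
    """Evaluate the top-1, top-2, ... ranked features and return the
    smallest prefix that reaches the highest MCC."""
    best_k, best_score = 0, -2.0  # MCC is always >= -1
    for k in range(1, len(ranked) + 1):
        score = loo_mcc(X, y, ranked[:k])
        if score > best_score:
            best_k, best_score = k, score
    return ranked[:best_k], best_score

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 1.2], [1.1, 5.1], [0.9, 8.8]]
y = [0, 0, 0, 1, 1, 1]
ranked = [0, 1]  # assumed to come from an mRMR-style ranking

selected, best = incremental_feature_selection(X, y, ranked)
print(selected, best)
```

On this toy data the loop keeps only the first ranked feature, because adding the noisy second feature lowers the leave-one-out MCC. In the study itself, the analogous procedure reduced 735 candidate features to 41.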

Pages: 391-400

Number of pages: 9
