Prediction of protein amidation sites by feature selection and analysis

Cited by: 0
Authors
Weiren Cui
Shen Niu
Lulu Zheng
Lele Hu
Tao Huang
Lei Gu
Kaiyan Feng
Ning Zhang
Yudong Cai
Yixue Li
Affiliations
[1] Institute of Systems Biology, CAS
[2] Shanghai University, MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences
[3] Chinese Academy of Sciences, Department of Mathematics, College of Science
[4] Shanghai University, Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences
[5] Chinese Academy of Sciences, Hubei Bioinformatics and Molecular Imaging Key Laboratory
[6] Shanghai Center for Bioinformation Technology, Division of Theoretical Bioinformatics (BO80)
[7] Huazhong University of Science and Technology, Tianjin Key Lab of BME Measurement, Department of Biomedical Engineering
[8] German Cancer Research Center
[9] Tianjin University
Source
Molecular Genetics and Genomics | 2013 / Vol. 288
Keywords
Amidation; Maximum relevance minimum redundancy; Incremental feature selection; Nearest neighbor algorithm
DOI
Not available
Abstract
Carboxy-terminal α-amidation is a post-translational modification of proteins found widely in vertebrates and invertebrates. The α-amide group is required for full biological activity: by preventing ionization of the C-terminus, it may render a peptide more hydrophobic and thus better able to bind to other proteins. However, C-terminal amidation is difficult to detect because experimental methods are often labor-intensive, time-consuming, and expensive, so in silico methods may complement them owing to their high efficiency. In this study, a computational method was developed to predict protein amidation sites by combining the maximum relevance minimum redundancy (mRMR) method with the incremental feature selection (IFS) method based on the nearest neighbor algorithm. From a total of 735 features, 41 optimal features were selected and used to construct the final predictor, which achieved an overall Matthews correlation coefficient (MCC) of 0.8308. Feature analysis showed that position-specific scoring matrix (PSSM) conservation scores and amino acid factors played the most important roles in α-amidation site prediction. Site-specific feature analysis showed that features derived from the amidation site itself and its adjacent sites were the most significant. The method could serve as an efficient tool for predicting amidated peptides in silico, and the selected features could shed light on the mechanisms of amidation and provide guidelines for experimental validation.
Pages: 391-400
Page count: 9
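
The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the features have already been ranked (for example by mRMR), uses squared Euclidean distance for the nearest neighbor step, and scores each candidate subset by leave-one-out MCC. The distance measure, the validation scheme, the function names, and the toy data are all assumptions made for illustration.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def nn_predict(train_X, train_y, x, feats):
    """1-nearest-neighbor label for x, using only the feature indices in feats.
    Squared Euclidean distance is an assumption, not taken from the paper."""
    def dist(a, b):
        return sum((a[j] - b[j]) ** 2 for j in feats)
    nearest = min(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return train_y[nearest]

def loo_mcc(X, y, feats):
    """Leave-one-out MCC of the 1-NN classifier on one feature subset."""
    preds = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        preds.append(nn_predict(rest_X, rest_y, X[i], feats))
    return mcc(y, preds)

def incremental_feature_selection(X, y, ranked):
    """Score the top-1, top-2, ... ranked features and keep the best subset.
    `ranked` is a precomputed ordering of feature indices (hypothetical input)."""
    best_feats, best_score = [], -1.0
    for k in range(1, len(ranked) + 1):
        feats = ranked[:k]
        score = loo_mcc(X, y, feats)
        if score > best_score:  # ties keep the smaller subset
            best_feats, best_score = feats, score
    return best_feats, best_score

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.2], [0.9, 0.8]]
y = [0, 0, 1, 1]
print(incremental_feature_selection(X, y, ranked=[0, 1]))  # ([0], 1.0)
```

On the toy data, adding the noisy second feature lowers the leave-one-out MCC, so the loop stops at the single informative feature. The same pattern, applied to a ranked list of 735 features, is how a reduced subset such as the 41 features reported in the abstract could be chosen.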