Succinylation Site Prediction Based on Protein Sequences Using the IFS-LightGBM (BO) Model

Cited by: 20
Authors
Zhang, Lu [1]
Liu, Min [1]
Qin, Xinyi [1]
Liu, Guangzhong [1]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, 1550 Haigang Ave, Shanghai 201306, Peoples R China
Funding
Natural Science Foundation of Shanghai;
Keywords
LYSINE SUCCINYLATION; POSTTRANSLATIONAL MODIFICATION; UBIQUITINATION SITES; IDENTIFICATION; EXPRESSION; PATTERNS; SIRT5; TOOL;
DOI
10.1155/2020/8858489
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Succinylation is an important posttranslational modification of proteins that plays a key role in regulating protein conformation and controlling cellular function. Many studies have shown that succinylation of protein lysine residues is closely related to the occurrence of many diseases. To understand the mechanism of succinylation in depth, it is necessary to identify succinylation sites in proteins accurately. In this study, we develop a new model, IFS-LightGBM (BO), which combines the incremental feature selection (IFS) method, the LightGBM feature selection method, the Bayesian optimization algorithm, and the LightGBM classifier to predict succinylation sites in proteins. Specifically, pseudo amino acid composition (PseAAC), position-specific scoring matrix (PSSM), disorder status, and composition of k-spaced amino acid pairs (CKSAAP) are first employed to extract feature information. Then, the LightGBM feature selection method is combined with IFS to select the optimal feature subset for the LightGBM classifier. Finally, to increase prediction accuracy and reduce the computational load, the Bayesian optimization algorithm is used to tune the parameters of the LightGBM classifier. The results show that the IFS-LightGBM (BO)-based prediction model performs better when evaluated with common metrics such as accuracy, recall, precision, Matthews correlation coefficient (MCC), and F-measure.
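The feature selection step described in the abstract can be illustrated with a minimal sketch of incremental feature selection scored by MCC. This is not the authors' implementation: the function names, the feature names, and the confusion-matrix counts below are hypothetical, and the toy evaluator stands in for what the abstract describes as training and evaluating a LightGBM classifier on each candidate subset. The ranked list is assumed to come from some importance ranking (in the paper, the LightGBM feature selection method).

```python
from math import sqrt
from typing import Callable, List, Sequence, Tuple


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Grow the feature set one ranked feature at a time and keep the
    prefix with the highest score (ties keep the smaller subset)."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for n in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:n])
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical ranking, most important first (assumption, not from the paper).
ranked = ["f1", "f2", "f3", "f4"]

# Hypothetical (tp, tn, fp, fn) counts per subset size, standing in for
# the result of evaluating a classifier trained on that subset.
toy_counts = {
    1: (60, 55, 45, 40),
    2: (70, 68, 32, 30),
    3: (80, 78, 22, 20),
    4: (78, 77, 23, 22),
}


def toy_evaluate(subset: List[str]) -> float:
    return mcc(*toy_counts[len(subset)])


best_subset, best_score = incremental_feature_selection(ranked, toy_evaluate)
print(best_subset, round(best_score, 3))
```

In this toy run the score peaks at the first three features and drops when the fourth is added, which is the situation IFS is meant to detect. In practice the evaluator would typically use cross-validation, and any of the other metrics named in the abstract could replace MCC as the selection criterion.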
Pages: 15