XGBFEMF: An XGBoost-Based Framework for Essential Protein Prediction

Cited by: 101
Authors
Zhong, Jiancheng [1]
Sun, Yusui [1]
Peng, Wei [2]
Xie, Minzhu [1]
Yang, Jiahong [1]
Tang, Xiwei [3]
Affiliations
[1] Hunan Normal Univ, Sch Informat Sci & Engn, Changsha 410081, Hunan, Peoples R China
[2] Kunming Univ Sci & Technol, Comp Ctr, Kunming 650050, Yunnan, Peoples R China
[3] Hunan First Normal Univ, Dept Informat Sci & Engn, Changsha 410205, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Essential protein; feature engineering; multi-model fusion; XGBoost; SUB-EXPAND-SHRINK; XGBFEMF; ESSENTIAL GENES; SUBCELLULAR-LOCALIZATION; CENTRALITY; NETWORKS; DATABASE; GENOME; IDENTIFICATION; BETWEENNESS;
DOI
10.1109/TNB.2018.2842219
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Essential proteins are vital to sustaining cellular life and play an important role in biological research and drug design. As large amounts of biological data related to essential proteins have become available, an increasing number of computational methods for identifying them have been proposed. Unlike methods that adopt a single machine learning model or an ensemble learning method, this paper proposes a prediction framework, XGBFEMF, for identifying essential proteins. The framework includes a SUB-EXPAND-SHRINK method, which constructs composite features from the original features and selects a better feature subset for essential protein prediction, and a model fusion method, which yields a more effective prediction model. We carry out experiments on yeast data to assess the performance of XGBFEMF using ROC analysis, accuracy analysis, and top analysis, and we run further experiments on E. coli data to validate its performance. The test results show that the XGBFEMF framework effectively improves many of the key evaluation indicators. In addition, we analyze each step of the framework; the results show that each step of the SUB-EXPAND-SHRINK method, as well as the multi-model fusion step, improves prediction performance.
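The abstract names three ideas: building composite features, shrinking them to a better subset, and fusing several models. The sketch below illustrates those ideas on toy data only; it is not the authors' implementation. Everything specific in it is an assumption made for illustration: composite features are taken to be pairwise products, the subset is chosen by greedy forward selection, fusion is a plain average of scores, and a simple sum-of-features ranker stands in for a trained XGBoost model. The feature names `degree` and `expr` and all values are hypothetical.

```python
from itertools import combinations


def auc(scores, labels):
    """Rank-based ROC AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def expand(features):
    """EXPAND (assumed form): add the pairwise products of the original features."""
    out = dict(features)
    for a, b in combinations(sorted(features), 2):
        out[f"{a}*{b}"] = [x * y for x, y in zip(features[a], features[b])]
    return out


def subset_score(features, names, labels):
    """Stand-in for model training: rank proteins by the sum of the chosen features."""
    scores = [sum(features[name][i] for name in names) for i in range(len(labels))]
    return auc(scores, labels)


def shrink(features, labels):
    """SHRINK (assumed form): greedy forward selection on the stand-in score."""
    chosen, best = [], 0.0
    improved = True
    while improved:
        improved = False
        best_name = None
        for name in sorted(features):
            if name in chosen:
                continue
            score = subset_score(features, chosen + [name], labels)
            if score > best:
                best, best_name, improved = score, name, True
        if improved:
            chosen.append(best_name)
    return chosen, best


def fuse(score_lists):
    """Model fusion (assumed form): average the per-protein scores of several models."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


# Hypothetical toy data: four proteins, two original features, 1 = essential.
toy_features = {"degree": [5, 1, 4, 2], "expr": [0.9, 0.8, 0.1, 0.2]}
toy_labels = [1, 0, 1, 0]

expanded = expand(toy_features)
selected, selected_auc = shrink(expanded, toy_labels)
fused = fuse([[1.0, 0.0], [0.5, 0.5]])
print(selected, selected_auc, fused)
```

In a real pipeline the stand-in scorer would be replaced by cross-validated model performance, and the fused scores would come from separately trained models rather than fixed lists.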
Pages: 243-250
Page count: 8