XGBFEMF: An XGBoost-Based Framework for Essential Protein Prediction

Cited by: 101
Authors
Zhong, Jiancheng [1]
Sun, Yusui [1]
Peng, Wei [2]
Xie, Minzhu [1]
Yang, Jiahong [1]
Tang, Xiwei [3]
Affiliations
[1] Hunan Normal Univ, Sch Informat Sci & Engn, Changsha 410081, Hunan, Peoples R China
[2] Kunming Univ Sci & Technol, Comp Ctr, Kunming 650050, Yunnan, Peoples R China
[3] Hunan First Normal Univ, Dept Informat Sci & Engn, Changsha 410205, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Essential protein; feature engineering; multi-model fusion; XGBoost; SUB-EXPAND-SHRINK; XGBFEMF; ESSENTIAL GENES; SUBCELLULAR-LOCALIZATION; CENTRALITY; NETWORKS; DATABASE; GENOME; IDENTIFICATION; BETWEENNESS;
DOI
10.1109/TNB.2018.2842219
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Essential proteins are vital to sustaining cellular life and play an important role in biological research and drug design. As large amounts of biological data related to essential proteins have become available, an increasing number of computational methods have been proposed to identify them. Unlike methods that adopt a single machine learning model or an ensemble machine learning method, this paper proposes a prediction framework named XGBFEMF for identifying essential proteins. It includes a SUB-EXPAND-SHRINK method, which constructs composite features from the original features and selects a better feature subset for essential protein prediction, and a model fusion method for obtaining a more effective prediction model. We carry out experiments on yeast data to assess the performance of XGBFEMF with ROC analysis, accuracy analysis, and top analysis, and we run further experiments on E. coli data to validate its performance. The test results show that the XGBFEMF framework can effectively improve many essential indicators. In addition, we analyze each step in the XGBFEMF framework; the results show that each step of the SUB-EXPAND-SHRINK method, as well as the multi-model fusion step, can improve prediction performance.
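The three evaluation views named in the abstract (ROC analysis, accuracy analysis, and top analysis) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that "top analysis" means counting the true essential proteins among the k highest-scoring candidates, and that accuracy is measured at a fixed score threshold. The function names, the 0.5 threshold, and the toy scores and labels are all hypothetical.

```python
from typing import Sequence


def top_k_hits(scores: Sequence[float], labels: Sequence[int], k: int) -> int:
    """Count true essential proteins (label 1) among the k highest-scoring candidates."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k])


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(scores: Sequence[float], labels: Sequence[int], threshold: float = 0.5) -> float:
    """Fraction of proteins classified correctly at an assumed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical predicted scores and ground-truth labels (1 = essential).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]

print(top_k_hits(scores, labels, k=3))
print(roc_auc(scores, labels))
print(accuracy(scores, labels))
```

The pairwise AUC is quadratic in the number of proteins, which is fine for a toy example; a rank-based computation or a library routine would be the usual choice on real data.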
Pages: 243-250
Number of pages: 8
Related papers
50 in total
  • [41] Prediction of Essential Proteins Based on Local Interaction Density
    Qi, Yi
    Luo, Jiawei
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2016, 13 (06) : 1170 - 1182
  • [42] A New Method for Identifying Essential Proteins Based on Network Topology Properties and Protein Complexes
    Qin, Chao
    Sun, Yongqi
    Dong, Yadong
    PLOS ONE, 2016, 11 (08):
  • [43] Prediction of Essential Proteins Based on Overlapping Essential Modules
    Zhao, Bihai
    Wang, Jianxin
    Li, Min
    Wu, Fang-xiang
    Pan, Yi
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2014, 13 (04) : 415 - 424
  • [44] A novel scheme for essential protein discovery based on multi-source biological information
    Liu, Wei
    Ma, Liangyu
    Chen, Ling
    Chen, Bolun
    Jeon, Byeungwoo
    Qiang, Jipeng
    JOURNAL OF THEORETICAL BIOLOGY, 2020, 504
  • [45] Two New Methods for Identifying Essential Proteins Based on the Protein Complexes and Topological Properties
    Lu, Pengli
    Yu, Jingjuan
    IEEE ACCESS, 2020, 8 : 9578 - 9586
  • [46] Identifying essential proteins based on protein domains in protein-protein interaction networks
    Wang, Jianxin
    Peng, Wei
    Chen, Yingjiao
    Lu, Yu
    Pan, Yi
    2013 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2013,
  • [47] XGBoost-based prediction of on-site acceleration response spectra with multi-feature inputs from P-wave arrivals
    Dai, Haozhen
    Zhou, Yueyong
    Liu, Heyi
    Li, Shanyou
    Wei, Yongxiang
    Song, Jindong
    SOIL DYNAMICS AND EARTHQUAKE ENGINEERING, 2024, 178
  • [48] XGBoost-based machine learning test improves the accuracy of hemorrhage prediction among geriatric patients with long-term administration of rivaroxaban
    Cheng Chen
    Chun Yin
    Yanhu Wang
    Jing Zeng
    Shuili Wang
    Yurong Bao
    Yixuan Xu
    Tongbo Liu
    Jiao Fan
    Xian Liu
    BMC Geriatrics, 23
  • [49] Used Car Price Prediction Based on the Iterative Framework of XGBoost+LightGBM
    Cui, Baoyang
    Ye, Zhonglin
    Zhao, Haixing
    Renqing, Zhuome
    Meng, Lei
    Yang, Yanlin
    ELECTRONICS, 2022, 11 (18)
  • [50] Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier
    Chen, Cheng
    Zhang, Qingmei
    Yu, Bin
    Yu, Zhaomin
    Lawrence, Patrick J.
    Ma, Qin
    Zhang, Yan
    COMPUTERS IN BIOLOGY AND MEDICINE, 2020, 123