Prediction Model of Thermophilic Protein Based on Stacking Method

Cited by: 6
Authors
Wang, Xian-Fang [1, 2]
Lu, Fan [2 ]
Du, Zhi-Yong [1 ]
Li, Qi-Meng [2 ]
Affiliations
[1] Henan Inst Technol, Sch Comp Sci & Technol, Xinxiang, Henan, Peoples R China
[2] Henan Normal Univ, Sch Comp & Informat Engn, Xinxiang, Henan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Thermophilic proteins; stacking; amino acid composition; g-gap; entropy density; autocorrelation coefficient; IDENTIFICATION;
DOI
10.2174/1574893616666210727152018
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: In-depth study of the principles behind the heat resistance of thermophilic proteins is of great significance for understanding protein folding, structure, function, and evolution, and for the directed design and modification of protein molecules in protein processing. Objective: To address the low accuracy and low efficiency of thermophilic protein prediction, a thermophilic protein prediction model based on the Stacking method is proposed. Methods: Following the Stacking idea, this paper uses five feature extraction methods (amino acid composition, g-gap dipeptide, encoding based on grouped weight, entropy density, and autocorrelation coefficient) to characterize the protein sequences in the selected standard data set. An SVM with a Gaussian kernel function is then used to build the classification prediction model; the prediction results of the five methods are taken as the second-layer input, and a logistic regression model integrates them to build the Stacking-based thermophilic protein prediction model. Results: Under Jackknife validation, the proposed method reached an accuracy of up to 93.75%, a number of performance evaluation indexes were higher than those of other models, and overall performance was better than that of most reported methods. Conclusion: The model presented in this paper shows strong robustness and can significantly improve the prediction performance for thermophilic proteins.
Pages: 1328-1340
Page count: 13
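
The abstract names amino acid composition and g-gap dipeptide among its five sequence encodings but gives no formulas. The sketch below shows one common way to compute these two encodings. It assumes the usual definitions (residue frequencies over the 20 standard amino acids, and frequencies of residue pairs separated by g intervening positions), which may differ in detail from the paper. The function names and the choice to skip non-standard residues are hypothetical, for illustration only.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return a 20-dimensional vector of standard-residue frequencies.

    Assumption: non-standard symbols (e.g. X) are ignored when normalizing.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [c / total for c in counts]


def g_gap_dipeptide_composition(seq, g=0):
    """Return a 400-dimensional vector of g-gap dipeptide frequencies.

    A g-gap dipeptide pairs residue i with residue i + g + 1, so g = 0
    gives ordinary adjacent dipeptides. Pairs containing a non-standard
    symbol are skipped (an assumption, not taken from the paper).
    """
    seq = seq.upper()
    pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence too short for the requested gap")
    return [counts[p] / total for p in pairs]
```

In a stacking setup like the one the abstract describes, each such vector would feed a first-layer classifier, and the first-layer outputs would form the input of the second-layer model.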