Function Prediction of Peptide Toxins with Sequence-Based Multi-Tasking PU Learning Method

Times cited: 1
Authors
Chu, Yanyan [1, 2, 3]
Zhang, Huanhuan [1]
Zhang, Lei [3]
Affiliations
[1] Ocean Univ China, Sch Med & Pharm, Qingdao 266003, Peoples R China
[2] Pilot Natl Lab Marine Sci & Technol Qingdao, Qingdao 266200, Peoples R China
[3] Ocean Univ China, Marine Biomed Res Inst Qingdao, Qingdao 266003, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
peptide toxin; active peptide; function prediction; PU learning; sequence-based; VENOM PEPTIDES; CHANNEL; THERAPEUTICS; CLASSIFIER; ZICONOTIDE; REVEAL; DESIGN;
DOI
10.3390/toxins14110811
Chinese Library Classification number
TS2 [Food industry]
Discipline classification code
0832
Abstract
Peptide toxins generally have extreme pharmacological activities and provide a rich source for the discovery of drug leads. However, determining the optimal activity of a new peptide can be a long and expensive process. In this study, peptide toxins were retrieved from UniProt, and three positive-unlabeled (PU) learning schemes (adaptive base classifier, two-step method, and PU bagging) were adopted to develop models for predicting the biological function of new peptide toxins. All three schemes were embedded with 14 machine learning classifiers. The prediction results of the adaptive base classifier and the two-step method were highly consistent. The models with top comprehensive performances were further optimized by feature selection and hyperparameter tuning, and the models were validated by making predictions for 61 three-finger toxins or the external HemoPI dataset. Biological functions that can be identified by these models include cardiotoxicity, vasoactivity, lipid binding, hemolysis, neurotoxicity, postsynaptic neurotoxicity, hypotension, and cytolysis, with relatively weak predictions for hemostasis and presynaptic neurotoxicity. These models are discovery-prediction tools for active peptide toxins and are expected to accelerate the development of peptide toxins as drugs.
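The abstract names PU bagging as one of the three schemes but gives no implementation details. The following is a minimal sketch of the general PU bagging idea only, not the authors' pipeline. The amino-acid composition features, the nearest-centroid base classifier, and all function names are assumptions chosen to keep the example dependency-free; the study itself embedded 14 machine learning classifiers.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Assumed feature: fraction of each standard residue in the sequence."""
    counts = Counter(seq)
    n = max(len(seq), 1)
    return [counts.get(a, 0) / n for a in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pu_bagging(positives, unlabeled, n_rounds=50, seed=0):
    """Score each unlabeled vector by out-of-bag voting.

    Each round draws a bootstrap sample of the unlabeled set (as many draws
    as there are positives), treats it as provisional negatives, fits a
    nearest-centroid classifier (an assumed stand-in for any base learner),
    and lets it vote on the unlabeled items left out of the sample.
    Returns the fraction of positive votes per item, or None if an item
    was never out of bag.
    """
    rng = random.Random(seed)
    pos_centroid = centroid(positives)
    votes = [0] * len(unlabeled)
    seen = [0] * len(unlabeled)
    n_draws = min(len(positives), len(unlabeled))
    for _ in range(n_rounds):
        # Sampling with replacement; the set keeps the distinct indices.
        sample = {rng.randrange(len(unlabeled)) for _ in range(n_draws)}
        neg_centroid = centroid([unlabeled[i] for i in sample])
        for i, x in enumerate(unlabeled):
            if i in sample:
                continue  # only out-of-bag items are scored this round
            seen[i] += 1
            if sq_dist(x, pos_centroid) < sq_dist(x, neg_centroid):
                votes[i] += 1
    return [v / s if s else None for v, s in zip(votes, seen)]
```

Under these assumptions, unlabeled peptides whose composition resembles the known positives should receive scores near 1, and a threshold on the score (a further assumption) would turn the scores into predicted labels.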
Pages: 14