CPPred-FL: a sequence-based predictor for large-scale identification of cell-penetrating peptides by feature representation learning

Cited by: 109
Authors
Qiang, Xiaoli [1]
Zhou, Chen [2]
Ye, Xiucai [3]
Du, Pu-feng [4]
Su, Ran [5]
Wei, Leyi [2]
Affiliations
[1] Guangzhou Univ, Inst Comp Sci & Technol, Guangzhou, Guangdong, Peoples R China
[2] Tianjin Univ, Sch Comp Sci & Technol, Tianjin 300000, Peoples R China
[3] Univ Tsukuba, Dept Comp Sci, Tsukuba Sci City, Tsukuba, Ibaraki, Japan
[4] Tianjin Univ, Coll Intelligence & Comp, Sch Comp Sci & Technol, Tianjin, Peoples R China
[5] Tianjin Univ, Sch Comp Software, Tianjin, Peoples R China
Keywords
cell-penetrating peptide; feature representation learning; machine learning; sequence analysis; FEATURE-SELECTION; WEB SERVER; PROTEIN; SITES; SPOTS; DNA;
DOI
10.1093/bib/bby091
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Cell-penetrating peptides (CPPs) have been shown to act as transport vehicles for delivering cargoes into live cells, offering great potential as future therapeutics. Identifying CPPs is essential for a better understanding of their functional mechanisms. Machine learning-based methods have recently emerged as a main approach for the computational identification of CPPs. However, one of the main challenges is to design an effective feature representation model that sufficiently exploits the differences and relationships between CPPs and non-CPPs in order to improve predictive performance. In this paper, we have developed CPPred-FL, a powerful bioinformatics tool for fast, accurate and large-scale identification of CPPs. In our predictor, we introduce a new feature representation learning scheme that learns feature representations from a total of 45 well-trained random forest models built on multiple feature descriptors covering different perspectives, such as compositional information, position-specific information and physicochemical properties. We integrate class and probabilistic information into our feature representations. To improve the feature representation ability, we further remove redundant and irrelevant features by feature space optimization. Benchmarking experiments showed that CPPred-FL, using only 19 informative features, achieves better performance than the state-of-the-art predictors. We anticipate that CPPred-FL will be a powerful tool for large-scale identification of CPPs, facilitating the characterization of their functional mechanisms and accelerating their applications in clinical therapy.
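The feature representation learning idea in the abstract can be illustrated with a minimal Python sketch. It is not the CPPred-FL implementation. It assumes two simple descriptors (amino acid composition and a hypothetical four-group physicochemical composition) and swaps the random forest base models for a nearest-centroid stand-in so that the example needs only the standard library. The toy sequences and labels are invented for illustration and are not experimental data. Each base model contributes its probability output and its class output to the learned feature vector; the feature space optimization step is not shown.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical residue grouping, chosen only for illustration.
GROUPS = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGPH",
    "positive": "KR",
    "negative": "DE",
}


def aac(seq):
    """Compositional descriptor: fraction of each of the 20 amino acids."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def group_composition(seq):
    """Physicochemical descriptor: fraction of residues in each group."""
    return [sum(seq.count(a) for a in members) / len(seq)
            for members in GROUPS.values()]


class CentroidModel:
    """Stand-in base learner (NOT a random forest): nearest class centroid."""

    def __init__(self, descriptor):
        self.descriptor = descriptor
        self.centroids = {}

    def fit(self, seqs, labels):
        # Average the descriptor vectors of each class into one centroid.
        for cls in (0, 1):
            vecs = [self.descriptor(s) for s, y in zip(seqs, labels) if y == cls]
            self.centroids[cls] = [sum(col) / len(vecs) for col in zip(*vecs)]
        return self

    def predict_proba(self, seq):
        """Pseudo-probability of class 1 from relative centroid distances."""
        x = self.descriptor(seq)
        d_pos = math.dist(x, self.centroids[1])
        d_neg = math.dist(x, self.centroids[0])
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total


def learned_features(models, seq):
    """Concatenate probability and class outputs of every base model."""
    feats = []
    for model in models:
        p = model.predict_proba(seq)
        feats.extend([p, 1.0 if p >= 0.5 else 0.0])
    return feats


# Toy training data (invented): 1 = CPP-like, 0 = non-CPP-like.
train_seqs = ["RRRRRRRR", "KKRRKKRR", "RKRKRKRK",
              "DEDEDEDE", "GSGSGSGS", "AEAEDEAD"]
train_labels = [1, 1, 1, 0, 0, 0]

models = [CentroidModel(d).fit(train_seqs, train_labels)
          for d in (aac, group_composition)]

cationic = learned_features(models, "RRKKRRKK")
acidic = learned_features(models, "DEDEGSGS")
print(cationic)
print(acidic)
```

With two base models the learned vector has four entries (one probability and one class label per model). In a scheme like the one the abstract describes, such vectors would then be pruned by feature selection and passed to a final classifier.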
Pages: 11-23
Page count: 13