DeepBP: Ensemble deep learning strategy for bioactive peptide prediction

Cited by: 3

Authors:

Zhang, Ming [1]

Zhou, Jianren [1]

Wang, Xiaohua [1]

Wang, Xun [1]

Ge, Fang [2, 3]

Affiliations:

[1] Jiangsu Univ Sci & Technol, Sch Comp, 666 Changhui Rd, Zhenjiang 212100, Peoples R China

[2] Nanjing Univ Posts & Telecommun, State Key Lab Organ Elect & Informat Displays, 9 Wenyuan Rd, Nanjing 210023, Peoples R China

[3] Nanjing Univ Posts & Telecommun, Inst Adv Mat IAM, 9 Wenyuan Rd, Nanjing 210023, Peoples R China

Source:

BMC BIOINFORMATICS | 2024 / Vol. 25 / Issue 01

Keywords:

ACE inhibitory peptides; Anticancer peptides; Protein language model; Gated recurrent unit; Generative adversarial capsule network; ATTENTION; NETWORKS; GRU; CNN;

DOI:

10.1186/s12859-024-05974-5

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Background: Bioactive peptides are short chains of amino acids that play crucial roles in the body, such as regulating physiological processes, promoting immune responses, and exerting antibacterial effects. Because of this, they have broad application potential in drug development, food science, and biotechnology. Understanding their biological mechanisms can contribute new ideas for drug discovery and disease treatment.

Results: This study uses generative adversarial capsule networks (CapsuleGAN), gated recurrent units (GRU), and convolutional neural networks (CNN) as base classifiers and combines them by voting to form an ensemble. The method achieves high prediction performance on both the angiotensin-converting enzyme (ACE) inhibitory peptides dataset and the anticancer peptides (ACP) dataset. We first used the protein language model Evolutionary Scale Modeling (ESM-2) to extract features for the ACE inhibitory peptides and ACP datasets. After feature extraction, we trained the three deep learning models (CapsuleGAN, GRU, and CNN), adjusting the model parameters throughout training. Finally, in the voting stage, each model was assigned a weight based on its prediction accuracy, so that the performance of every model is fully used. On the ACE inhibitory peptide dataset, the balanced accuracy is 0.926, the Matthews correlation coefficient (MCC) is 0.831, and the area under the curve is 0.966; on the ACP dataset, the accuracy (ACC) is 0.779 and the MCC is 0.558. The results on both datasets exceed those of existing methods, demonstrating the effectiveness of the approach.

Conclusion: In this study, CapsuleGAN, GRU, and CNN were employed as base classifiers in an ensemble that achieved good results on both datasets and surpassed existing methods. Predicting peptides with strong ACE inhibitory activity and ACPs more accurately and quickly is significant, and this work provides useful insights for predicting other functional peptides. The source code and dataset for this experiment are publicly available at https://github.com/Zhou-Jianren/bioactive-peptides.
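The accuracy-weighted voting step and the two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the weights are derived from accuracy, so the sketch assumes soft voting with weights proportional to each model's accuracy. The function names, the probabilities, and the accuracy values are all hypothetical toy inputs, not results from the paper.

```python
import math


def weighted_soft_vote(model_probs, accuracies, threshold=0.5):
    """Combine per-model positive-class probabilities.

    Assumption: each weight is the model's accuracy divided by the sum of
    all accuracies. The paper may use a different weighting scheme.
    """
    total = sum(accuracies)
    weights = [acc / total for acc in accuracies]
    n_samples = len(model_probs[0])
    scores = [
        sum(w * probs[i] for w, probs in zip(weights, model_probs))
        for i in range(n_samples)
    ]
    labels = [1 if s >= threshold else 0 for s in scores]
    return scores, labels


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (both classes must be present)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy inputs: positive-class probabilities from three base
# classifiers for four peptides, plus made-up accuracies for each model.
capsule_gan_probs = [0.9, 0.2, 0.6, 0.4]
gru_probs = [0.8, 0.4, 0.3, 0.1]
cnn_probs = [0.7, 0.1, 0.7, 0.6]
toy_accuracies = [0.90, 0.85, 0.80]
y_true = [1, 0, 1, 0]

scores, y_pred = weighted_soft_vote(
    [capsule_gan_probs, gru_probs, cnn_probs], toy_accuracies
)
print("ensemble labels:", y_pred)
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("MCC:", mcc(y_true, y_pred))
```

In this toy case the GRU alone would misclassify the third peptide (0.3 is below the threshold), but the weighted average of the three models still lands above 0.5, which is the kind of error correction an ensemble is meant to provide.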

Pages: 19

