Deep-Representation-Learning-Based Classification Strategy for Anticancer Peptides

Cited by: 4
Authors
Khan, Shujaat [1, 2]
Affiliations
[1] King Fahd Univ Petr & Minerals, Dept Comp Engn, Coll Comp & Math, Dhahran 31261, Saudi Arabia
[2] King Fahd Univ Petr & Minerals, KFUPM Joint Res Ctr Artificial Intelligence, SDAIA, Dhahran 31261, Saudi Arabia
Keywords
anticancer peptide; composition of the g-spaced amino acid pairs; latent-space encoding; representation learning; auto-encoder; THERAPEUTIC PEPTIDES; CANCER STATISTICS; PREDICTION; IDENTIFICATION; EXPRESSION; MECHANISM; MODELS; TARGET;
DOI
10.3390/math12091330
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Cancer, with its complexity and numerous origins, continues to pose a major challenge in medical research. Anticancer peptides (ACPs) are a potential treatment option, but identifying and synthesizing them on a large scale requires accurate prediction algorithms. This study presents an intuitive classification strategy, named ACP-LSE, based on representation learning, specifically a deep latent-space encoding scheme. ACP-LSE demonstrates notable improvements in classification outcomes, particularly in scenarios with limited sample sizes and abundant features. ACP-LSE differs from typical black-box approaches by focusing on representation learning. Using an auto-encoder-inspired network, it embeds high-dimensional features, such as the composition of g-spaced amino acid pairs, into a compressed latent space. In contrast to conventional auto-encoders, ACP-LSE ensures that the learned feature set is both small and effective for classification, providing a transparent alternative. The proposed approach is tested on benchmark datasets and outperforms current methods. The results indicate improved Matthews correlation coefficient and balanced accuracy, offering insights into crucial aspects of developing new ACPs. The implementation of ACP-LSE is accessible online, providing a valuable and reproducible resource for researchers in the field.
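The abstract names the composition of g-spaced amino acid pairs as an example of the high-dimensional input features. A minimal sketch of one common way to compute such a feature vector for a single gap value follows. The function name `g_spaced_pair_composition`, the 20-letter alphabet constant, and the normalization by the number of windows are assumptions made for illustration and may differ from the paper's actual implementation.

```python
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_spaced_pair_composition(sequence, g):
    """Return a 400-dimensional vector of normalized pair frequencies.

    A pair is formed by residue i and residue i + g + 1, i.e. two residues
    separated by g intervening positions (g = 0 gives adjacent dipeptides).
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)

    # Number of positions where a g-spaced pair fits inside the sequence.
    n_windows = len(sequence) - g - 1
    if n_windows <= 0:
        return [0.0] * len(pairs)

    for i in range(n_windows):
        pair = sequence[i] + sequence[i + g + 1]
        # Pairs containing non-standard residues are skipped (assumption).
        if pair in counts:
            counts[pair] += 1

    return [counts[p] / n_windows for p in pairs]
```

Concatenating the vectors for several gap values (for example g = 0, 1, 2, ...) yields 400 features per gap. That is the kind of high-dimensional input that a latent-space encoder would then compress.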
Pages: 18