An efficient consolidation of word embedding and deep learning techniques for classifying anticancer peptides: FastText+BiLSTM

Cited by: 8
Authors
Karakaya, Onur [1]
Kilimci, Zeynep Hilal [2]
Affiliations
[1] Res & Dev Inc, Turkcell Technol, Istanbul, Turkiye
[2] Kocaeli Univ, Dept Informat Syst Engn, Kocaeli, Turkiye
Keywords
Anticancer peptides; Word embeddings; Deep learning; FastText; Word2Vec; CNN; LSTM; BiLSTM; CLASSIFICATION; RESISTANCE; NETWORK; IACP;
DOI
10.7717/peerj-cs.1831
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Anticancer peptides (ACPs) are a group of peptides that exhibit antineoplastic properties. Using ACPs in cancer prevention can offer a viable alternative to conventional cancer therapeutics, as they possess a higher degree of selectivity and safety. Recent scientific advances have generated interest in peptide-based therapies, which offer the advantage of efficiently treating the intended cells without harming normal cells. However, as the number of peptide sequences continues to increase rapidly, developing a reliable and precise prediction model becomes a challenging task. In this work, our motivation is to develop an efficient model for classifying anticancer peptides by combining word embedding and deep learning models. First, Word2Vec, GloVe, FastText, and one-hot encoding are evaluated as embedding techniques for extracting features from peptide sequences. Then, the outputs of the embedding models are fed into the deep learning approaches CNN, LSTM, and BiLSTM. To demonstrate the contribution of the proposed framework, extensive experiments are carried out on two datasets widely used in the literature, ACPs250 and Independent. The experimental results show that the proposed model improves classification accuracy compared with state-of-the-art studies. The proposed combination, FastText+BiLSTM, achieves 92.50% accuracy on the ACPs250 dataset and 96.15% accuracy on the Independent dataset, thereby setting a new state of the art.
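The abstract does not describe how peptide sequences are prepared for the embedding step, so the following is only a minimal sketch under stated assumptions: it treats overlapping k-mers as the "words" that a Word2Vec or FastText model could consume, and it builds a plain one-hot matrix over the 20 standard amino acids. The helper names `kmer_tokens` and `one_hot`, the choice of k = 3, and the zero-padding scheme are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: names, k=3, and padding are assumptions,
# not the preprocessing used by the authors.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def kmer_tokens(sequence, k=3):
    """Split a peptide into overlapping k-mers usable as 'words' for an embedding model."""
    sequence = sequence.upper()
    if len(sequence) < k:
        # A sequence shorter than k becomes a single token (or no token if empty).
        return [sequence] if sequence else []
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def one_hot(sequence, max_len):
    """One-hot encode a peptide as max_len rows of 20 columns, truncating or zero-padding."""
    rows = []
    for aa in sequence.upper()[:max_len]:
        row = [0] * len(AMINO_ACIDS)
        if aa in INDEX:  # non-standard residues are left as all-zero rows
            row[INDEX[aa]] = 1
        rows.append(row)
    # Pad with all-zero rows so every peptide has the same shape.
    rows.extend([0] * len(AMINO_ACIDS) for _ in range(max_len - len(rows)))
    return rows
```

In a pipeline of the kind the abstract outlines, token lists like those from `kmer_tokens` would be passed to an embedding model, and the resulting fixed-length matrices would be fed to a sequence classifier such as a BiLSTM; the one-hot matrix is the baseline representation that needs no trained embedding.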
Pages: 34