An Improved Convolutional Neural Network Model for DNA Classification

Cited by: 6
Authors
Soliman, Naglaa F. [1]
Abd-Alhalem, Samia M. [2]
El-Shafai, Walid [2]
Abdulrahman, Salah Eldin S. E. [3]
Ismaiel, N. [3]
El-Rabaie, El-Sayed M. [2]
Algarni, Abeer D. [1]
Abd El-Samie, Fathi E. [1, 2]
Affiliations
[1] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, Riyadh, Saudi Arabia
[2] Menoufia Univ, Fac Elect Engn, Dept Elect & Elect Commun Engn, Menoufia 32952, Egypt
[3] Menoufia Univ, Fac Elect Engn, Dept Comp Sci & Engn, Menoufia 32952, Egypt
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2022 / Vol. 70 / Issue 03
Keywords
DNA classification; CNN; downsampling; hyperparameters; DL; 2D DT; 2D RP;
DOI
10.32604/cmc.2022.018860
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Deep learning (DL) has recently become one of the essential tools in bioinformatics. This paper employs a modified convolutional neural network (CNN) to build an integrated model for deoxyribonucleic acid (DNA) classification. In a typical CNN model, convolutional layers extract features and are followed by max-pooling layers that reduce the feature dimensionality. A novel method based on downsampling and CNNs is introduced for feature reduction. The downsampling is an improved form of the existing pooling layer, intended to obtain better classification accuracy. The two-dimensional discrete transform (2D DT) and two-dimensional random projection (2D RP) methods are applied for downsampling. They convert high-dimensional data to low-dimensional data and transform the data into the most significant feature vectors. In addition, several hyperparameters directly affect how a CNN model is trained, and this paper addresses some of the resulting training issues. The CNNs are examined by varying hyperparameters such as the learning rate, the minibatch size, and the number of epochs. The CNNs are trained and evaluated on 16S rRNA bacterial sequences. Simulation results indicate that a CNN based on wavelet subsampling yields the best trade-off between processing time and accuracy, with a learning rate of 0.0001, a minibatch size of 64, and 20 epochs.
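The downsampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which discrete transform or which random projection matrix is used, so the one-level orthonormal Haar wavelet, the Gaussian projection matrices with their scaling, and all function names below are assumptions made purely for illustration; this is not the authors' implementation. The sketch compares standard 2x2 max pooling with two alternatives that also shrink a feature map: keeping the Haar approximation (LL) subband, and a two-sided random projection Y = P X Q^T.

```python
import random

def max_pool_2x2(x):
    """Standard 2x2 max pooling, stride 2 (even dimensions assumed)."""
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, len(x[0]), 2)]
            for i in range(0, len(x), 2)]

def haar_ll_2x2(x):
    """Approximation (LL) subband of a one-level orthonormal 2D Haar transform.

    Assumption: Haar is used here only as the simplest wavelet example.
    """
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 2.0
             for j in range(0, len(x[0]), 2)]
            for i in range(0, len(x), 2)]

def matmul(a, b):
    """Plain matrix product of two 2D lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def random_projection_2d(x, out_rows, out_cols, seed=0):
    """Two-sided random projection Y = P X Q^T.

    Assumption: P and Q have i.i.d. Gaussian entries scaled by
    1/sqrt(output size), a common convention for random projections.
    """
    rng = random.Random(seed)
    rows, cols = len(x), len(x[0])
    p = [[rng.gauss(0, 1) / out_rows ** 0.5 for _ in range(rows)]
         for _ in range(out_rows)]
    q = [[rng.gauss(0, 1) / out_cols ** 0.5 for _ in range(cols)]
         for _ in range(out_cols)]
    return matmul(matmul(p, x), transpose(q))

# Hypothetical 4x4 feature map standing in for a convolutional layer output.
feature_map = [[1.0, 2.0, 3.0, 4.0],
               [5.0, 6.0, 7.0, 8.0],
               [9.0, 10.0, 11.0, 12.0],
               [13.0, 14.0, 15.0, 16.0]]

pooled = max_pool_2x2(feature_map)                   # [[6.0, 8.0], [14.0, 16.0]]
haar = haar_ll_2x2(feature_map)                      # [[7.0, 11.0], [23.0, 27.0]]
projected = random_projection_2d(feature_map, 2, 2)  # 2x2, values depend on seed
```

All three operations reduce the 4x4 map to 2x2. Max pooling keeps only the largest value in each block, whereas the Haar LL subband and the random projection combine every input value into the reduced output.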
Pages: 5907-5927
Page count: 21