Artificial Intelligence Technique for Gene Expression by Tumor RNA-Seq Data: A Novel Optimized Deep Learning Approach

Cited by: 70
Authors
Khalifa, Nour Eldeen M. [1]
Taha, Mohamed Hamed N. [1]
Ali, Dalia Ezzat [1]
Slowik, Adam [2]
Hassanien, Aboul Ella [1]
Affiliations
[1] Cairo Univ, Fac Comp & Artificial Intelligence, Giza 12613, Egypt
[2] Koszalin Univ Technol, Dept Elect & Comp Sci, PL-75453 Koszalin, Poland
Keywords
Cancer; RNA sequence; deep convolutional neural network; gene expression data; computer vision; prediction
DOI
10.1109/ACCESS.2020.2970210
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cancer is one of the most feared and aggressive diseases in the world and is responsible for more than 9 million deaths worldwide. Staging cancer early increases the chances of recovery. One staging technique is RNA sequence analysis. Recent advances in the efficiency and accuracy of artificial intelligence techniques and optimization algorithms have facilitated the analysis of human genomics. This paper introduces a novel optimized deep learning approach based on binary particle swarm optimization with decision tree (BPSO-DT) and a convolutional neural network (CNN) to classify different types of cancer from tumor RNA sequence (RNA-Seq) gene expression data. The cancer types investigated in this research are kidney renal clear cell carcinoma (KIRC), breast invasive carcinoma (BRCA), lung squamous cell carcinoma (LUSC), lung adenocarcinoma (LUAD) and uterine corpus endometrial carcinoma (UCEC). The proposed approach consists of three phases. The first phase is preprocessing, which first uses BPSO-DT to select only the optimal features from the high-dimensional RNA-Seq data and then converts the optimized RNA-Seq data to 2D images. The second phase is augmentation, which increases the original dataset of 2086 samples to 5 times its size. The augmentation techniques were selected to have the least impact on the features of the images. This phase helps to overcome overfitting and trains the model to achieve better accuracy. The third phase is the deep CNN architecture. In this phase, an architecture of two main convolutional layers for feature extraction and two fully connected layers is introduced to classify the 5 types of cancer according to the availability of images in the dataset. The results, together with performance metrics such as recall, precision and F1 score, show that the proposed approach achieved an overall testing accuracy of 96.9%. Comparative results are presented, and the proposed method outperforms those in related works in terms of testing accuracy for 5 classes of cancer. Moreover, the proposed approach is less complex and consumes less memory.
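The preprocessing phase described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It shows a generic binary particle swarm optimization (sigmoid transfer function) that searches for a 0/1 feature mask, followed by a row-major reshape of the selected values into a zero-padded square image. The abstract does not specify the fitness function details, the swarm parameters, or the image layout, so everything named here is an assumption: `toy_fitness` is a hypothetical stand-in for the accuracy a decision tree would report on the selected features, `INFORMATIVE` is a made-up set of useful feature indices, and `to_image` is only one possible way to lay out a vector as a 2D image.

```python
import math
import random

def bpso_select(n_features, fitness, n_particles=10, n_iters=30,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO. `fitness` maps a 0/1 mask to a score to maximize.
    Returns (best_mask, best_score). All parameter values are assumptions."""
    rng = random.Random(seed)
    # Random initial bit masks and zero velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Sigmoid transfer: velocity gives the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            score = fitness(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score

# Hypothetical stand-in for decision-tree accuracy: reward a made-up set of
# "informative" features and penalize mask size.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

def to_image(values):
    """Row-major reshape into the smallest square grid, padded with zeros."""
    n = len(values)
    if n == 0:
        return []
    side = math.isqrt(n - 1) + 1  # ceil(sqrt(n)) for n >= 1
    padded = list(values) + [0.0] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]

mask, score = bpso_select(8, toy_fitness)
sample = [0.5, 1.2, 0.0, 3.3, 0.7, 2.1, 0.9, 1.8]  # made-up expression values
selected = [v for v, keep in zip(sample, mask) if keep]
image = to_image(selected)
```

In a setting closer to the one the abstract describes, `toy_fitness` would be replaced by a function that trains a decision tree on the masked features and returns its validation accuracy, and each resulting image would then be augmented and passed to the CNN.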
Pages: 22874-22883
Page count: 10