Computational analysis of gene expression data using bidirectional long short-term memory for disease diagnosis

Cited by: 0
Authors
Dasgupta, Srirupa [1, 2]
Dutta, Mou [2 ]
Halder, Anindya [3, 6]
Khan, Abhinandan [4 ]
Saha, Goutam [5 ]
Pal, Rajat Kumar [2 ]
Affiliations
[1] Govt Coll Engn & Leather Technol, Dept Informat Technol, Kolkata, India
[2] Univ Calcutta, Dept Comp Sci & Engn, Kolkata, India
[3] North Eastern Hill Univ, Dept Comp Applict, Tura Campus, Tura, India
[4] Product Dev & Diversificat, ARP Engn, Kolkata, India
[5] Eastern Hill Univ, Dept Informat Technol, Shillong, India
[6] Cotton Univ, Dept Comp Sci & Informat Technol CSIT, Gauhati, India
Keywords
Bi-LSTM; Classification; CBFS; Gene expression data; Relevant features; CLASSIFICATION; SELECTION; OPTIMIZATION; SUBGROUPS
DOI
10.1007/s11334-022-00492-0
Chinese Library Classification number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
Classifying cancer samples from microarray gene expression data is an important area of current research. Traditional supervised techniques may not produce the desired classification results because they are unable to extract appropriate features. In this article, microarray data sets are classified using bidirectional long short-term memory (Bi-LSTM), a type of recurrent deep neural network. The curse of dimensionality is handled using correlation-based feature selection. Experiments were carried out on eight benchmark cancer microarray data sets. The performance of the proposed method, in terms of different validity measures, is compared with that of three other deep learning-based classifiers and four conventional classifiers. Bi-LSTM outperformed all other classifiers on all data sets. Further, of the 56 paired t tests carried out for the proposed Bi-LSTM method, 42 were found to be statistically significant.
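The abstract names correlation-based feature selection as the remedy for the curse of dimensionality but gives no details. The following is a minimal sketch of one generic filter of that kind, ranking genes by the absolute Pearson correlation between their expression and the class label. It is an assumption for illustration and not the authors' CBFS procedure; the function name `rank_genes_by_correlation`, the synthetic data, and the choice of `k` are all hypothetical.

```python
import numpy as np


def rank_genes_by_correlation(X, y, k):
    """Return the indices of the k genes most correlated with the label.

    X is a (samples x genes) expression matrix and y a numeric class label.
    This is a generic illustrative filter, not the paper's CBFS method.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # center each gene
    yc = y - y.mean()                # center the label
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Genes with zero variance get a correlation of zero instead of NaN.
    corr = np.divide(Xc.T @ yc, denom,
                     out=np.zeros(X.shape[1]), where=denom > 0)
    order = np.argsort(-np.abs(corr), kind="stable")
    return order[:k], corr


# Synthetic example (assumed data): 40 samples, 200 genes, 2 informative genes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 200))
X[:, 5] += 4.0 * y     # gene 5 is up-regulated in class 1
X[:, 17] -= 4.0 * y    # gene 17 is down-regulated in class 1

selected, corr = rank_genes_by_correlation(X, y, k=2)
print(sorted(selected.tolist()))
```

In a pipeline like the one the abstract describes, the reduced matrix `X[:, selected]` would then be passed to the classifier. The selection should be fitted on training folds only to avoid leaking label information into the evaluation.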
Pages: 93-107
Page count: 15