Automated Feature Selection in Microarray Data Analysis using Deep Learning

Citations: 0
Authors
Tekade, Pallavi [1]
Joshi, Ram [1]
Salunke, Dipmala [1]
Gore, Shubham [1]
Shinde, Shaunak [1]
Bahirat, Divya [1]
Affiliations
[1] JSPMs Rajarshri Shahu Coll Engn, Informat Technol, Pune, Maharashtra, India
Source
2ND INTERNATIONAL CONFERENCE ON SUSTAINABLE COMPUTING AND SMART SYSTEMS, ICSCSS 2024 | 2024
Keywords
Deep Learning; Feature Selection; Microarray Data; Bioinformatics; Genetic Markers
DOI
10.1109/ICSCSS60660.2024.10625652
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cancer is a leading cause of death worldwide, and timely identification and accurate prediction of cancer types play a pivotal role in safeguarding patient health. Microarray technology has changed cancer detection by enabling the simultaneous examination of thousands of genes, which simplifies the acquisition of extensive gene expression data. However, conventional cancer detection algorithms struggle with the volume of data generated. This study addresses the problem by employing Principal Component Analysis (PCA) and Deep Learning, specifically Stacked Autoencoders, to reduce the dimensionality of large microarray datasets while preserving essential features. Using these techniques, feature selection was conducted on datasets containing over 7000 features. The selected features were evaluated using standard regression and classification methods; Logistic Regression was the most effective, achieving 99.37% accuracy, followed by the Decision Tree classifier at 98.34%. Additionally, a Flask-based web application was developed that accepts CSV file uploads for analysis, improving accessibility and streamlining the data processing and analysis workflow. The interface is intended to help researchers and practitioners work through complex data efficiently.
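The abstract names PCA as one of the two dimensionality-reduction techniques applied to datasets with over 7000 features. The following is a minimal sketch of that step only, not the authors' implementation: it assumes NumPy, computes PCA through a singular value decomposition, and runs on synthetic random data. The function name `pca_reduce`, the sample count (60), and the number of retained components (10) are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Center each feature (gene) so the components describe variance.
    X_centered = X - X.mean(axis=0)
    # Economy SVD: rows of Vt are principal directions, ordered by
    # decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured by each retained component.
    explained_variance = (S[:n_components] ** 2) / (X.shape[0] - 1)
    # Low-dimensional representation of every sample.
    Z = X_centered @ components.T
    return Z, components, explained_variance


# Synthetic stand-in for a microarray matrix: 60 samples x 7000 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 7000))

Z, components, explained_variance = pca_reduce(X, n_components=10)
print(Z.shape)  # (60, 10)
```

The reduced matrix `Z` is what a downstream classifier such as logistic regression would be trained on. With far fewer samples than features, as is typical for microarray data, the economy SVD avoids forming the full 7000 x 7000 covariance matrix.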
Pages: 1060-1066
Page count: 7