Identifying Effective Feature Selection Methods for Alzheimer's Disease Biomarker Gene Detection Using Machine Learning

Cited by: 5
Authors
Alshamlan, Hala [1]
Omar, Samar [1]
Aljurayyad, Rehab [1]
Alabduljabbar, Reham [1]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Dept Informat Technol, POB 145111, Riyadh 4545, Saudi Arabia
Keywords
data mining; genetic disease prediction; Alzheimer disease; gene expression; feature selection; classification; IDENTIFICATION
DOI
10.3390/diagnostics13101771
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Alzheimer's disease (AD) is a complex genetic disorder that affects the brain and has been the focus of many bioinformatics studies. The primary objective of these studies is to identify and classify genes involved in the progression of AD and to explore the function of these risk genes in the disease process. This research aims to identify the most effective model for detecting biomarker genes associated with AD by comparing several feature selection methods. We compared the efficiency of five feature selection methods, each paired with an SVM classifier: mRMR, CFS, the Chi-Square Test, F-score, and GA. We calculated the accuracy of the SVM classifier using validation methods such as 10-fold cross-validation. We applied these feature selection methods with SVM to a benchmark AD gene expression dataset consisting of 696 samples and 200 genes. The results indicate that the mRMR and F-score feature selection methods, combined with the SVM classifier, achieved an accuracy of around 84% using between 20 and 40 genes. Both combinations also outperformed the GA, Chi-Square Test, and CFS methods. Overall, these findings suggest that the mRMR and F-score feature selection methods with an SVM classifier are effective in identifying biomarker genes related to AD and could potentially lead to more accurate diagnosis and treatment of the disease.
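The F-score ranking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact F-score definition used, so the formula below is an assumption: a commonly used two-class F-score, computed as the squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances. The data are synthetic toy values, not the AD gene expression dataset, and the helper names (f_score, rank_features, make_toy_data) are hypothetical. The SVM classifier and the 10-fold cross-validation step are not reproduced here.

```python
import random
from statistics import mean, variance


def f_score(values, labels):
    """Two-class F-score of a single feature (assumed definition, see lead-in)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    # Spread of the two class means around the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Sum of the within-class sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else 0.0


def rank_features(X, y):
    """Return feature indices sorted from highest to lowest F-score."""
    n_features = len(X[0])
    scores = [f_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def make_toy_data(n_samples=200, n_features=50, n_informative=5, seed=0):
    """Synthetic stand-in for an expression matrix (hypothetical data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate classes so both are balanced
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 2.0 * label  # shift the first few features for class 1
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
ranking = rank_features(X, y)
top_k = ranking[:5]  # keep the k highest-scoring features
X_selected = [[row[j] for j in top_k] for row in X]
```

In a full pipeline, the reduced matrix X_selected would then be passed to a classifier such as an SVM, with the ranking recomputed inside each cross-validation fold so that the held-out samples do not influence which features are kept.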
Pages: 14