A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction

Cited by: 269
Authors
Pudjihartono, Nicholas [1]
Fadason, Tayaza [1, 2]
Kempa-Liehr, Andreas W. [3]
O'Sullivan, Justin M. [1, 2, 4, 5, 6]
Affiliations
[1] Univ Auckland, Liggins Inst, Auckland, New Zealand
[2] Maurice Wilkins Ctr Mol Biodiscovery, Auckland, New Zealand
[3] Univ Auckland, Dept Engn Sci, Auckland, New Zealand
[4] Univ Southampton, MRC Lifecourse Epidemiol Unit, Southampton, England
[5] ASTAR, Singapore Inst Clin Sci, Singapore, Singapore
[6] Garvan Inst Med Res, Australian Parkinsons Mission, Sydney, NSW, Australia
Source
FRONTIERS IN BIOINFORMATICS | 2022 / Volume 2
Keywords
machine learning; feature selection (FS); risk prediction; disease risk prediction; statistical approaches; GENOME-WIDE ASSOCIATION; ROBUST FEATURE-SELECTION; FALSE DISCOVERY RATE; MUTUAL INFORMATION; RANDOM FORESTS; GENE; RELEVANCE; LOCI; GWAS; DIMENSIONALITY
DOI
10.3389/fbinf.2022.927312
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Machine learning has shown utility in detecting patterns within large, unstructured, and complex datasets. One of the promising applications of machine learning is in precision medicine, where disease risk is predicted using patient genetic data. However, creating an accurate prediction model based on genotype data remains challenging due to the so-called "curse of dimensionality" (i.e., a far larger number of features than samples). Therefore, the generalizability of machine learning models benefits from feature selection, which aims to extract only the most "informative" features and remove noisy, "non-informative," irrelevant, and redundant features. In this article, we provide a general overview of the different feature selection methods, their advantages, disadvantages, and use cases, focusing on the detection of relevant features (i.e., SNPs) for disease risk prediction.
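As a rough illustration of what the abstract means by keeping only the most informative features, the sketch below applies a simple univariate filter: each SNP is scored by the absolute Pearson correlation between its genotype and the phenotype, and the top k SNPs are kept. This is not the authors' method. The function names `pearson` and `select_top_k`, the 0/1/2 minor-allele coding, and the toy data are all assumptions made up for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_top_k(genotypes, phenotype, k):
    """Rank SNP columns by |correlation| with the phenotype; return top-k column indices."""
    n_snps = len(genotypes[0])
    scores = []
    for j in range(n_snps):
        column = [row[j] for row in genotypes]
        scores.append((abs(pearson(column, phenotype)), j))
    # Highest score first; ties are broken by the lower column index.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores[:k]]


# Hypothetical toy data: 6 samples x 4 SNPs, coded as minor-allele counts (0/1/2).
genotypes = [
    [0, 1, 0, 1],
    [0, 2, 1, 1],
    [1, 0, 0, 1],
    [2, 1, 1, 1],
    [2, 2, 1, 1],
    [1, 0, 1, 1],
]
phenotype = [0, 0, 0, 1, 1, 1]  # 0 = control, 1 = case

selected = select_top_k(genotypes, phenotype, k=2)
print(selected)  # prints [0, 2]
```

In this toy set, SNP 1 is uncorrelated with the phenotype and SNP 3 is constant, so both are filtered out. In practice, any such selection should be fitted on training data only (for example, inside each cross-validation fold) so that the held-out samples do not influence which features are kept.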
Pages: 17
Related papers
50 in total
  • [31] Obesity disease risk prediction using machine learning
    Dutta, Raja Ram
    Mukherjee, Indrajit
    Chakraborty, Chinmay
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024, 19 (4) : 709 - 718
  • [32] Effective Feature Engineering Technique for Heart Disease Prediction With Machine Learning
    Qadri, Azam Mehmood
    Raza, Ali
    Munir, Kashif
    Almutairi, Mubarak S.
    IEEE ACCESS, 2023, 11 : 56214 - 56224
  • [33] Machine learning-based risk prediction model construction of difficult weaning in ICU patients with mechanical ventilation
    Xu, Huimei
    Ma, Yanyan
    Zhuang, Yan
    Zheng, Yanqi
    Du, Zhiqiang
    Zhou, Xuemei
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [34] Review of swarm intelligence-based feature selection methods
    Rostami, Mehrdad
    Berahmand, Kamal
    Nasiri, Elahe
    Forouzande, Saman
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2021, 100
  • [35] Advances, challenges, and future research needs in machine learning-based crash prediction models: A systematic review
    Ali, Yasir
    Hussain, Fizza
    Haque, Md Mazharul
    ACCIDENT ANALYSIS AND PREVENTION, 2024, 194
  • [36] Prediction of Parkinson's Disease Using Machine Learning Methods
    Zhang, Jiayu
    Zhou, Wenchao
    Yu, Hongmei
    Wang, Tong
    Wang, Xiaqiong
    Liu, Long
    Wen, Yalu
    BIOMOLECULES, 2023, 13 (12)
  • [37] A review of unsupervised feature selection methods
    Solorio-Fernandez, Saul
    Carrasco-Ochoa, J. Ariel
    Martinez-Trinidad, Jose Fco.
    ARTIFICIAL INTELLIGENCE REVIEW, 2020, 53 (02) : 907 - 948
  • [38] Cybersecurity and Risk Prediction Based on Machine Learning Algorithms
    Yang, Haoliang
    Zhu, Jianan
    Li, Jiaqing
    Applied Mathematics and Nonlinear Sciences, 2024, 9 (01)
  • [39] Using machine learning-based algorithms to construct cardiovascular risk prediction models for Taiwanese adults based on traditional and novel risk factors
    Cheng, Chien-Hsiang
    Lee, Bor-Jen
    Nfor, Oswald Ndi
    Hsiao, Chih-Hsuan
    Huang, Yi-Chia
    Liaw, Yung-Po
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2024, 24 (01)
  • [40] Cardiovascular Disease Risk Prediction with Supervised Machine Learning Techniques
    Dritsas, Elias
    Alexiou, Sotiris
    Moustakas, Konstantinos
    ICT4AWE: PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES FOR AGEING WELL AND E-HEALTH, 2022 : 315 - 321