Bio inspired Ensemble Feature Selection (BEFS) Model with Machine Learning and Data Mining Algorithms for Disease Risk Prediction

Cited by: 3
Authors
Pasha, Syed Javeed [1]
Mohamed, E. Syed [2]
Affiliations
[1] BS Abdur Rahman Crescent Inst Sci & Technol, Dept Comp Applicat, Chennai, Tamil Nadu, India
[2] BS Abdur Rahman Crescent Inst Sci & Technol, Dept Comp Sci, Chennai, Tamil Nadu, India
Source
2019 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, CONTROL AND AUTOMATION (ICCUBEA) | 2019
Keywords
Bio inspired ensemble feature selection (BEFS) model; machine learning; data mining; feature selection; health care; disease risk prediction; breast cancer risk prediction; genetic algorithm; random forest; logistic regression; BREAST-CANCER; DIAGNOSIS
DOI
10.1109/iccubea47591.2019.9129304
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Machine learning (ML) and data mining (DM) algorithms have been used increasingly in recent years for disease risk prediction in healthcare. Traditional feature selection models are often combined with DM and ML algorithms to improve the accuracy of disease risk prediction. This study introduces a new Bio-inspired Ensemble Feature Selection (BEFS) model that is applied together with DM and ML algorithms. In the BEFS model, the most relevant and highly contributing features for prediction are determined with a bio-inspired algorithm, the genetic algorithm, and an ensemble algorithm, the random forest algorithm. The important features obtained from the proposed model are then combined in various combinations and used with the DM and ML algorithms, here logistic regression (LR) and random forest (RF), with promising results. The experiment is carried out in R, a language widely used for ML, on the Breast Cancer Wisconsin (Diagnostic) dataset from the UCI (University of California, Irvine) ML repository. The highest accuracy attained with the BEFS model is 96.49%, the AUC (area under the curve) is 96%, and the sensitivity is 98.11%. These results are higher than those reported in several existing works, while using only the six most relevant of the dataset's thirty-two features.
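The genetic-algorithm half of the feature selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the GA settings, so everything below is an assumption made for illustration: the bit-mask encoding, tournament selection, single-point crossover, the elitism step, the parameter values, and in particular the toy fitness function with its invented relevance scores. The study itself used R; Python is used here only to keep the example self-contained. In a real pipeline the fitness would more plausibly be the cross-validated accuracy of a classifier trained on the selected columns.

```python
import random

# Hypothetical per-feature relevance scores, invented for this illustration only.
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02, 0.6, 0.01]
PENALTY = 0.3  # assumed cost per selected feature, to favor small subsets


def toy_fitness(mask):
    """Score a 0/1 feature mask: summed relevance minus a size penalty."""
    gain = sum(r for r, bit in zip(RELEVANCE, mask) if bit)
    return gain - PENALTY * sum(mask)


def tournament(population, fitness, rng):
    """Pick two individuals at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Return (best_mask, best_fitness) found by a simple elitist GA."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best]  # elitism: carry the best mask forward
        while len(next_population) < pop_size:
            parent_a = tournament(population, fitness, rng)
            parent_b = tournament(population, fitness, rng)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best, fitness(best)


mask, score = genetic_feature_selection(len(RELEVANCE), toy_fitness)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected feature indices:", selected, "fitness:", round(score, 3))
```

The size penalty stands in for the goal, stated in the abstract, of keeping only a small number of highly relevant features. A BEFS-style workflow would then combine the GA-selected features with those ranked highly by a random forest before fitting the final LR or RF model, a step this sketch does not attempt.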
Pages: 6