An enhanced soft-computing based strategy for efficient feature selection for timely breast cancer prediction: Wisconsin Diagnostic Breast Cancer dataset case

Cited by: 14
Authors
Singh, Law Kumar [1]
Khanna, Munish [2]
Singh, Rekha [3]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, India
[2] Galgotias Univ, Sch Comp Sci & Engn, Plot 2,Sect 17-A,Yamuna Expressway, Greater Noida 203201, India
[3] Uttar Pradesh Rajarshi Tandon Open Univ, Dept Phys, Prayagraj, Uttar Pradesh, India
Keywords
Soft-computing based feature selection; Emperor penguin optimization; Machine learning; Computer aided diagnosis; Breast cancer prediction; Medical data; PARTICLE SWARM OPTIMIZATION; HISTOPATHOLOGY IMAGE-ANALYSIS; SUPPORT VECTOR MACHINES; SYSTEM; EPIDEMIOLOGY; ALGORITHM; MODELS; WOLF;
DOI
10.1007/s11042-024-18473-9
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Feature selection (FS) is a critical data-preparation step for improving the overall performance of machine learning (ML) models. Metaheuristic FS algorithms have become markedly more popular in recent years because of their proficiency in identifying and selecting the most relevant features for ML tasks. This study presents three metaheuristic feature selection strategies: the Gravitational Search Optimization Algorithm (GSA), Emperor Penguin Optimization (EPO), and a hybrid of GSA and EPO referred to as hGSAEPO. Previous research has explored the baseline algorithms for feature selection in various ML tasks, but their application to breast cancer (BC) classification has not been investigated, and the two are combined here for the first time. BC was chosen as the subject of study because it is recognized as the second most common cause of death among women. If detected in its early stages, the disease can be treated, and patients can be spared unnecessary medical procedures. Selecting relevant features is therefore important when predicting diseases such as BC. The proposed methodology employs the three soft-computing algorithms (EPO, GSA, and the proposed hybrid hGSAEPO) to identify significant features efficiently while reducing the number of irrelevant ones, lowering overall complexity, and improving accuracy. Combined with six ML classifiers, these soft-computing methods provide a viable framework for prognostic research through the classification of data instances in the Wisconsin Diagnostic Breast Cancer (WDBC) dataset.
Across the eight experiments conducted, the proposed approach performs strongly on binary BC classification, with a precision of 0.9800, sensitivity of 0.9700, specificity of 0.9887, F1-score of 0.9539, area under the curve (AUC) surpassing 0.998, and accuracy of 98.31%. The study thereby presents a dependable clinical prediction system to help healthcare professionals diagnose efficiently.
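For readers less familiar with the metrics named in the abstract, the following minimal sketch shows how precision, sensitivity, specificity, F1-score, and accuracy are conventionally derived from a binary confusion matrix. It is not the authors' evaluation code. The names `ConfusionCounts` and `binary_metrics` are hypothetical, the malignant class is assumed to be the positive class, and the counts in the usage note are illustrative values, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (assumption: malignant = positive)."""
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    tn: int  # benign cases predicted benign
    fn: int  # malignant cases predicted benign


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard binary classification metrics from the counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }
```

As a usage example, `binary_metrics(ConfusionCounts(tp=40, fp=2, tn=70, fn=2))` returns a dictionary keyed by metric name. AUC is omitted because it depends on ranked prediction scores, which a single set of confusion counts does not carry.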
Pages: 76607-76672
Page count: 66