Machine Learning based Intelligent System for Breast Cancer Prediction (MLISBCP)

Cited by: 8
Authors
Das, Akhil Kumar [1]
Biswas, Saroj Kr. [2]
Mandal, Ardhendu [3]
Bhattacharya, Arijit [1]
Sanyal, Saptarsi [4]
Affiliations
[1] Gour Mahavidyalaya, Dept Comp Sci, Malda 732142, West Bengal, India
[2] Natl Inst Technol, Dept Comp Sci & Engn, Silchar 788010, India
[3] Univ North Bengal, Dept Comp Sci & Technol, Darjeeling 734013, West Bengal, India
[4] Natl Inst Technol, Dept Comp Sci & Engn, Silchar 788010, Assam, India
Keywords
Breast Cancer; Machine Learning; Boruta; K-Means SMOTE; FEATURE-SELECTION; COMBINING SMOTE; K-MEANS; HYBRID; ALGORITHM;
DOI
10.1016/j.eswa.2023.122673
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The risk of death from breast cancer (BC) has risen drastically in recent years. Diagnosing breast cancer is time-consuming because diagnostic systems such as dynamic MRI and X-ray imaging have limited availability. Early detection and diagnosis significantly affect life expectancy, as current medical technologies are not advanced enough to treat patients effectively in later stages. Although researchers have created many expert systems for early detection of BC, such as WNBC, the AR + NN system, and AdaBoost ELM, most expert systems frequently lack adequate handling of the class imbalance problem, proper data pre-processing, and systematic feature selection. To overcome these limitations, this work proposes an expert system named "Machine Learning Based Intelligent System for Breast Cancer Prediction (MLISBCP)" for better prediction of breast cancer using machine learning analytics. The proposed system uses the 'K-Means SMOTE' oversampling method to handle the class imbalance problem and the 'Boruta' feature selection technique to select the most relevant features of the BC dataset. To assess the effectiveness of MLISBCP, its performance is compared with single-classifier models, ensemble models, and models from the literature in terms of accuracy, precision, recall, F1-score, and ROC AUC score. The results show that MLISBCP obtains an accuracy of 97.53%, the highest among the existing models from the literature.
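The abstract evaluates models with accuracy, precision, recall, and F1-score and stresses the class imbalance problem. The following is a minimal sketch, not the authors' code, of how these metrics are computed from predicted labels and why accuracy alone can mislead on imbalanced data. The function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy imbalanced data (assumed): 2 positive cases, 8 negative cases.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts the majority class.
always_negative = [0] * 10
majority_result = binary_metrics(y_true, always_negative)

# A classifier that finds one of the two positives and raises one false alarm.
mixed_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
mixed_result = binary_metrics(y_true, mixed_pred)

print(majority_result)
print(mixed_result)
```

In this toy case the always-negative classifier reaches 80% accuracy while detecting no positive case at all (recall 0), which is the kind of failure that oversampling methods and recall or F1-based evaluation are meant to expose.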
Pages: 13