Machine Learning-Based Decision Support System for Early Detection of Breast Cancer

Cited by: 3
Authors
Li, Mochen [1]
Nanda, Gaurav [1]
Chhajedss, Santosh [2]
Sundararajan, Raji [1]
Affiliations
[1] Purdue Univ, Sch Engn Technol, Grant St, W Lafayette, IN 47907 USA
[2] METs Inst Pharm, Bhujbal Knowledge City, Nasik, Maharashtra, India
Keywords
Breast cancer; Data analysis; Machine learning; Feature selection; Decision support system; PREDICTION;
DOI
10.5530/ijper.54.3s.171
Chinese Library Classification
G40 [Education];
Discipline classification codes
040101 ; 120403 ;
Abstract
Background: Breast cancer is one of the leading causes of death among women in the United States and one of the most malignant cancers among women worldwide. Earlier, more accurate detection of breast cancer enables extended longevity at a reduced cost. Towards this, analyzing the available big data with tools such as machine learning-based decision support systems can improve the speed and accuracy of early detection of breast cancer. In this paper, we examined the prediction performance of various state-of-the-art machine learning models and of a decision support system based on these models that provided the predicted category along with a prediction confidence measure.
Methods: The machine learning (ML) algorithms applied include Logistic Regression, Decision Tree, Naive Bayes, k-Nearest Neighbors (kNN) and Support Vector Machine (SVM). We also analyzed the effect of multiple feature selection approaches on the prediction performance. We used the Breast Cancer Wisconsin Dataset from Wisconsin Prognostic Breast Cancer (WPBC), with 569 digitized images of a fine needle aspirate (FNA) of a breast mass and 10 real-valued features. The performance of each ML model was evaluated using ten-fold cross-validation and also on a prediction set comprising 20% of the data, with the models trained on the remaining 80%. Sensitivity and specificity were used as the primary measures of performance.
Results: Among all five machine learning methods, SVM had the best performance. Except for the kNN algorithm, the performance of the other three algorithms (Logistic Regression, Naive Bayes and Decision Tree) was also quite close to that of SVM. The prediction performance of the decision support system was better than that of any individual ML model where the prediction confidence was "High" or "Medium".
Conclusion: We found that feature selection improved the performance and computation cost for all ML models. By building the ML-based decision support system with the optimal feature subset, the prediction performance for breast cancer can be improved to 96%, which means it can provide powerful assistance to doctors and patients. On the other hand, as the size of the data set increases, processing data with many features can increase the computation cost as well as the possibility of classification errors.
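The abstract names sensitivity and specificity as its primary measures and describes a decision support system that reports a category with a confidence level, but it does not say how that confidence is derived. The sketch below is an illustration only, not the authors' method: it computes the two measures from label lists, and it assumes (hypothetically) that confidence reflects how many of the five models agree. The labels "M" and "B", the thresholds, and the "Low" tier are assumptions introduced for this example.

```python
from collections import Counter


def sensitivity_specificity(y_true, y_pred, positive="M"):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    The positive label "M" (malignant) is an assumed encoding.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def vote_with_confidence(model_predictions):
    """Majority vote over per-model labels with an assumed confidence tier.

    Hypothetical rule: unanimous agreement is "High", at least 80%
    agreement is "Medium", anything else is "Low".
    """
    counts = Counter(model_predictions)
    label, votes = counts.most_common(1)[0]
    share = votes / len(model_predictions)
    if share == 1.0:
        confidence = "High"
    elif share >= 0.8:
        confidence = "Medium"
    else:
        confidence = "Low"
    return label, confidence


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    truth = ["M", "M", "B", "B", "B"]
    predicted = ["M", "B", "B", "B", "M"]
    print(sensitivity_specificity(truth, predicted))

    # One case scored by five hypothetical models.
    print(vote_with_confidence(["M", "M", "M", "M", "B"]))
```

With five models and two classes a tie cannot occur, so the majority label is always well defined under this rule; a different number of models would need an explicit tie-breaking choice.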
Pages: S705-S715
Page count: 11