An efficient ensemble-based Machine Learning for breast cancer detection

Cited by: 3
Authors
Kapila, Ramdas [1]
Saleti, Sumalatha [1]
Affiliations
[1] SRM Univ AP, Data Sci Res Lab, Comp Sci & Engn, Vijayawada 522502, Andhra Pradesh, India
Keywords
Breast cancer; Machine learning; Anova; PCA; Stacking classifier; Completeness; SUPPORT VECTOR MACHINE; FEATURE-SELECTION; CLASSIFICATION; MODELS;
DOI
10.1016/j.bspc.2023.105269
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Subject classification code
0831
Abstract
Breast cancer is a severe type of cancer that develops in breast cells. Despite substantial advances in the management of symptomatic breast cancer over the past ten years, an effective predictive model for breast cancer prognosis is urgently needed. Precise prediction offers numerous advantages, including the ability to diagnose cancer at an early stage and to spare patients needless medical care and the related costs. In the medical field, recall is just as important as accuracy: a model with high accuracy but low recall is of little use. To boost accuracy while assigning equal weight to recall, we propose a model that combines Feature Selection (FS), Feature Extraction (FE), and 5 Machine Learning (ML) models. The proposed model has three stages. In the first stage, the Correlation Coefficient (CC) and ANOVA (Anv) feature selection methods choose the features. In the second stage, Uniform Manifold Approximation and Projection (UMAP), t-distributed Stochastic Neighbour Embedding (t-SNE), and Principal Component Analysis (PCA) are applied to extract features without compromising the crucial information. In the last stage, the 5 ML models and ensemble models such as the Voting Classifier (VC) and Stacking Classifier (SC) predict the disease from the selected and extracted features. The results show that the proposed model, CC-Anv with PCA using an SC, outperformed all the existing methodologies with 100% accuracy, precision, recall, and F1-score.
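
The three stages described in the abstract can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract names neither the dataset, the five base learners, the thresholds, nor the number of components, so everything below is assumed for illustration: the Wisconsin diagnostic dataset bundled with scikit-learn, a hypothetical `CorrelationFilter` helper that drops one feature from each highly correlated pair, `k=8` ANOVA-selected features, 5 principal components, and five common classifiers as base learners.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


class CorrelationFilter(BaseEstimator, TransformerMixin):
    """Hypothetical helper: keep a feature only if its absolute Pearson
    correlation with every already-kept feature is <= threshold."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold

    def fit(self, X, y=None):
        corr = np.abs(np.corrcoef(np.asarray(X), rowvar=False))
        keep = []
        for j in range(corr.shape[0]):
            if all(corr[j, k] <= self.threshold for k in keep):
                keep.append(j)
        self.keep_ = np.array(keep)
        return self

    def transform(self, X):
        return np.asarray(X)[:, self.keep_]


# Assumed dataset: scikit-learn's copy of the Wisconsin diagnostic data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumed base learners; the abstract only says "5 ML models".
base_learners = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("svm", SVC()),
    ("knn", KNeighborsClassifier()),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("nb", GaussianNB()),
]

model = make_pipeline(
    StandardScaler(),
    CorrelationFilter(threshold=0.9),   # stage 1a: correlation-based selection
    SelectKBest(f_classif, k=8),        # stage 1b: ANOVA F-test selection
    PCA(n_components=5),                # stage 2: feature extraction
    StackingClassifier(                 # stage 3: stacked ensemble
        estimators=base_learners,
        final_estimator=RandomForestClassifier(random_state=0),
        cv=5,
    ),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# In this dataset label 0 is malignant, so recall is computed for class 0.
accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred, pos_label=0)
print(f"accuracy={accuracy:.3f}  malignant recall={recall:.3f}")
```

Wrapping every stage in one pipeline means the selection and PCA steps are fitted on the training split only, which avoids leaking test data into the feature choices. Reporting recall for the malignant class alongside accuracy reflects the abstract's point that a model with high accuracy but low recall is of little use in a medical setting.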
Pages: 14