Preemptive Diagnosis of Colorectal Cancer Using Computational Intelligence Techniques

Times cited: 0
Authors
Olatunji, Sunday O. [1]
Aleissa, Shahd [1]
Alakkas, Maryam [1]
Albugeaey, Zainab [1]
Alshelaly, Hneen [1]
Alzubaidi, Thuraya [1]
Ahmed, Mohammed Imran Basheer [1]
Farooqui, Mehwash [1]
Affiliations
[1] Imam Abdulrahman Bin Faisal Univ, Coll Comp Sci & Informat Technol, Dammam, Saudi Arabia
Source
PROCEEDINGS 2024 SEVENTH INTERNATIONAL WOMEN IN DATA SCIENCE CONFERENCE AT PRINCE SULTAN UNIVERSITY, WIDS-PSU 2024 | 2024
Keywords
Colorectal Cancer; Machine Learning; Python; SMOTE-ENN; Histogram-based Gradient Boosting (HGB); Adaptive Boosting (AdaBoost)
DOI
10.1109/WiDS-PSU61003.2024.00043
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Colorectal cancer is one of the most common cancers in the world. Given this growing global concern, this study aims to enhance the predictive power of Machine Learning (ML) techniques for the preemptive diagnosis of colorectal cancer. It uses an online dataset to develop models with three ensemble ML algorithms: Histogram-based Gradient Boosting (HGB), Adaptive Boosting (AdaBoost), and Extra Trees (ET). The dataset comprised 115 instances, each with 11 features, after it underwent pre-processing that included imputing null values and one-hot encoding, followed by the Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbors (SMOTE-ENN). The models were optimized using GridSearchCV with 5-fold cross-validation to identify the most effective hyperparameter values. The HGB and AdaBoost models both achieved 100% accuracy, precision, recall, and F1 score, higher than previous studies, where the F1 score was about 88%. This outcome demonstrates the effectiveness of computational intelligence, specifically ML, for the preemptive diagnosis of colorectal cancer, and offers a promising platform for future research and clinical application.
Pages: 162-167
Number of pages: 6