A Machine Learning-based Self-risk Assessment Technique for Cervical Cancer

Cited by: 10
Authors
Ramzan, Zeeshan [1]
Hassan, Muhammad Awais [1]
Asif, H. M. Shahzad [1]
Farooq, Amjad [1]
Affiliation
[1] Univ Engn & Technol, Dept Comp Sci, New Campus, POB 54890, Lahore, Pakistan
Keywords
Cervical cancer; causes of cervical cancer; feature selection in machine learning; ensemble learning; cancer prediction using machine learning; AdaBoost; CLASSIFICATION; INTEGRATION
DOI
10.2174/1574893615999200608130538
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Cervical cancer is a major cause of mortality in developing countries and one of the most prominent forms of cancer worldwide. Machine learning techniques have been shown to identify cervical cancer more accurately than manual screening methods such as Pap smear and Liquid Cytology Based (LCB) tests. Objective: These machine learning techniques primarily use images of the cervix for cervical cancer risk analysis; in this article, demographic data and medical records of patients are used instead to identify the major causes of cervical cancer. Furthermore, standard classification methods are normally applied when the dataset is balanced. This dataset, however, has far more negative cases than positive cases, so traditional binary classifiers are not sufficient to classify the examples of cervical cancer correctly. Methods: We identified the major causes of cervical cancer by employing multiple machine learning feature selection algorithms. After this selection, we trained different machine learning methods, including Decision Trees (DTs), Support Vector Machines (SVMs) and Ensemble Learners, using all features as well as only these important features. Results and Conclusion: AdaBoost classifies the instances of this unbalanced dataset into healthy and unhealthy classes with 96% accuracy. Based on this model and the significant causes of cervical cancer, we aim to develop a technique for self-risk assessment of cervical cancer, which women can use to estimate their risk of cervical cancer after answering some questions about their demographics and medical history.
Pages: 315-332
Page count: 18
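The abstract names AdaBoost as the best-performing classifier on an unbalanced dataset. The block below is a minimal sketch of the general AdaBoost idea (decision stumps re-weighted over several rounds), not the authors' implementation. The feature names (`age`, `num_risk_factors`), the number of rounds, and the ten toy records are all invented for illustration and are not taken from the paper's data. Because accuracy alone can hide missed positives on unbalanced data, the sketch also reports sensitivity on the positive class.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) for the best one-feature rule."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # Rule: predict `sign` when feature j exceeds thr, otherwise -sign.
                preds = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign

def adaboost_fit(X, y, rounds=5):
    """Fit an ensemble of weighted stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clip the error so the log below stays finite for a perfect stump.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Misclassified examples gain weight, correct ones lose weight.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score > 0 else -1

# Hypothetical toy records: (age, num_risk_factors); 8 negatives, 2 positives.
X = [(25, 0), (31, 1), (42, 0), (29, 1), (38, 2),
     (45, 1), (33, 0), (51, 2), (36, 4), (48, 5)]
y = [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1]

model = adaboost_fit(X, y, rounds=5)
preds = [adaboost_predict(model, row) for row in X]

accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
true_pos = sum(1 for p, t in zip(preds, y) if p == 1 and t == 1)
sensitivity = true_pos / sum(1 for t in y if t == 1)
print("accuracy:", accuracy, "sensitivity:", sensitivity)
```

On real records the classes would not separate this cleanly, and the evaluation would need held-out data rather than the training set used here.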