Classification of imbalanced oral cancer image data from high-risk population

Cited by: 14
Authors
Song, Bofan [1]
Li, Shaobai [1]
Sunny, Sumsum [2]
Gurushanth, Keerthi [3]
Mendonca, Pramila [4]
Mukhia, Nirza [3]
Patrick, Sanjana [5]
Gurudath, Shubha [3]
Raghavan, Subhashini [3]
Tsusennaro, Imchen [6]
Leivon, Shirley T. [6]
Kolur, Trupti [4]
Shetty, Vivek [4]
Bushan, Vidya [4]
Ramesh, Rohan [6]
Peterson, Tyler [1]
Pillai, Vijay [4]
Wilder-Smith, Petra [7, 8]
Sigamani, Alben [4]
Suresh, Amritha [2, 4]
Kuriakose, Moni Abraham [9]
Birur, Praveen [3, 5]
Liang, Rongguang [1]
Affiliations
[1] Univ Arizona, Wyant Coll Opt Sci, Tucson, AZ 85721 USA
[2] Mazumdar Shaw Med Ctr, Bangalore, Karnataka, India
[3] KLE Soc Inst Dent Sci, Bangalore, Karnataka, India
[4] Mazumdar Shaw Med Fdn, Bangalore, Karnataka, India
[5] Biocon Fdn, Bangalore, Karnataka, India
[6] Christian Inst Hlth Sci & Res, Dimapur, India
[7] Univ Calif Irvine, Beckman Laser Inst, Irvine, CA USA
[8] Univ Calif Irvine, Med Clin, Irvine, CA USA
[9] Cochin Canc Res Ctr, Kochi, Kerala, India
Funding
National Institutes of Health (USA)
Keywords
oral cancer; mobile screening device; imbalanced multi-class datasets; deep learning; ensemble learning
DOI
10.1117/1.JBO.26.10.105001
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Significance: Early detection of oral cancer is vital for high-risk patients, and machine learning-based automatic classification is ideal for disease screening. However, current datasets collected from high-risk populations are imbalanced, which often degrades classification performance.
Aim: To reduce the class bias caused by data imbalance.
Approach: We collected 3851 polarized white light cheek mucosa images using our customized oral cancer screening device. We use weight balancing, data augmentation, undersampling, focal loss, and ensemble methods to improve neural network performance for oral cancer image classification on the imbalanced multi-class datasets captured from high-risk populations during oral cancer screening in low-resource settings.
Results: By applying both data-level and algorithm-level approaches to the deep learning training process, the performance of the minority classes, which were difficult to distinguish at the beginning, improved. The accuracy of the "premalignancy" class also increased, which is ideal for screening applications.
Conclusions: Experimental results show that the class bias induced by imbalanced oral cancer image datasets could be reduced using both data- and algorithm-level methods. Our study may provide an important basis for understanding the influence of imbalanced datasets on oral cancer deep learning classifiers and how to mitigate it.
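Two of the algorithm-level methods named in the abstract, weight balancing and focal loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the inverse-frequency weighting scheme, the value `gamma=2.0`, the function names, and the toy labels and probabilities are all assumptions chosen for illustration.

```python
import numpy as np

def inverse_frequency_weights(labels, num_classes):
    """Per-class weights inversely proportional to class frequency (assumed scheme)."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts = np.maximum(counts, 1.0)  # guard against classes with no samples
    return counts.sum() / (num_classes * counts)

def focal_loss(probs, labels, class_weights=None, gamma=2.0):
    """Mean multi-class focal loss: -w_y * (1 - p_y)**gamma * log(p_y)."""
    # Probability the model assigns to the true class of each sample.
    p_true = probs[np.arange(len(labels)), labels]
    p_true = np.clip(p_true, 1e-12, 1.0)  # avoid log(0)
    # The (1 - p)**gamma factor down-weights samples that are already easy.
    loss = -((1.0 - p_true) ** gamma) * np.log(p_true)
    if class_weights is not None:
        loss = loss * class_weights[labels]
    return loss.mean()

# Hypothetical toy data: 3 classes, with class 2 as the minority.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 2])
probs = np.tile(np.array([0.6, 0.3, 0.1]), (len(labels), 1))

weights = inverse_frequency_weights(labels, num_classes=3)
weighted_focal = focal_loss(probs, labels, class_weights=weights)
```

With `gamma=0` and no class weights, `focal_loss` reduces to ordinary cross-entropy, which makes it easy to compare against a baseline before tuning either setting.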
Pages: 9