COVID-19 chest X-ray image classification in the presence of noisy labels*

Cited by: 7
Authors
Ying, Xiaoqing [1]
Liu, Hao [1, 2]
Huang, Rong [1 ]
Affiliations
[1] Donghua Univ, College Informat Sci & Technol, Shanghai 201620, Peoples R China
[2] Minist Educ, Engn Res Ctr Digitized Text & Apparel Technol, Shanghai 201620, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
COVID-19; Chest X-ray image classification; Noisy label; Label recovery; Feature extraction; REGRESSION
DOI
10.1016/j.displa.2023.102370
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Coronavirus disease 2019 (COVID-19) has been declared a worldwide pandemic, and chest X-ray imaging is a key method for diagnosing it. Applying convolutional neural networks to medical imaging helps to diagnose the disease accurately, and label quality plays an important role in the classification of COVID-19 chest X-rays. However, most existing classification methods ignore the fact that labels are rarely entirely correct, and noisy labels significantly degrade the performance of image classification frameworks. In addition, because lesions in COVID-19 chest X-ray images are widely distributed and the images contain many local features, existing label recovery algorithms struggle to reuse noisy samples. This paper therefore introduces a general classification framework for COVID-19 chest X-ray images with noisy labels and proposes a noisy label recovery algorithm based on subset label iterative propagation and replacement (SLIPR). Specifically, the proposed algorithm first draws random subsets of the samples multiple times. It then combines principal component analysis, low-rank representation, neighborhood graph regularization, and k-nearest neighbor for feature extraction and image classification. Finally, multi-level weight distribution and replacement are performed on the labels to cleanse the noise. For the label-recovered dataset, high-confidence samples are further selected as the training set to improve the stability and accuracy of the classification framework without affecting its inherent performance. Extensive experiments compare the proposed algorithm with existing algorithms under different metrics on three publicly available COVID-19 chest X-ray image datasets. The results show that the proposed algorithm can effectively recover noisy labels and improves the accuracy of the image classification framework by 18.9% on the Tawsifur dataset, 19.92% on the Skytells dataset, and 16.72% on the CXRs dataset. Compared with state-of-the-art algorithms, SLIPR gains 8.67%-19.38% in classification accuracy on the three datasets, and it also offers a degree of scalability while preserving data integrity.
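The general idea of recovering labels by repeated subset voting can be illustrated with a minimal sketch. This is not the authors' SLIPR implementation: it omits principal component analysis, low-rank representation, graph regularization, and the multi-level weighting, and it replaces them with plain k-nearest-neighbor majority voting over random subsets. The function names (`knn_vote`, `recover_labels`) and all parameters are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def knn_vote(point, candidates, k):
    """Return the majority label among the k candidates nearest to point.

    candidates is a list of (feature_vector, label) pairs; distance is
    squared Euclidean distance.
    """
    nearest = sorted(
        candidates,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, c[0])),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def recover_labels(features, labels, n_rounds=30, subset_frac=0.6, k=3, seed=0):
    """Illustrative label cleaning (an assumption, not SLIPR itself).

    In each round a random subset of samples is drawn, every sample
    receives a kNN vote from that subset, and after all rounds each
    label is replaced by the label that collected the most votes.
    """
    rng = random.Random(seed)
    n = len(features)
    votes = [Counter() for _ in range(n)]
    # Keep at least k + 1 samples so k neighbors remain after excluding self.
    subset_size = min(n, max(k + 1, int(subset_frac * n)))
    for _ in range(n_rounds):
        idx = rng.sample(range(n), subset_size)
        for i in range(n):
            # Exclude the sample itself so its own, possibly noisy,
            # label cannot vote for itself.
            others = [(features[j], labels[j]) for j in idx if j != i]
            votes[i][knn_vote(features[i], others, k)] += 1
    return [votes[i].most_common(1)[0][0] for i in range(n)]
```

Calling `recover_labels(features, noisy_labels)` on a list of feature tuples and a list of labels returns a new label list of the same length. In practice the feature vectors would come from a dimensionality-reduction step such as PCA, which this sketch leaves out.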
Pages: 11