Variational Autoencoder Based Imbalanced COVID-19 Detection Using Chest X-Ray Images

Cited by: 5
Authors
Chatterjee, Sankhadeep [1]
Maity, Soumyajit [2]
Bhattacharjee, Mayukh [2]
Banerjee, Soumen [3]
Das, Asit Kumar [1]
Ding, Weiping [4]
Affiliations
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Sibpur, W Bengal, India
[2] Univ Engn & Management, Dept Comp Sci & Engn, Kolkata, W Bengal, India
[3] Budge Budge Inst Technol, Dept Elect & Commun Engn, Kolkata 700137, W Bengal, India
[4] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Jiangsu, Peoples R China
Keywords
COVID-19; Class imbalance; Variational autoencoder; Oversampling; Undersampling; BORDERLINE-SMOTE; SAMPLING METHOD; CLASSIFICATION; CANCER; PERFORMANCE; PREDICTION;
DOI
10.1007/s00354-022-00194-y
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Early and fast detection of the disease is essential in the fight against the COVID-19 pandemic. Researchers have focused on developing robust and cost-effective detection methods using deep learning based chest X-Ray image processing. However, such prediction models are often not well suited to highly imbalanced datasets. The current work attempts to address this issue using unsupervised Variational Autoencoders (VAEs). First, chest X-Ray images are mapped to a latent space by learning their most important features with a VAE. Second, a wide range of well-established data resampling techniques are used to balance the imbalanced classes in the latent-vector form of the dataset. Finally, the resampled dataset in the new feature space is used to train well-known classification models to classify chest X-Ray images into three classes, viz. "COVID-19", "Pneumonia", and "Normal". To assess the quality of the resampling methods, 10-fold cross-validation is applied to the dataset. Extensive experimental analysis has been carried out, and the results indicate a significant improvement in COVID-19 detection using the proposed VAE-based method. Furthermore, the validity of the results has been established by performing the Wilcoxon rank test at the 95% confidence level.
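The resampling step described in the abstract (balancing classes in the latent space before training a classifier) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the abstract does not say which resampling methods or latent dimension the authors used, so the code uses a simple SMOTE-style interpolation (the keywords mention Borderline-SMOTE, which is a different, more selective variant), two-dimensional stand-in vectors instead of real VAE encodings, and hypothetical class sizes. The function names `make_toy_latents` and `smote_like_oversample` are invented here.

```python
import math
import random
from collections import Counter


def make_toy_latents(seed=0):
    """Stand-in for VAE-encoded images: 2-D points around one centre per class.

    The class sizes and centres are hypothetical, chosen only to be imbalanced.
    """
    rng = random.Random(seed)
    sizes = {"Normal": 40, "Pneumonia": 25, "COVID-19": 6}
    centres = {"Normal": (0.0, 0.0), "Pneumonia": (3.0, 0.0), "COVID-19": (0.0, 3.0)}
    vectors, labels = [], []
    for cls, n in sizes.items():
        cx, cy = centres[cls]
        for _ in range(n):
            vectors.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            labels.append(cls)
    return vectors, labels


def smote_like_oversample(vectors, labels, k=3, seed=0):
    """Grow every class to the majority size by interpolating between a
    randomly chosen sample and one of its k nearest same-class neighbours."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_vectors = [list(v) for v in vectors]
    out_labels = list(labels)
    for cls, count in counts.items():
        members = [v for v, y in zip(vectors, labels) if y == cls]
        if len(members) < 2:
            continue  # interpolation needs at least two points
        for _ in range(target - count):
            base = rng.choice(members)
            # Nearest same-class neighbours of the chosen point, excluding itself.
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            t = rng.random()  # position along the segment base -> other
            out_vectors.append([b + t * (o - b) for b, o in zip(base, other)])
            out_labels.append(cls)
    return out_vectors, out_labels


latents, classes = make_toy_latents()
balanced_latents, balanced_classes = smote_like_oversample(latents, classes)
print("before:", dict(Counter(classes)))
print("after: ", dict(Counter(balanced_classes)))
```

In a full pipeline the balanced latent vectors would then be passed to a classifier and evaluated with cross-validation; in that setting resampling is normally applied to the training folds only, so that synthetic points never leak into the held-out fold.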
Pages: 25-60
Number of pages: 36