Data Augmentation using GAN for Sound based COVID 19 Diagnosis

Cited by: 5
Authors
Yella, Nishant [1]
Rajan, Bina [2]
Affiliations
[1] Malla Reddy Engn Coll, Dept Comp Sci & Engn, Hyderabad, India
[2] Sai Vidya Inst Technol, Dept Elect & Commun Engn, Bengaluru, India
Source
PROCEEDINGS OF THE 11TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS: TECHNOLOGY AND APPLICATIONS (IDAACS'2021), VOL 2 | 2021
Keywords
Generative adversarial networks; WaveGAN; Sound-based diagnosis
DOI
10.1109/IDAACS53288.2021.9660990
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The COVID-19 virus has been mutating at a rapid pace, and as a result the gold standard of testing, reverse transcription-polymerase chain reaction (RT-PCR), has been producing false negatives at an alarming rate. The inability of RT-PCR to detect mutated strains of the virus makes diagnosis difficult, so an alternative is needed. Sound-based diagnosis is one effective alternative. A major obstacle to developing a sound-based diagnosis tool is the lack of a large dataset. Dataset augmentation is an effective technique for classification problems in visual perception and speech recognition. Generative Adversarial Networks (GANs) have been highly successful at synthesizing realistic images, but they are rarely used in audio generation. Because too few datasets are available to develop an accurate model, this paper presents an application of WaveGAN, a GAN variant for raw audio synthesis, in a supervised setting for the classification task, and demonstrates one approach to augmenting speech datasets with GANs. We apply WaveGAN to existing datasets gathered from open-source collections to produce a larger, synthetic dataset for building an accurate sound-based diagnosis tool.
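The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `augment_with_synthetic` and `toy_generator` are hypothetical names, the generator is a sine-wave stand-in for a trained WaveGAN generator, and the latent size (100), the uniform latent range, and the clip length (16384 samples) are assumptions chosen for illustration.

```python
import numpy as np


def augment_with_synthetic(real_x, real_y, generate, n_per_class,
                           latent_dim=100, seed=0):
    """Append n_per_class generated clips per label to a labelled audio set.

    `generate(z, label)` is assumed to map a batch of latent vectors to a
    batch of waveforms; here it stands in for a trained generator.
    """
    rng = np.random.default_rng(seed)
    xs, ys = [real_x], [real_y]
    for label in np.unique(real_y):
        # Assumption: latent vectors are drawn uniformly from [-1, 1).
        z = rng.uniform(-1.0, 1.0, size=(n_per_class, latent_dim))
        xs.append(generate(z, label))
        ys.append(np.full(n_per_class, label, dtype=real_y.dtype))
    return np.concatenate(xs), np.concatenate(ys)


def toy_generator(z, label, n_samples=16384):
    """Hypothetical stand-in: one sine wave per latent vector."""
    t = np.linspace(0.0, 1.0, n_samples, endpoint=False)
    freq = 200.0 + 100.0 * label      # class-dependent pitch, arbitrary
    phase = z[:, :1] * np.pi          # first latent coordinate sets the phase
    return np.sin(2.0 * np.pi * freq * t[None, :] + phase).astype(np.float32)


# Four "real" clips (two per class), padded with three synthetic clips each.
real_x = np.zeros((4, 16384), dtype=np.float32)
real_y = np.array([0, 0, 1, 1])
aug_x, aug_y = augment_with_synthetic(real_x, real_y, toy_generator,
                                      n_per_class=3)
```

In practice the stand-in would be replaced by a generator trained on the real recordings, and the classifier would be evaluated on real held-out clips only, so that synthetic data never leaks into the test set.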
Pages: 606-609
Page count: 4