Data Augmentation Using Generative Adversarial Network for Environmental Sound Classification

Cited by: 23
Authors
Madhu, Aswathy [1]
Kumaraswamy, Suresh [2]
Affiliations
[1] Coll Engn, Dept ECE, Thiruvananthapuram, Kerala, India
[2] Govt Engn Coll, Dept ECE, Barton Hill, Thiruvananthapuram, Kerala, India
Source
2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2019
Keywords
data augmentation; generative adversarial network; deep learning; environmental sound classification; RECOGNITION;
DOI
10.23919/eusipco.2019.8902819
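Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808 ; 0809 ;
Abstract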
Various deep learning architectures have been steadily gaining momentum for automatic environmental sound classification. However, the relative paucity of publicly accessible datasets hinders further improvement in this direction. This work makes two principal contributions. First, we put forward a deep learning framework that employs a convolutional neural network for automatic environmental sound classification. Second, we investigate the possibility of generating synthetic data through data augmentation. We propose a novel technique for audio data augmentation using a generative adversarial network (GAN). The proposed model, together with data augmentation, is evaluated on the UrbanSound8K dataset. The results confirm that the proposed method surpasses state-of-the-art methods for data augmentation.
Pages: 5