Semi-supervised Audio Classification with Consistency-Based Regularization

Cited by: 10
Authors
Lu, Kangkang [1]
Foo, Chuan-Sheng [1]
Teh, Kah Kuan [1]
Huy Dat Tran [1]
Chandrasekhar, Vijay Ramaseshan [1]
Affiliation
[1] ASTAR, Inst Infocomm Res, Singapore, Singapore
Source
INTERSPEECH 2019 | 2019
Keywords
audio classification; semi-supervised learning; data interpolation; data augmentation;
DOI
10.21437/Interspeech.2019-1231
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Consistency-based semi-supervised learning methods such as the Mean Teacher method are state-of-the-art on image datasets, but have yet to be applied to audio data. Such methods encourage model predictions to be consistent on perturbed input data. In this paper, we incorporate audio-specific perturbations into the Mean Teacher algorithm and demonstrate the effectiveness of the resulting method on audio classification tasks. Specifically, we perturb audio inputs by mixing in other environmental audio clips, and leverage other training examples as sources of noise. Experiments on the Google Speech Command Dataset and UrbanSound8K Dataset show that the method can achieve comparable performance to a purely supervised approach while using only a fraction of the labels.
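The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mixing weight `alpha`, and the `decay` value are hypothetical choices made for illustration, and clips are represented as plain lists of samples. The sketch shows (a) perturbing a clip by mixing in another clip, (b) a mean squared error consistency term between student and teacher predictions, and (c) the exponential moving average update that defines the teacher in the Mean Teacher method.

```python
def mix_with_background(clip, background, alpha):
    """Perturb a clip by blending in a second clip of the same length.

    alpha is an assumed mixing weight in [0, 1]; the abstract does not
    specify how the mixing ratio is chosen.
    """
    if len(clip) != len(background):
        raise ValueError("clips must have the same length")
    return [(1.0 - alpha) * c + alpha * b for c, b in zip(clip, background)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared error between student and teacher class probabilities."""
    if len(student_probs) != len(teacher_probs):
        raise ValueError("prediction vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(squared) / len(squared)


def ema_update(teacher_weights, student_weights, decay=0.99):
    """Move teacher weights toward student weights by an exponential moving average.

    The decay value is an illustrative default, not one taken from the paper.
    """
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]
```

In a training loop one would typically feed differently perturbed versions of the same clip to the student and the teacher, add the consistency term to the supervised loss on labeled clips, and call `ema_update` after each optimizer step.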
Pages: 3654-3658
Page count: 5