LDA-based data augmentation algorithm for acoustic scene classification

Cited by: 15
Authors
Leng, Yan [1, 2]
Zhao, Weiwei [1, 2]
Lin, Chan [1, 2]
Sun, Chengli [3]
Wang, Rongyan [4]
Yuan, Qi [1, 2]
Li, Dengwang [1, 2]
Affiliations
[1] Shandong Normal Univ, Shandong Key Lab Med Phys & Image Proc, Sch Phys & Elect, Jinan 250358, Shandong, Peoples R China
[2] Shandong Normal Univ, Shandong Prov Engn & Tech Ctr Light Manipulat, Sch Phys & Elect, Jinan 250358, Shandong, Peoples R China
[3] Nanchang Hangkong Univ, Sch Informat, Nanchang 330063, Jiangxi, Peoples R China
[4] Dezhou Univ, Sch Informat Management, Dezhou 253023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Acoustic scene classification; Topic model; LDA; Key audio event; Non-key audio event; Neural networks
DOI
10.1016/j.knosys.2020.105600
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks need large amounts of training data, and many simple data augmentation algorithms have been proposed to obtain more. In this paper, we propose an LDA-based data augmentation algorithm to extend the training set. The algorithm uses the topic model LDA to detect the key audio words in the recordings, and from these the key audio events and non-key audio events of each recording. From the detected key-audio-event segments, three probability distributions are estimated for each acoustic scene class: the distribution of the number of key-audio-event occurrences, the distribution of key-audio-event locations under each occurrence number, and the distribution of key-audio-event durations under each occurrence number. New recordings are then generated according to these distributions. Experiments on the public TUT Acoustic Scenes 2016 dataset show that, compared with other simple data augmentation algorithms, the proposed algorithm is more stable and effective, and generalizes better across different kinds of neural networks and different datasets. (C) 2020 Elsevier B.V. All rights reserved.
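The statistics step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the key-audio-event segments have already been detected (the LDA step is not shown), it uses plain empirical distributions, and the function names, the `(start_sec, duration_sec)` segment format, and the toy data are all hypothetical. It only samples a layout of key events for one scene class; assembling actual audio from that layout is left out.

```python
import random
from collections import Counter, defaultdict


def fit_event_statistics(recordings):
    """Collect empirical statistics for one acoustic scene class.

    recordings: list of recordings, each a list of (start_sec, duration_sec)
    tuples for the key-audio-event segments detected in that recording.
    (Hypothetical input format, assumed for illustration.)
    """
    count_hist = Counter()                  # occurrence number -> recordings
    starts_by_count = defaultdict(list)     # occurrence number -> start times
    durations_by_count = defaultdict(list)  # occurrence number -> durations
    for segments in recordings:
        n = len(segments)
        count_hist[n] += 1
        for start, duration in segments:
            starts_by_count[n].append(start)
            durations_by_count[n].append(duration)
    return count_hist, starts_by_count, durations_by_count


def sample_event_layout(stats, rng):
    """Draw an occurrence number, then locations and durations given it."""
    count_hist, starts_by_count, durations_by_count = stats
    counts = list(count_hist)
    weights = [count_hist[c] for c in counts]
    n = rng.choices(counts, weights=weights, k=1)[0]
    layout = [
        (rng.choice(starts_by_count[n]), rng.choice(durations_by_count[n]))
        for _ in range(n)
    ]
    return sorted(layout)


# Toy data for a single scene class (made up for this sketch).
toy_recordings = [
    [(2.0, 1.5)],
    [(0.5, 1.0), (12.0, 2.0)],
    [(3.0, 0.8), (20.0, 1.2)],
    [],
]
stats = fit_event_statistics(toy_recordings)
rng = random.Random(0)
layout = sample_event_layout(stats, rng)
```

Conditioning the location and duration samples on the drawn occurrence number mirrors the "under each occurrence number" wording of the abstract. A fuller version would presumably also handle overlapping segments and fill the gaps with non-key audio events.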
Pages: 9