Deep Convolutional Neural Network with Scalogram for Audio Scene Modeling

Cited by: 17
Authors
Chen, Hangting [1, 2]
Zhang, Pengyuan [1, 2]
Bai, Haichuan [1, 2]
Yuan, Qingsheng [3]
Bao, Xiuguo [3]
Yan, Yonghong [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Natl Comp Network Emergency Response Tech Team Co, Beijing 100029, Peoples R China
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Funding
National Natural Science Foundation of China
Keywords
Acoustic scene classification; Scalogram; Convolutional neural network; DCASE2016;
DOI
10.21437/Interspeech.2018-1524
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning has recently improved the performance of acoustic scene classification. However, learning is usually based on the short-time Fourier transform and hand-tailored filters, and learning directly from raw signals remains a major challenge. In this paper, we propose an approach to learning audio scene patterns from the scalogram, which is extracted from the raw signal with simple wavelet transforms. Experiments were conducted on the DCASE2016 dataset. A comparison of the scalogram with classical Mel energy showed that the multi-scale feature led to a clear increase in accuracy. The convolutional neural network combined with the maximum-average downsampled scalogram achieved an accuracy of 90.5% in the evaluation step of DCASE2016.
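The abstract does not specify the wavelet family, the scales, or how the maximum-average downsampling is arranged, so the following is only a minimal illustrative sketch and not the authors' implementation. It assumes a complex Morlet wavelet, hypothetical parameters `center_freqs`, `n_cycles`, and `factor`, and one possible reading of "maximum-average downsampled": max pooling and average pooling over non-overlapping time blocks, stacked as two channels.

```python
import numpy as np


def morlet_scalogram(signal, sample_rate, center_freqs, n_cycles=6.0):
    """Return |wavelet transform| with one row per centre frequency.

    Assumption: a complex Morlet wavelet; the signal must be longer than
    the longest wavelet so that mode="same" keeps the signal length.
    """
    signal = np.asarray(signal, dtype=float)
    rows = []
    for freq in center_freqs:
        # Gaussian envelope width in seconds; lower frequencies get longer wavelets.
        sigma_t = n_cycles / (2.0 * np.pi * freq)
        half_len = int(np.ceil(4.0 * sigma_t * sample_rate))
        t = np.arange(-half_len, half_len + 1) / sample_rate
        wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
        wavelet /= np.abs(wavelet).sum()  # L1 normalisation across scales
        rows.append(np.abs(np.convolve(signal, wavelet, mode="same")))
    return np.stack(rows)


def max_average_downsample(scalogram, factor):
    """Pool non-overlapping time blocks of length `factor`.

    Returns an array of shape (2, n_scales, n_frames // factor):
    channel 0 is the block maximum, channel 1 is the block mean.
    Trailing frames that do not fill a whole block are dropped.
    """
    n_scales, n_frames = scalogram.shape
    usable = (n_frames // factor) * factor
    blocks = scalogram[:, :usable].reshape(n_scales, -1, factor)
    return np.stack([blocks.max(axis=2), blocks.mean(axis=2)])
```

For a one-second signal sampled at 8 kHz with nine centre frequencies and `factor=100`, `morlet_scalogram` would give a 9 x 8000 array and `max_average_downsample` a 2 x 9 x 80 array that could be fed to a convolutional network as a two-channel image.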
Pages: 3304-3308
Page count: 5