Hierarchical classification for acoustic scenes using deep learning

Cited by: 3
Authors
Ding, Biyun [1]
Zhang, Tao [1]
Liu, Ganjun [1]
Wang, Chao [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
基金
中国国家自然科学基金;
关键词
Acoustic scene classification; Convolutional neural network; Data augmentation; Hierarchical classification; Late fusion;
D O I
10.1016/j.apacoust.2023.109594
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Acoustic Scene Classification (ASC) aims to identify the sound environment by analyzing audio signals. Because audio signals have low complexity and are cheap to acquire, ASC has enormous potential in applications such as audio-based surveillance, smart cities and homes, and robotics. Various ASC methods have been proposed recently and achieve good performance, but most of them perform poorly on complex ASC problems. In this paper, we propose replacing the conventional flat approach in ASC applications with hierarchical classification methods, which use the class hierarchy to optimize classification performance. In particular, we investigate the ASC problem under the framework of hierarchical classification. First, to improve classification performance, three hierarchical classification methods that introduce the class hierarchy of acoustic scenes are proposed for ASC. Moreover, to fully utilize the class hierarchy, a hybrid hierarchical classification method and an optimal late fusion-based hierarchical method are proposed, building on the flexibility and simplification of hierarchical classification. The experiments demonstrate the efficacy of hierarchical ASC systems for performance improvement: the best system achieves an accuracy of 78.86% on the DCASE 2020 Task 1A dataset, an absolute accuracy gain of 24.76% over the DCASE 2020 Task 1A baseline and of 8.52% over the conventional non-hierarchical method.
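To make the contrast between flat and hierarchical classification concrete, the following is a minimal Python sketch of two-level hierarchical inference followed by a weighted late fusion. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not list the class hierarchy or the fusion rule, so the three-group hierarchy, the function names `hierarchical_posterior` and `late_fusion`, the fusion weights, and all probabilities below are hypothetical.

```python
import numpy as np

# Hypothetical two-level hierarchy (an assumption; the abstract does not
# specify the hierarchy used in the paper).
HIERARCHY = {
    "indoor": ["airport", "shopping_mall", "metro_station"],
    "outdoor": ["park", "public_square", "street_pedestrian", "street_traffic"],
    "transportation": ["bus", "metro", "tram"],
}
COARSE = list(HIERARCHY)
FINE = [scene for group in HIERARCHY.values() for scene in group]


def hierarchical_posterior(p_coarse, p_fine_given_coarse):
    """Combine the two levels: P(fine) = P(coarse) * P(fine | coarse)."""
    parts = [
        p_coarse[i] * np.asarray(p_fine_given_coarse[name], dtype=float)
        for i, name in enumerate(COARSE)
    ]
    return np.concatenate(parts)


def late_fusion(posteriors, weights):
    """Weighted average of posteriors from several systems (one simple fusion rule)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the fused vector still sums to 1
    return np.tensordot(w, np.stack(posteriors), axes=1)


# Toy classifier outputs for one audio clip (made-up numbers).
p_coarse = np.array([0.2, 0.7, 0.1])
p_fine_given_coarse = {
    "indoor": [0.5, 0.3, 0.2],
    "outdoor": [0.1, 0.2, 0.6, 0.1],
    "transportation": [0.4, 0.4, 0.2],
}

p_hier = hierarchical_posterior(p_coarse, p_fine_given_coarse)
p_flat = np.full(len(FINE), 1.0 / len(FINE))  # stand-in for a flat classifier
p_fused = late_fusion([p_hier, p_flat], weights=[0.8, 0.2])
predicted = FINE[int(np.argmax(p_fused))]
print(predicted)
```

In this sketch the coarse classifier narrows the decision to a scene group and a per-group classifier resolves the scene within it; the fusion step then averages the hierarchical posterior with that of a second system. The fusion weights here are fixed by hand, whereas a tuned system would typically select them on validation data.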
Pages: 15