Acoustic Event Detection with Classifier Chains

Times cited: 4
Authors
Komatsu, Tatsuya [1]
Watanabe, Shinji [2]
Miyazaki, Koichi [3]
Hayashi, Tomoki [3, 4]
Affiliations
[1] LINE Corp, Tokyo, Japan
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Nagoya Univ, Nagoya, Aichi, Japan
[4] Human Dataware Lab Co Ltd, Nagoya, Aichi, Japan
Source
INTERSPEECH 2021 | 2021
Keywords
acoustic event detection; multi-label classification; chain rule; classifier chains;
DOI
10.21437/Interspeech.2021-2218
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
This paper proposes acoustic event detection (AED) with classifier chains, a new classifier based on the probabilistic chain rule. The proposed AED with classifier chains consists of a gated recurrent unit and performs iterative binary detection of each event, one by one. In each iteration, the event's activity is estimated and used to condition the next output based on the probabilistic chain rule, forming classifier chains. The proposed method can therefore handle the interdependence among events during classification, whereas conventional AED methods, which use multiple binary classifiers built from a linear layer and a sigmoid function, assume conditional independence among events. In experiments with a real-recording dataset, the proposed method achieves a relative improvement of 14.80% in AED performance over a convolutional recurrent neural network baseline system with multiple binary classifiers.
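The iterative detection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the GRUCell class, the classifier_chain function, the random untrained weights, the 0.5 threshold, and the choice to feed the previous hard decision back as an extra input are all assumptions made for illustration. The idea shown is that each event's probability is conditioned on the decisions already made for earlier events, so the joint probability factorizes by the chain rule rather than as a product of independent sigmoids.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUCell:
    """Plain GRU cell with random (untrained) weights; hypothetical helper."""

    def __init__(self, input_dim, hidden_dim, rng):
        # Index 0: update gate, 1: reset gate, 2: candidate state.
        self.W = rng.normal(0.0, 0.1, (3, hidden_dim, input_dim))
        self.U = rng.normal(0.0, 0.1, (3, hidden_dim, hidden_dim))
        self.b = np.zeros((3, hidden_dim))

    def step(self, x, h):
        z = sigmoid(self.W[0] @ x + self.U[0] @ h + self.b[0])
        r = sigmoid(self.W[1] @ x + self.U[1] @ h + self.b[1])
        n = np.tanh(self.W[2] @ x + self.U[2] @ (r * h) + self.b[2])
        return (1.0 - z) * n + z * h


def classifier_chain(frame_feature, cell, w_out, b_out, num_events, threshold=0.5):
    """Detect events one by one, conditioning each step on earlier decisions.

    Returns per-event probabilities, binary decisions, and the chain-rule
    joint probability of the decided label sequence.
    """
    h = np.zeros(cell.U.shape[1])
    prev_decision = 0.0  # assumption: no event decided before the first step
    probs, decisions, joint = [], [], 1.0
    for _ in range(num_events):
        # Input = frame feature plus the previous event's estimated activity.
        x = np.concatenate([frame_feature, [prev_decision]])
        h = cell.step(x, h)  # hidden state carries the history of decisions
        p = float(sigmoid(w_out @ h + b_out))
        y = 1.0 if p >= threshold else 0.0
        joint *= p if y == 1.0 else 1.0 - p
        probs.append(p)
        decisions.append(y)
        prev_decision = y
    return np.array(probs), np.array(decisions), joint


rng = np.random.default_rng(0)
feature_dim, hidden_dim, num_events = 8, 16, 5
cell = GRUCell(feature_dim + 1, hidden_dim, rng)
w_out = rng.normal(0.0, 0.1, hidden_dim)
probs, decisions, joint = classifier_chain(
    rng.normal(size=feature_dim), cell, w_out, 0.0, num_events
)
print(probs, decisions, joint)
```

By contrast, the conventional setup described in the abstract would compute all event probabilities in one shot from a single linear layer followed by a sigmoid, with no feedback from one event's decision to the next.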
Pages: 601-605
Page count: 5