SALADNET: SELF-ATTENTIVE MULTISOURCE LOCALIZATION IN THE AMBISONICS DOMAIN

Cited by: 8
Authors
Grumiaux, Pierre-Amaury [1]
Kitic, Srdan [1]
Srivastava, Prerak [2]
Girin, Laurent [3]
Guerin, Alexandre [1]
Affiliations
[1] Orange Labs, Cesson Sevigne, France
[2] Univ Lorraine, INRIA, Nancy, France
[3] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France
Source
2021 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2021
Keywords
Sound source localization; neural networks; self-attention; Ambisonics; parallel computing
DOI
10.1109/WASPAA52581.2021.9632737
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this work, we propose a novel self-attention-based neural network for robust multi-speaker localization from Ambisonics recordings. Starting from a state-of-the-art convolutional recurrent neural network (CRNN), we investigate the benefit of replacing the recurrent layers with self-attention encoders, inherited from the Transformer architecture. We evaluate these models on synthetic and real-world data, with up to 3 simultaneous speakers. The results indicate that the majority of the proposed architectures either perform on par with or outperform the CRNN baseline, especially in the multisource scenario. Moreover, by avoiding the recurrent layers, the proposed models lend themselves to parallel computing, which is shown to produce considerable savings in execution time.
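The abstract's point about parallelism can be illustrated with a minimal sketch of single-head scaled dot-product self-attention. This is not the authors' architecture: the function names `softmax` and `self_attention` are hypothetical, the query, key, and value projections are assumed to be identity maps, and the input is assumed to be a plain list of per-frame feature vectors. The sketch only shows that each output frame depends on the inputs alone, not on a previously computed hidden state, which is what allows the frames to be processed in parallel.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: identity query/key/value projections.
    frames: list of T feature vectors, each of dimension d.
    Returns a list of T output vectors of dimension d.
    """
    d = len(frames[0])
    scale = 1.0 / math.sqrt(d)
    outputs = []
    # Each iteration reads only `frames`, never an earlier output,
    # so the T iterations could run in any order or concurrently.
    # A recurrent layer, by contrast, needs the state from step t-1.
    for query in frames:
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in frames]
        weights = softmax(scores)
        outputs.append([sum(w * value[j] for w, value in zip(weights, frames))
                        for j in range(d)])
    return outputs
```

Each output vector is a weighted average of all input frames, so the output sequence has the same length and dimension as the input.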
Pages: 336-340
Number of pages: 5