SALADNET: SELF-ATTENTIVE MULTISOURCE LOCALIZATION IN THE AMBISONICS DOMAIN

Cited by: 8

Authors:

Grumiaux, Pierre-Amaury [1]

Kitic, Srdan [1]

Srivastava, Prerak [2]

Girin, Laurent [3]

Guerin, Alexandre [1]

Affiliations:

[1] Orange Labs, Cesson Sevigne, France

[2] Univ Lorraine, INRIA, Nancy, France

[3] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France

Source:

2021 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2021

Keywords:

Sound source localization; neural networks; self-attention; Ambisonics; parallel computing

DOI:

10.1109/WASPAA52581.2021.9632737

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this work, we propose a novel self-attention-based neural network for robust multi-speaker localization from Ambisonics recordings. Starting from a state-of-the-art convolutional recurrent neural network (CRNN), we investigate the benefit of replacing the recurrent layers with self-attention encoders inherited from the Transformer architecture. We evaluate these models on synthetic and real-world data, with up to 3 simultaneous speakers. The results indicate that the majority of the proposed architectures either perform on par with or outperform the CRNN baseline, especially in the multisource scenario. Moreover, by avoiding recurrent layers, the proposed models lend themselves to parallel computing, which is shown to produce considerable savings in execution time.


Pages: 336-340

Page count: 5

