SALADNET: SELF-ATTENTIVE MULTISOURCE LOCALIZATION IN THE AMBISONICS DOMAIN

Cited by: 8
Authors
Grumiaux, Pierre-Amaury [1]
Kitic, Srdan [1]
Srivastava, Prerak [2]
Girin, Laurent [3]
Guerin, Alexandre [1]
Affiliations
[1] Orange Labs, Cesson Sevigne, France
[2] Univ Lorraine, INRIA, Nancy, France
[3] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France
Source
2021 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2021
Keywords
Sound source localization; neural networks; self-attention; Ambisonics; parallel computing
DOI
10.1109/WASPAA52581.2021.9632737
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification code
070206; 082403
Abstract
In this work, we propose a novel self-attention-based neural network for robust multi-speaker localization from Ambisonics recordings. Starting from a state-of-the-art convolutional recurrent neural network (CRNN), we investigate the benefit of replacing the recurrent layers with self-attention encoders inherited from the Transformer architecture. We evaluate these models on synthetic and real-world data, with up to 3 simultaneous speakers. The results indicate that most of the proposed architectures perform on par with or better than the CRNN baseline, especially in the multisource scenario. Moreover, by avoiding the recurrent layers, the proposed models lend themselves to parallel computing, which is shown to produce considerable savings in execution time.
Pages: 336 - 340
Number of pages: 5
Related papers
44 in total
  • [1] SAED: self-attentive energy disaggregation
    Virtsionis-Gkalinikis, Nikolaos
    Nalmpantis, Christoforos
    Vrakas, Dimitris
    MACHINE LEARNING, 2023, 112 (11) : 4081 - 4100
  • [2] SAED: self-attentive energy disaggregation
    Nikolaos Virtsionis-Gkalinikis
    Christoforos Nalmpantis
    Dimitris Vrakas
    Machine Learning, 2023, 112 : 4081 - 4100
  • [3] A SELF-ATTENTIVE EMOTION RECOGNITION NETWORK
    Partaourides, Harris
    Papadamou, Kostantinos
    Kourtellis, Nicolas
    Leontiades, Ilias
    Chatzis, Sotirios
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7199 - 7203
  • [4] Self-Attentive Spatial Adaptive Normalization for Cross-Modality Domain Adaptation
    Tomar, Devavrat
    Lortkipanidze, Manana
    Vray, Guillaume
    Bozorgtabar, Behzad
    Thiran, Jean-Philippe
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2021, 40 (10) : 2926 - 2938
  • [5] SAIN: Self-Attentive Integration Network for Recommendation
    Yun, Seoungjun
    Kim, Raehyun
    Ko, Miyoung
    Kang, Jaewoo
    PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19), 2019, : 1205 - 1208
  • [6] Self-Attentive Similarity Measurement Strategies in Speaker Diarization
    Lin, Qingjian
    Hou, Yu
    Li, Ming
    INTERSPEECH 2020, 2020, : 284 - 288
  • [7] A self-attentive model for tracing knowledge and engagement in parallel
    Jiang, Hua
    Xiao, Bing
    Luo, Yintao
    Ma, Junliang
    PATTERN RECOGNITION LETTERS, 2023, 165 : 25 - 32
  • [8] SELF-ATTENTIVE SENTIMENTAL SENTENCE EMBEDDING FOR SENTIMENT ANALYSIS
    Lin, Sheng-Chieh
    Su, Wen-Yuh
    Chien, Po-Chuan
    Tsai, Ming-Feng
    Wang, Chuan-Ju
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 1678 - 1682
  • [9] Self-Attentive Attributed Network Embedding Through Adversarial Learning
    Yu, Wenchao
    Cheng, Wei
    Aggarwal, Charu
    Zong, Bo
    Chen, Haifeng
    Wang, Wei
    2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019, : 758 - 767
  • [10] SELF-ATTENTIVE NETWORKS FOR ONE-SHOT IMAGE RECOGNITION
    Fang, Pin
    Wang, Yisen
    Luo, Yuan
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 934 - 939