Dynamic Convolution With Global-Local Information for Session-Invariant Speaker Representation Learning

Cited by: 6
Authors
Gu, Bin [1]
Guo, Wu [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230036, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Convolution; Kernel; NIST; Training; Training data; Time-frequency analysis; Neural networks; Speaker verification; dynamic convolution; mismatch problem; acoustic variability; RECOGNITION;
DOI
10.1109/LSP.2021.3136141
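Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract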
Various mismatched conditions degrade the performance of speaker verification (SV) systems. To address this issue, we extract robust speaker representations by devising a dynamic convolutional neural network based on global-local information. In the proposed method, both global and local information from the input features are exploited to dynamically modify the convolution kernel values. This increases the model's ability to capture speaker characteristics by compensating for both inter- and intra-session variability. Extensive experiments on four publicly available SV datasets show significant and consistent improvements over conventional approaches. The effectiveness of the proposed method is further investigated through ablation studies and visualizations.
Pages: 404-408
Number of pages: 5
Related papers
50 in total
  • [21] Contrastive Learning of Global-Local Video Representations
    Ma, Shuang
    Zeng, Zhaoyang
    McDuff, Daniel
    Song, Yale
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [22] Enhanced Local and Global Learning for Rotation-Invariant Point Cloud Representation
    Gu, Ruibin
    Wu, Qiuxia
    Li, Yuqiong
    Kang, Wenxiong
    Ng, Wing W. Y.
    Wang, Zhiyong
    IEEE MULTIMEDIA, 2022, 29 (04) : 24 - 37
  • [23] Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
    Rao, Yongming
    Lu, Jiwen
    Zhou, Jie
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020 : 5375 - 5384
  • [24] Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition
    Gao, Yuefang
    Xie, Yuhao
    Hu, Zeke Zexi
    Chen, Tianshui
    Lin, Liang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6676 - 6688
  • [26] Global-local neighborhood based network representation for citation recommendation
    Cai, Xiaoyan
    Wang, Nanxin
    Yang, Libin
    Mei, Xin
    APPLIED INTELLIGENCE, 2022, 52 (09) : 10098 - 10115
  • [27] Learning and teaching sustainable development in global-local contexts
    Norden, Birgitta
    Anderberg, Elsie
    Sonesson, Kerstin
    Reid, Alan
    ENVIRONMENTAL EDUCATION RESEARCH, 2018, 24 (05) : 772 - 773
  • [28] EIGAT: Incorporating global information in local attention for knowledge representation learning
    Zhao, Yu
    Feng, Huali
    Zhou, Han
    Yang, Yanruo
    Chen, Xingyan
    Xie, Ruobing
    Zhuang, Fuzhen
    Li, Qing
    KNOWLEDGE-BASED SYSTEMS, 2022, 237
  • [29] A GLOBAL-LOCAL CONTRASTIVE LEARNING FRAMEWORK FOR VIDEO CAPTIONING
    Huang, Qunyue
    Fang, Bin
    Ai, Xi
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023 : 2410 - 2414
  • [30] GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition
    Huang, Shih-Cheng
    Shen, Liyue
    Lungren, Matthew P.
    Yeung, Serena
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 3922 - 3931