Dynamic Convolution With Global-Local Information for Session-Invariant Speaker Representation Learning

Cited by: 6
Authors
Gu, Bin [1]
Guo, Wu [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230036, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Convolution; Kernel; NIST; Training; Training data; Time-frequency analysis; Neural networks; Speaker verification; dynamic convolution; mismatch problem; acoustic variability; RECOGNITION;
DOI
10.1109/LSP.2021.3136141
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract
Various mismatched conditions result in performance degradation of speaker verification (SV) systems. To address this issue, we extract robust speaker representations by devising a global-local information-based dynamic convolution neural network. In the proposed method, both global and local information of the input features are exploited to dynamically modify the convolution kernel values. This increases the model capability of capturing speaker characteristics by compensating both the inter- and intra-session variabilities. Extensive experiments on four publicly available SV datasets show significant and consistent improvements over the conventional approaches. The effectiveness of the proposed method is further investigated using ablation studies and visualizations.
Pages: 404-408
Page count: 5
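
The abstract says that global and local information of the input features is used to dynamically modify the convolution kernel values. The record does not give the paper's actual architecture, so the following is only a minimal sketch of a generic attention-over-kernels dynamic convolution, not the authors' method. All names (`dynamic_conv1d`, `w_global`, `w_local`), the choice of mean pooling for both contexts, and the additive combination of the two gating signals are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dynamic_conv1d(x, kernels, w_global, w_local):
    """Illustrative dynamic 1-D convolution over time (hypothetical design).

    x        : (T, C_in) frame-level features
    kernels  : (K, width, C_in, C_out) bank of K candidate kernels, odd width
    w_global : (C_in, K) projection of the utterance-level mean (assumed)
    w_local  : (C_in, K) projection of the per-window mean (assumed)
    """
    K, width, c_in, c_out = kernels.shape
    T = x.shape[0]
    pad = width // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))  # zero-pad the time axis

    # Global context: one gating vector for the whole utterance.
    global_logits = x.mean(axis=0) @ w_global  # (K,)

    out = np.zeros((T, c_out))
    for t in range(T):
        window = xp[t:t + width]  # (width, C_in)
        # Local context: a gating vector for this window only.
        local_logits = window.mean(axis=0) @ w_local  # (K,)
        attn = softmax(global_logits + local_logits)  # (K,), sums to 1
        # Input-dependent kernel: attention-weighted mix of the kernel bank.
        kernel_t = np.tensordot(attn, kernels, axes=1)  # (width, C_in, C_out)
        out[t] = np.tensordot(window, kernel_t, axes=([0, 1], [0, 1]))
    return out


rng = np.random.default_rng(0)
x = rng.normal(size=(20, 4))            # 20 frames, 4 input channels
kernels = rng.normal(size=(3, 5, 4, 6))  # 3 kernels, width 5, 6 output channels
y = dynamic_conv1d(x, kernels, rng.normal(size=(4, 3)), rng.normal(size=(4, 3)))
print(y.shape)
```

When both projection matrices are zero, the attention weights are uniform and the layer reduces to an ordinary convolution with the average of the kernel bank, which is one way to see that the dynamic part only comes from the input-dependent gating.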