Dynamic Convolution With Global-Local Information for Session-Invariant Speaker Representation Learning

Cited by: 6

Authors:

Gu, Bin [1]

Guo, Wu [1]

Affiliation:

[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230036, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2022 / Vol. 29

Funding:

National Natural Science Foundation of China;

Keywords:

Convolution; Kernel; NIST; Training; Training data; Time-frequency analysis; Neural networks; Speaker verification; dynamic convolution; mismatch problem; acoustic variability; RECOGNITION;

DOI:

10.1109/LSP.2021.3136141

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Various mismatched conditions degrade the performance of speaker verification (SV) systems. To address this issue, we extract robust speaker representations by devising a global-local information-based dynamic convolution neural network. In the proposed method, both global and local information of the input features are exploited to dynamically modify the convolution kernel values. This increases the model's capability to capture speaker characteristics by compensating for both inter- and intra-session variabilities. Extensive experiments on four publicly available SV datasets show significant and consistent improvements over the conventional approaches. The effectiveness of the proposed method is further investigated using ablation studies and visualizations.
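
The idea of letting global and local input information modify the convolution kernel can be illustrated with a small NumPy sketch. This is not the authors' architecture; the abstract does not give the layer details. It assumes a common form of dynamic convolution in which K candidate kernels are mixed by softmax attention weights, and it assumes that the "global" cue is the time-averaged feature vector and the "local" cue is the current frame. The names `dynamic_conv1d`, `proj_global`, and `proj_local` are hypothetical.

```python
import numpy as np

def softmax(z, axis=0):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def conv1d_same(x, w):
    """Cross-correlate x (C_in, T) with w (C_out, C_in, ksize), odd ksize,
    using zero padding so the output has shape (C_out, T)."""
    ksize = w.shape[2]
    pad = ksize // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    t_len = x.shape[1]
    # windows[i, t, k] = padded input of channel i at time t + k
    windows = np.stack([xp[:, k:k + t_len] for k in range(ksize)], axis=-1)
    return np.einsum("oik,itk->ot", w, windows)

def dynamic_conv1d(x, kernels, proj_global, proj_local):
    """Illustrative dynamic convolution (an assumption, not the paper's layer).

    x:           (C_in, T) input features
    kernels:     (K, C_out, C_in, ksize) candidate kernels
    proj_global: (K, C_in) maps the utterance-level mean to K logits
    proj_local:  (K, C_in) maps each frame to K logits
    """
    g = x.mean(axis=1)                                    # global cue, (C_in,)
    logits = (proj_global @ g)[:, None] + proj_local @ x  # (K, T)
    attn = softmax(logits, axis=0)                        # per-frame kernel weights
    outs = np.stack([conv1d_same(x, w) for w in kernels]) # (K, C_out, T)
    # Convolution is linear in the kernel, so mixing the K outputs per frame
    # equals convolving with a per-frame mixture of the K kernels.
    return np.einsum("kt,kot->ot", attn, outs), attn

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 10))
kernels = rng.standard_normal((3, 5, 4, 3))
y, attn = dynamic_conv1d(x, kernels,
                         rng.standard_normal((3, 4)),
                         rng.standard_normal((3, 4)))
print(y.shape, attn.shape)
```

With both projection matrices set to zero the attention is uniform, and the layer reduces to an ordinary convolution with the average of the K kernels, which is a convenient sanity check for the sketch.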

Pages: 404-408

Number of pages: 5
