Dynamic Convolution With Global-Local Information for Session-Invariant Speaker Representation Learning

Cited by: 6

Authors:

Gu, Bin [1]

Guo, Wu [1]

Affiliations:

[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230036, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2022 / Vol. 29

Funding:

National Natural Science Foundation of China;

Keywords:

Convolution; Kernel; NIST; Training; Training data; Time-frequency analysis; Neural networks; Speaker verification; dynamic convolution; mismatch problem; acoustic variability; RECOGNITION;

DOI:

10.1109/LSP.2021.3136141

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Various mismatched conditions result in performance degradation of speaker verification (SV) systems. To address this issue, we extract robust speaker representations by devising a global-local information-based dynamic convolution neural network. In the proposed method, both global and local information of the input features are exploited to dynamically modify the convolution kernel values. This increases the model's capability to capture speaker characteristics by compensating for both inter- and intra-session variabilities. Extensive experiments on four publicly available SV datasets show significant and consistent improvements over conventional approaches. The effectiveness of the proposed method is further investigated using ablation studies and visualizations.


Pages: 404-408

Page count: 5
