Target Speaker Extraction with Attention Enhancement and Gated Fusion Mechanism

Cited by: 1

Authors:

Wang, Sijie [1,2]

Hamdulla, Askar [1,2]

Ablimit, Mijit [1,2]

Affiliations:

[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi, Peoples R China

[2] Key Lab Signal Detect & Proc, Urumqi, Peoples R China

Source:

2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2023

Keywords:

target speaker extraction; attention; gated fusion; multi-task learning; NETWORK;

DOI:

10.1109/APSIPAASC58517.2023.10317106

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The objective of a target speaker extraction system is to extract the speech of a target speaker from a mixture of multiple speakers and noise, using some auxiliary information about that speaker. In this paper, we investigate improvements to the baseline system obtained by incorporating the lightweight CBAM (Convolutional Block Attention Module) in the target extractor and a gated fusion module (GFM) in the fusion layer. CBAM adds attention enhancement to the baseline model without a significant increase in parameter count or complexity. The GFM replaces the concatenation-based fusion previously used to combine the speaker embedding with the input mixture (or an intermediate output), enabling the model to better leverage the supplementary information provided by the speaker embedding. Experimental results on datasets built from WSJ0-2mix and WHAM! demonstrate that the CBAM and the lightweight GFM each improve model performance individually, with the GFM showing the greater improvement on WHAM!. However, combining the two modules is mutually beneficial only on the clean dataset WSJ0-2mix; on the noisy dataset WHAM!, the combined modules perform worse than the GFM alone.
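
The contrast the abstract draws between concatenation-based fusion and gated fusion can be illustrated with a minimal sketch. The abstract does not specify the GFM's equations, so the gate below (a sigmoid over the concatenated inputs that blends each frame with a projected speaker embedding) is an assumed, commonly used form, not the authors' module. All function names, parameters, and dimensions are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def concat_fusion(frame, speaker_emb):
    """Concatenation-based fusion: append the speaker embedding to a frame."""
    return frame + speaker_emb


def gated_fusion(frame, speaker_emb, w_gate, b_gate, w_proj, b_proj):
    """Assumed gate form (not taken from the paper):
        g   = sigmoid(W_g [frame; emb] + b_g)
        out = g * frame + (1 - g) * (W_p emb + b_p)
    The output keeps the frame's dimensionality.
    """
    gate = [sigmoid(z) for z in linear(w_gate, b_gate, frame + speaker_emb)]
    projected = linear(w_proj, b_proj, speaker_emb)
    return [g * f + (1.0 - g) * p for f, g, p in zip(frame, gate, projected)]


if __name__ == "__main__":
    frame = [1.0, 2.0, 3.0, 4.0]       # one mixture feature frame (toy values)
    speaker_emb = [0.5, -0.5, 1.0]     # toy speaker embedding
    w_gate = [[0.1] * 7 for _ in range(4)]
    b_gate = [0.0] * 4
    w_proj = [[0.2] * 3 for _ in range(4)]
    b_proj = [0.0] * 4
    print(len(concat_fusion(frame, speaker_emb)))
    print(gated_fusion(frame, speaker_emb, w_gate, b_gate, w_proj, b_proj))
```

In this sketch, concatenation widens the feature vector and leaves it to later layers to decide how to use the embedding, whereas the gate decides per feature how much of the embedding-derived signal to mix in.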


Pages: 1995-2001

Number of pages: 7

