Target Speaker Extraction with Attention Enhancement and Gated Fusion Mechanism

Cited by: 1

Authors:

Wang, Sijie [1,2]

Hamdulla, Askar [1,2]

Ablimit, Mijit [1,2]

Affiliations:

[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi, Peoples R China

[2] Key Lab Signal Detect & Proc, Urumqi, Peoples R China

Source:

2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2023

Keywords:

target speaker extraction; attention; gated fusion; multi-task learning; NETWORK;

DOI:

10.1109/APSIPAASC58517.2023.10317106

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A target speaker extraction system aims to extract the target speaker's speech from a mixture of multiple speakers and noise, using a limited amount of auxiliary information about the target speaker. In this paper, we investigate improvements to a baseline system obtained by incorporating the lightweight CBAM module into the target extractor and a gated fusion module (GFM) into the fusion layer. CBAM adds attention enhancement to the baseline model without significantly increasing the number of parameters or the complexity. The GFM replaces the earlier concatenation-based method for fusing the speaker embedding with the input mixture (or an intermediate output), enabling the model to make better use of the supplementary information carried by the speaker embedding. Experimental results on datasets built from WSJ0-2mix and WHAM! demonstrate that the CBAM module and the lightweight GFM each improve model performance on their own, with the GFM showing the greater improvement on WHAM!. However, combining the two modules is mutually beneficial only on the clean WSJ0-2mix dataset; on the noisy WHAM! dataset, the combined model performs worse than the GFM alone.
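
The abstract contrasts concatenation-based fusion with gated fusion but does not give the GFM's equations. The sketch below is therefore only a generic illustration of the two ideas, not the authors' module: the function names, the dimensions, the random weights, and the specific gate form `g * frame + (1 - g) * cue` are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def concat_fusion(frame, speaker_emb):
    """Concatenation-style fusion: append the embedding to the feature frame."""
    return frame + speaker_emb


def gated_fusion(frame, speaker_emb, proj_w, proj_b, gate_w, gate_b):
    """Hypothetical gated fusion for a single frame (assumed form).

    1. Project the speaker embedding to the feature dimension (the "cue").
    2. Compute a per-feature gate in (0, 1) from the frame and the cue.
    3. Blend the frame and the cue with that gate.
    """
    cue = linear(proj_w, proj_b, speaker_emb)
    gate = [sigmoid(z) for z in linear(gate_w, gate_b, frame + cue)]
    return [g * f + (1.0 - g) * c for g, f, c in zip(gate, frame, cue)]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feat_dim, emb_dim = 4, 3            # illustrative sizes only
frame = [0.2, -1.0, 0.5, 0.8]       # one frame of mixture features
speaker_emb = [1.0, 0.0, -1.0]      # toy speaker embedding

proj_w = random_matrix(feat_dim, emb_dim, rng)
proj_b = [0.0] * feat_dim
gate_w = random_matrix(feat_dim, 2 * feat_dim, rng)
gate_b = [0.0] * feat_dim

concat_out = concat_fusion(frame, speaker_emb)
gated_out = gated_fusion(frame, speaker_emb, proj_w, proj_b, gate_w, gate_b)

# Concatenation grows the feature size; gating keeps it unchanged.
print(len(concat_out), len(gated_out))  # prints: 7 4
```

In this sketch the gate lets each feature decide how much of the speaker cue to admit, whereas concatenation passes the embedding through unchanged and leaves the weighting to later layers.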

Pages: 1995-2001

Page count: 7

