SPEAKER-AWARE TARGET SPEAKER ENHANCEMENT BY JOINTLY LEARNING WITH SPEAKER EMBEDDING EXTRACTION

Cited by: 0

Authors:

Ji, Xuan [1]

Yu, Meng [2]

Zhang, Chunlei [2]

Su, Dan [1]

Yu, Tao [3]

Liu, Xiaoyu [4]

Yu, Dong [2]

Affiliations:

[1] Tencent AI Lab, Shenzhen, Peoples R China

[2] Tencent AI Lab, Bellevue, WA USA

[3] Tencent IEG, Bellevue, WA USA

[4] Tencent IEG, Shenzhen, Peoples R China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Keywords:

speaker-aware; target speech enhancement; speaker embedding; joint learning;

DOI:

10.1109/icassp40776.2020.9054311

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Deep learning based speech separation approaches have received great interest, among which the recent speaker-aware speech enhancement methods are promising for solving difficulties such as arbitrary source permutation and unknown number of sources. In this paper, we propose a novel training framework which jointly learns the speaker-conditioned target speaker extraction model and its associated speaker embedding model. The resulting unified model directly learns the appropriate speaker embedding for improved target speech enhancement. We demonstrate, on our large simulated noisy and far-field evaluation sets of overlapped speech signals, that our proposed approach significantly improves the speech enhancement performance compared to the baseline speaker-aware speech enhancement models.

Citation

Pages: 7294-7298

Page count: 5

