On Training a Neural Residual Acoustic Echo Suppressor for Improved ASR

Cited by: 0

Authors:

Panchapagesan, Sankaran [1]

Shabestary, Turaj Zakizadeh [1]

Narayanan, Arun [1]

Affiliations:

[1] Google LLC, Mountain View, CA 94043 USA

Source:

INTERSPEECH 2023 | 2023

Keywords:

Acoustic Echo Cancellation; Waveform Neural AEC; Residual Echo Suppression; TasNet; ASR;

DOI:

10.21437/Interspeech.2023-2209

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Acoustic Echo Cancellation (AEC) is critical for accurate recognition of speech directed at a smart device playing audio. Previous work has shown that neural AEC models can significantly improve Automatic Speech Recognition (ASR) accuracy. In this paper, we train a conformer-based waveform-domain neural model to perform residual acoustic echo suppression (RAES) on the output of a linear AEC. We focus specifically on improving ASR accuracy in realistic mismatched test conditions when training on large-scale simulated data, as needed for production voice-interaction systems. Our key finding is that instead of naively using the best evaluation-time linear AEC configuration during neural RAES model training, using a weaker linear AEC generalizes significantly better, with 17-30% lower word error rate (WER) on a realistic re-recorded test set. Overall, the neural RAES model yields 38-53% WER reduction over the linear AEC alone.
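
The WER figures in the abstract read most naturally as relative reductions against a baseline, although the abstract does not say so explicitly. Under that assumption, a minimal sketch of the arithmetic is given here. The function name `relative_wer_reduction` and the numeric values are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (in percent), chosen only to illustrate the formula.
linear_aec_only_wer = 20.0
with_neural_raes_wer = 10.0
reduction = relative_wer_reduction(linear_aec_only_wer, with_neural_raes_wer)
print(f"Relative WER reduction: {reduction:.1f}%")
```

With these illustrative inputs, halving the WER corresponds to a 50% relative reduction, which would sit inside the 38-53% range the abstract reports.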


Pages: 4019-4023

Page count: 5
