On Training a Neural Residual Acoustic Echo Suppressor for Improved ASR

Cited by: 0
Authors
Panchapagesan, Sankaran [1]
Shabestary, Turaj Zakizadeh [1]
Narayanan, Arun [1]
Affiliations
[1] Google LLC, Mountain View, CA 94043 USA
Source
INTERSPEECH 2023 | 2023
Keywords
Acoustic Echo Cancellation; Waveform Neural AEC; Residual Echo Suppression; TasNet; ASR;
DOI
10.21437/Interspeech.2023-2209
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Acoustic Echo Cancellation (AEC) is critical for accurate recognition of speech directed at a smart device playing audio. Previous work has shown that neural AEC models can significantly improve Automatic Speech Recognition (ASR) accuracy. In this paper, we train a conformer-based waveform-domain neural model to perform residual acoustic echo suppression (RAES) on the output of a linear AEC. We focus specifically on improving ASR accuracy in realistic mismatched test conditions, when training on large-scale simulated training data, as needed for production voice-interaction systems. Our key finding is that instead of naively using the best evaluation-time linear AEC configuration during neural RAES model training, using a weaker linear AEC generalizes significantly better, with 17-30% lower word error rate (WER) on a realistic re-recorded test set. Overall, the neural RAES model yields 38-53% WER reduction over the linear AEC alone.
Pages: 4019-4023
Number of pages: 5