CONTEXT-AWARE NEURAL CONFIDENCE ESTIMATION FOR RARE WORD SPEECH RECOGNITION

Cited by: 1
Authors
Qiu, David [1 ]
Munkhdalai, Tsendsuren [1 ]
He, Yanzhang [1 ]
Sim, Khe Chai [1 ]
Affiliation
[1] Google LLC, Mountain View, CA USA
Source
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT | 2022
Keywords
Confidence estimation; contextual biasing; end-to-end speech recognition; neural associative memory
DOI
10.1109/SLT54892.2023.10023411
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Confidence estimation for automatic speech recognition (ASR) is important for many downstream tasks. Recently, neural confidence estimation models (CEMs) have been shown to produce accurate confidence scores for predicting word-level errors. These models are built on top of an end-to-end (E2E) ASR model, and the acoustic embeddings are part of the input features. However, practical E2E ASR systems often incorporate contextual information in the decoder to improve rare word recognition. The CEM is not aware of this context and underestimates the confidence of rare words that have been corrected by it. In this paper, we propose a context-aware CEM that incorporates context into the encoder using a neural associative memory (NAM) model. It uses attention to detect the presence of biasing phrases and modify the encoder features. Experiments show that the proposed context-aware CEM using NAM-augmented training can improve the AUC-ROC for word error prediction from 0.837 to 0.892.
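The AUC-ROC figure quoted in the abstract measures how well confidence scores separate correctly recognized words from errors. The following is a minimal sketch of that metric only, not the authors' evaluation code: the function name `auc_roc` and the toy scores and labels are hypothetical and chosen purely for illustration. It computes the area as the fraction of (correct word, incorrect word) pairs in which the correct word receives the higher confidence, counting ties as half.

```python
def auc_roc(confidences, is_correct):
    """AUC-ROC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    confidences: per-word confidence scores, higher means "more likely correct".
    is_correct:  per-word booleans, True if the word was recognized correctly.
    """
    pos = [c for c, ok in zip(confidences, is_correct) if ok]
    neg = [c for c, ok in zip(confidences, is_correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect word")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correct word ranked above the error
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy, made-up word-level scores (assumption: not data from the paper).
confidences = [0.95, 0.80, 0.30, 0.60, 0.20, 0.90]
is_correct = [True, True, False, True, False, False]
score = auc_roc(confidences, is_correct)
print(f"AUC-ROC = {score:.3f}")
```

Under this metric, a correct word that receives an unduly low confidence is outranked by more error words, which lowers the score; that is the kind of underestimation the abstract describes for context-corrected rare words.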
Pages: 31-37
Number of pages: 7