IMPROVING CONFIDENCE ESTIMATION ON OUT-OF-DOMAIN DATA FOR END-TO-END SPEECH RECOGNITION

Cited by: 6
Authors
Li, Qiujia [1]
Zhang, Yu [2]
Qiu, David [2]
He, Yanzhang [2]
Cao, Liangliang [2]
Woodland, Philip C. [1]
Affiliations
[1] Univ Cambridge, Cambridge, England
[2] Google LLC, Mountain View, CA, USA
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022
Keywords
confidence scores; end-to-end; automatic speech recognition; out-of-domain;
DOI
10.1109/ICASSP43922.2022.9746979
CLC Classification Number
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
As end-to-end automatic speech recognition (ASR) models reach promising performance, various downstream tasks rely on good confidence estimators for these systems. Recent research has shown that model-based confidence estimators have a significant advantage over using the output softmax probabilities. If the input data to the speech recogniser is from mismatched acoustic and linguistic conditions, the ASR performance and the corresponding confidence estimators may exhibit severe degradation. Since confidence models are often trained on the same in-domain data as the ASR, generalising to out-of-domain (OOD) scenarios is challenging. By keeping the ASR model untouched, this paper proposes two approaches to improve the model-based confidence estimators on OOD data: using pseudo transcriptions and an additional OOD language model. With an ASR model trained on LibriSpeech, experiments show that the proposed methods can greatly improve the confidence metrics on TED-LIUM and Switchboard datasets while preserving in-domain performance. Furthermore, the improved confidence estimators are better calibrated on OOD data and can provide a much more reliable criterion for data selection.
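The abstract contrasts raw softmax probabilities with model-based confidence estimators. As a loose, self-contained sketch of that distinction (not the paper's architecture; the feature choice and the logistic form here are illustrative assumptions), a model-based estimator can combine several per-token signals into a learned score rather than taking the top softmax probability at face value:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_confidence(logits):
    """Baseline: confidence = maximum softmax probability of the token."""
    return max(softmax(logits))

def model_based_confidence(features, weights, bias):
    """Toy model-based estimator: a logistic regression over per-token
    features (e.g. top softmax probability, output entropy).  In the
    paper's setting the estimator's parameters would be trained against
    (pseudo-)transcriptions; here the weights are just given directly."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

For example, a token with logits `[2.0, 1.0, 0.1]` gets a fixed softmax confidence, whereas the model-based score also depends on the learned weights, which is what lets it be adapted to out-of-domain data without touching the ASR model itself.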
Pages: 6537 - 6541
Page count: 5
Related Papers
50 records in total
  • [41] End-To-End deep neural models for Automatic Speech Recognition for Polish Language
    Pondel-Sycz, Karolina
    Pietrzak, Agnieszka Paula
    Szymla, Julia
    INTERNATIONAL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2024, 70 (02) : 315 - 321
  • [42] A Lightweight End-to-End Speech Recognition System on Embedded Devices
    Wang, Yu
    Nishizaki, Hiromitsu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (07) : 1230 - 1239
  • [43] End-to-End Mandarin Speech Recognition Combining CNN and BLSTM
    Wang, Dong
    Wang, Xiaodong
    Lv, Shaohe
    SYMMETRY-BASEL, 2019, 11 (05):
  • [44] Tunisian Dialectal End-to-end Speech Recognition based on DeepSpeech
    Messaoudi, Abir
    Haddad, Hatem
    Fourati, Chayma
    Hmida, Moez BenHaj
    Mabrouk, Aymen Ben Elhaj
    Graiet, Mohamed
    AI IN COMPUTATIONAL LINGUISTICS, 2021, 189 : 183 - 190
  • [45] CYCLE-CONSISTENCY TRAINING FOR END-TO-END SPEECH RECOGNITION
    Hori, Takaaki
    Astudillo, Ramon
    Hayashi, Tomoki
    Zhang, Yu
    Watanabe, Shinji
    Le Roux, Jonathan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6271 - 6275
  • [46] VERY DEEP CONVOLUTIONAL NETWORKS FOR END-TO-END SPEECH RECOGNITION
    Zhang, Yu
    Chan, William
    Jaitly, Navdeep
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4845 - 4849
  • [47] Improved training of end-to-end attention models for speech recognition
    Zeyer, Albert
    Irie, Kazuki
    Schlueter, Ralf
    Ney, Hermann
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 7 - 11
  • [48] Multi-Head Decoder for End-to-End Speech Recognition
    Hayashi, Tomoki
    Watanabe, Shinji
    Toda, Tomoki
    Takeda, Kazuya
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 801 - 805
  • [49] End-to-End Large Vocabulary Speech Recognition for the Serbian Language
    Popovic, Branislav
    Pakoci, Edvin
    Pekar, Darko
    SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 343 - 352
  • [50] END-TO-END MULTI-TALKER OVERLAPPING SPEECH RECOGNITION
    Tripathi, Anshuman
    Lu, Han
    Sak, Hasim
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6129 - 6133