Factual Consistency Oriented Speech Recognition

Cited by: 1

Authors:

Kanda, Naoyuki [1]

Yoshioka, Takuya [1]

Liu, Yang [1]

Affiliation:

[1] Microsoft, Redmond, WA 98052 USA

Source:

INTERSPEECH 2023 | 2023

Keywords:

speech recognition; speech summarization; hallucination errors; ASR

DOI:

10.21437/Interspeech.2023-485

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper presents a novel optimization framework for automatic speech recognition (ASR) with the aim of reducing hallucinations produced by an ASR model. The proposed framework optimizes the ASR model to maximize an expected factual consistency score between ASR hypotheses and ground-truth transcriptions, where the factual consistency score is computed by a separately trained estimator. Experimental results using the AMI meeting corpus and the VoxPopuli corpus show that the ASR model trained with the proposed framework generates ASR hypotheses that have significantly higher consistency scores with ground-truth transcriptions while maintaining word error rates close to those of cross-entropy-trained ASR models. Furthermore, it is shown that training the ASR models with the proposed framework improves the speech summarization quality as measured by the factual consistency of meeting conversation summaries generated by a large language model.
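The objective described in the abstract, maximizing an expected factual consistency score, can be illustrated with a small sketch. The abstract does not say how the expectation is approximated, so this example assumes it is taken over an N-best list with hypothesis probabilities renormalized over that list, as is common in expected-score training. The function name `expected_consistency_loss` and its inputs are hypothetical, and the scores stand in for the output of the separately trained estimator.

```python
import math


def expected_consistency_loss(log_probs, scores):
    """Return (loss, posteriors) for one utterance.

    log_probs: model log-probabilities of the N-best hypotheses
               (hypothetical input; N-best approximation is an assumption).
    scores:    consistency score of each hypothesis against the reference,
               assumed to come from a separately trained estimator.
    The loss is the negative expected score, so minimizing it
    maximizes the expected consistency score.
    """
    if not log_probs or len(log_probs) != len(scores):
        raise ValueError("log_probs and scores must be non-empty and equal in length")

    # Renormalize over the N-best list with a numerically stable softmax.
    max_lp = max(log_probs)
    weights = [math.exp(lp - max_lp) for lp in log_probs]
    total = sum(weights)
    posteriors = [w / total for w in weights]

    # Expected score under the renormalized hypothesis distribution.
    expected = sum(p * s for p, s in zip(posteriors, scores))
    return -expected, posteriors


# Toy example with three hypotheses (values are made up for illustration).
loss, posteriors = expected_consistency_loss(
    [math.log(0.5), math.log(0.3), math.log(0.2)],
    [0.9, 0.2, 0.6],
)
```

In an actual training setup the log-probabilities would be differentiable model outputs, so the gradient of this loss would shift probability mass toward hypotheses with higher consistency scores.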


Pages: 236-240

Number of pages: 5

