Factual Consistency Oriented Speech Recognition

Cited by: 1
Authors
Kanda, Naoyuki [1]
Yoshioka, Takuya [1]
Liu, Yang [1]
Affiliation
[1] Microsoft, Redmond, WA 98052 USA
Source
INTERSPEECH 2023 | 2023
Keywords
speech recognition; speech summarization; hallucination errors; ASR
DOI
10.21437/Interspeech.2023-485
Chinese Library Classification
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper presents a novel optimization framework for automatic speech recognition (ASR) that aims to reduce hallucinations produced by an ASR model. The proposed framework optimizes the ASR model to maximize an expected factual consistency score between ASR hypotheses and ground-truth transcriptions, where the factual consistency score is computed by a separately trained estimator. Experimental results on the AMI meeting corpus and the VoxPopuli corpus show that an ASR model trained with the proposed framework generates hypotheses with significantly higher consistency scores against ground-truth transcriptions while keeping word error rates close to those of cross-entropy-trained ASR models. Furthermore, training ASR models with the proposed framework improves speech summarization quality, as measured by the factual consistency of meeting conversation summaries generated by a large language model.
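The objective described in the abstract, an expected factual consistency score over ASR hypotheses, can be illustrated with a small sketch. The abstract does not say how the expectation is computed, so the sketch assumes an N-best approximation in which hypothesis probabilities are renormalized over the list. The function names, the toy word-overlap scorer (a hypothetical stand-in for the separately trained estimator), and the example sentences are all assumptions made for illustration, not the authors' implementation.

```python
import math


def expected_consistency(log_probs, scores):
    """Expected score over an N-best list.

    Assumption: hypothesis probabilities are renormalized over the list
    (a softmax of the log-probabilities), which may differ from the paper.
    """
    m = max(log_probs)  # subtract the max for numerical stability
    weights = [math.exp(lp - m) for lp in log_probs]
    total = sum(weights)
    posteriors = [w / total for w in weights]
    return sum(p * s for p, s in zip(posteriors, scores))


def toy_consistency_score(hypothesis, reference):
    """Hypothetical stand-in for the trained estimator.

    Returns the fraction of distinct reference words found in the hypothesis.
    """
    ref_words = set(reference.split())
    if not ref_words:
        return 1.0
    hyp_words = set(hypothesis.split())
    return len(ref_words & hyp_words) / len(ref_words)


# Illustrative data: (hypothesis text, model log-probability).
reference = "the meeting starts at nine"
nbest = [
    ("the meeting starts at nine", -0.5),
    ("the meeting starts at five", -1.0),
]

scores = [toy_consistency_score(hyp, reference) for hyp, _ in nbest]
objective = expected_consistency([lp for _, lp in nbest], scores)
loss = -objective  # training would minimize the negative expected score
```

In a real system the log-probabilities would come from a differentiable ASR model, so that gradients of the loss shift probability mass toward hypotheses the estimator rates as more consistent with the reference.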
Pages: 236-240
Page count: 5