END TO END SPEECH RECOGNITION ERROR PREDICTION WITH SEQUENCE TO SEQUENCE LEARNING

Cited by: 0
Authors
Serai, Prashant [1 ]
Stiff, Adam [1 ]
Fosler-Lussier, Eric [1 ]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Funding
National Science Foundation (USA);
Keywords
Speech Recognition; Error Prediction; Low Resource; Sequence to Sequence Neural Networks; Simulated ASR Errors;
DOI
10.1109/icassp40776.2020.9054398
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Simulating the errors made by a speech recognizer on plain text has proven useful for training downstream NLP tasks to be robust to real ASR errors at test time. Prior work in this domain has focused on modeling confusions at the phonetic level, using a lexicon to convert from words to phones and back, usually accompanied by an FST language model. We present a novel end-to-end model to simulate ASR errors. Our approach trains a convolutional sequence-to-sequence model that takes a word sequence as direct input and predicts a word sequence as output. The end-to-end modeling improves on prior published results for recall of recognition errors made by a Switchboard ASR system on unseen Fisher data; we also demonstrate cross-domain robustness by predicting errors made by an unrelated cloud-based ASR system on a Virtual Patient task.
Pages: 6339-6343
Page count: 5
Related papers
50 in total
  • [41] Trainable Dynamic Subsampling for End-to-End Speech Recognition
    Zhang, Shucong
    Loweimi, Erfan
    Xu, Yumo
    Bell, Peter
    Renals, Steve
    INTERSPEECH 2019, 2019, : 1413 - 1417
  • [42] MULTILINGUAL SPEECH RECOGNITION WITH A SINGLE END-TO-END MODEL
    Toshniwal, Shubham
    Sainath, Tara N.
    Weiss, Ron J.
    Li, Bo
    Moreno, Pedro
    Weinstein, Eugene
    Rao, Kanishka
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4904 - 4908
  • [43] NIESR: Nuisance Invariant End-to-end Speech Recognition
    Hsu, I-Hung
    Jaiswal, Ayush
    Natarajan, Premkumar
    INTERSPEECH 2019, 2019, : 456 - 460
  • [44] COMBINING END-TO-END AND ADVERSARIAL TRAINING FOR LOW-RESOURCE SPEECH RECOGNITION
    Drexler, Jennifer
    Glass, James
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 361 - 368
  • [45] Curriculum Learning based approaches for robust end-to-end far-field speech recognition
    Ranjan, Shivesh
    Hansen, John H. L.
    SPEECH COMMUNICATION, 2021, 132 : 123 - 131
  • [46] A SPELLING CORRECTION MODEL FOR END-TO-END SPEECH RECOGNITION
    Guo, Jinxi
    Sainath, Tara N.
    Weiss, Ron J.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5651 - 5655
  • [47] Domain Expansion for End-to-End Speech Recognition: Applications for Accent/Dialect Speech
    Ghorbani, Shahram
    Hansen, John H. L.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 762 - 774
  • [48] CORRECTION OF AUTOMATIC SPEECH RECOGNITION WITH TRANSFORMER SEQUENCE-TO-SEQUENCE MODEL
    Hrinchuk, Oleksii
    Popova, Mariya
    Ginsburg, Boris
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7074 - 7078
  • [49] Improved training of end-to-end attention models for speech recognition
    Zeyer, Albert
    Irie, Kazuki
    Schlueter, Ralf
    Ney, Hermann
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 7 - 11
  • [50] CYCLE-CONSISTENCY TRAINING FOR END-TO-END SPEECH RECOGNITION
    Hori, Takaaki
    Astudillo, Ramon
    Hayashi, Tomoki
    Zhang, Yu
    Watanabe, Shinji
    Le Roux, Jonathan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6271 - 6275