SUBWORD REGULARIZATION AND BEAM SEARCH DECODING FOR END-TO-END AUTOMATIC SPEECH RECOGNITION

Cited by: 0
Authors
Drexler, Jennifer [1]
Glass, James [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019
Keywords
automatic speech recognition; subword units; beam search; CTC; attention
DOI
10.1109/icassp.2019.8683531
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we experiment with the recently introduced subword regularization technique [1] in the context of end-to-end automatic speech recognition (ASR). We present results from both attention-based and CTC-based ASR systems on two common benchmark datasets, the 80-hour Wall Street Journal corpus and the 1,000-hour Librispeech corpus. We also introduce a novel subword beam search decoding algorithm that significantly improves the final performance of the CTC-based systems. Overall, we find that subword regularization improves the performance of both types of ASR systems, with the regularized attention-based model performing best overall.
Pages: 6266-6270
Page count: 5