SUBWORD REGULARIZATION AND BEAM SEARCH DECODING FOR END-TO-END AUTOMATIC SPEECH RECOGNITION

Cited by: 0

Authors:

Drexler, Jennifer [1]

Glass, James [1]

Affiliations:

[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019

Keywords:

automatic speech recognition; subword units; beam search; CTC; attention

DOI:

10.1109/icassp.2019.8683531

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this paper, we experiment with the recently introduced subword regularization technique [1] in the context of end-to-end automatic speech recognition (ASR). We present results from both attention-based and CTC-based ASR systems on two common benchmark datasets, the 80-hour Wall Street Journal corpus and the 1,000-hour Librispeech corpus. We also introduce a novel subword beam search decoding algorithm that significantly improves the final performance of the CTC-based systems. We find that subword regularization improves the performance of both types of ASR systems, with the regularized attention-based model performing best overall.

Pages: 6266-6270

Page count: 5
