GRADIENT-BASED ACTIVE LEARNING QUERY STRATEGY FOR END-TO-END SPEECH RECOGNITION

Cited by: 0

Authors:

Yuan, Yang [1,2]

Chung, Soo-Whan [1]

Kang, Hong-Goo [1]

Affiliations:

[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea

[2] Naver Corp, Seongnam, South Korea

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019

Keywords:

Active learning; deep learning; combined query strategy; automatic speech recognition

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

In this paper, we propose an effective active learning query strategy for an automatic speech recognition system with the aim of reducing the training cost. Generally, training a deep neural network with supervised learning requires a massive amount of labeled data to obtain excellent performance. However, labeling data is tedious and costly manual work. Active learning addresses this problem by selecting and annotating only informative instances, which yields better results even with less transcribed data. In this approach, it is vitally important to select informative samples accurately. Based on preliminary experimental results showing that the true gradient length performs best at measuring sample informativeness in ideal conditions, we propose using both the uncertainty and the expected gradient length criteria to approximate the true gradient length with a neural network. The experimental results show that our proposed method is superior to each conventional individual criterion when applied to a phoneme-based speech recognition system, and that it has both faster convergence and the greatest loss reduction in both clean and noisy conditions.


Pages: 2832-2836

Page count: 5

