REINFORCEMENT LEARNING OF SPEECH RECOGNITION SYSTEM BASED ON POLICY GRADIENT AND HYPOTHESIS SELECTION

Citations: 0
Authors
Kato, Taku [1]
Shinozaki, Takahiro [1]
Affiliations
[1] Tokyo Inst Technol, Sch Engn, Yokohama, Kanagawa, Japan
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
reinforcement learning; policy gradient method; hypothesis selection; deep neural network; speech recognition; CONFIDENCE-MEASURE; ADAPTATION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Automatic speech recognition (ASR) systems have achieved high recognition performance for several tasks. However, the performance of such systems is dependent on the tremendously costly development work of preparing vast amounts of task-matched transcribed speech data for supervised training. The key problem here is the cost of transcribing speech data. The cost is repeatedly required to support new languages and new tasks. Assuming broad network services for transcribing speech data for many users, a system would become more self-sufficient and more useful if it possessed the ability to learn from very light feedback from the users without annoying them. In this paper, we propose a general reinforcement learning framework for ASR systems based on the policy gradient method. As a particular instance of the framework, we also propose a hypothesis selection-based reinforcement learning method. The proposed framework provides a new view for several existing training and adaptation methods. The experimental results show that the proposed method improves the recognition performance compared to unsupervised adaptation.
Pages: 5759 - 5763
Number of pages: 5
Related papers
50 in total
  • [41] Reinforcement Learning With Adaptive Policy Gradient Transfer Across Heterogeneous Problems
    Zhang, Gengzhi
    Feng, Liang
    Wang, Yu
    Li, Min
    Xie, Hong
    Tan, Kay Chen
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (03) : 2213 - 2227
  • [42] English speech recognition based on deep learning with multiple features
    Song, Zhaojuan
    COMPUTING, 2020, 102 (03) : 663 - 682
  • [43] English speech recognition based on deep learning with multiple features
    Zhaojuan Song
    Computing, 2020, 102 : 663 - 682
  • [44] Robust Control of An Inverted Pendulum System Based on Policy Iteration in Reinforcement Learning
    Ma, Yan
    Xu, Dengguo
    Huang, Jiashun
    Li, Yahui
    APPLIED SCIENCES-BASEL, 2023, 13 (24):
  • [45] Extended Maximum Actor-Critic Framework Based on Policy Gradient Reinforcement for System Optimization
    Kim, Jung-Hyun
    Choi, Yong-Hoon
    Choi, You-Rak
    Jeong, Jae-Hyeok
    Kim, Min-Suk
    APPLIED SCIENCES-BASEL, 2025, 15 (04):
  • [46] Data Center Selection Based on Reinforcement Learning
    Li, Qirui
    Peng, Zhiping
    Cui, Denglong
    He, Jieguang
    Chen, Ke
    Zhou, Jing
    PROCEEDINGS OF 2019 4TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTERNET OF THINGS (CCIOT 2019), 2019, : 14 - 19
  • [47] Reinforcement Learning based Gateway Selection in VANETs
    Alabbas, Hasanain
    Huszak, Arpad
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2022, 13 (03) : 195 - 202
  • [48] Overview of Reinforcement Learning Based on Value and Policy
    Liu, Yun-ting
    Yang, Jia-ming
    Chen, Liang
    Guo, Ting
    Jiang, Yu
    PROCEEDINGS OF THE 32ND 2020 CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2020), 2020, : 598 - 603
  • [49] Reinforced knowledge distillation: Multi-class imbalanced classifier based on policy gradient reinforcement learning
    Fan, Saite
    Zhang, Xinmin
    Song, Zhihuan
    NEUROCOMPUTING, 2021, 463 : 422 - 436
  • [50] Deep deterministic policy gradient reinforcement learning based temperature control of a fermentation bioreactor for ethanol production
    Rajasekhar, N.
    Radhakrishnan, T. K.
    Samsudeen, N.
    JOURNAL OF THE INDIAN CHEMICAL SOCIETY, 2025, 102 (02)