REINFORCEMENT LEARNING OF SPEECH RECOGNITION SYSTEM BASED ON POLICY GRADIENT AND HYPOTHESIS SELECTION

Cited by: 0
|
Authors
Kato, Taku [1 ]
Shinozaki, Takahiro [1 ]
Affiliations
[1] Tokyo Inst Technol, Sch Engn, Yokohama, Kanagawa, Japan
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
reinforcement learning; policy gradient method; hypothesis selection; deep neural network; speech recognition; CONFIDENCE-MEASURE; ADAPTATION;
DOI
Not available
CLC Classification
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
Automatic speech recognition (ASR) systems have achieved high recognition performance on several tasks. However, this performance depends on tremendously costly development work: preparing vast amounts of task-matched transcribed speech data for supervised training. The key problem is the cost of transcribing speech data, a cost that recurs for every new language and new task. Assuming broad network services that transcribe speech for many users, a system would become more self-sufficient and more useful if it could learn from very light user feedback without annoying the users. In this paper, we propose a general reinforcement learning framework for ASR systems based on the policy gradient method. As a particular instance of the framework, we also propose a hypothesis selection-based reinforcement learning method. The proposed framework provides a new view of several existing training and adaptation methods. The experimental results show that the proposed method improves recognition performance compared to unsupervised adaptation.
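The policy-gradient idea sketched in the abstract can be illustrated with a minimal toy example. This is not the authors' implementation: the linear-softmax policy over an N-best list, the per-hypothesis feature vectors, and the binary "user accepts" reward are all illustrative assumptions. The sketch only shows the generic REINFORCE-style update that a hypothesis selection-based learner of this kind would perform.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over a score vector."""
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_step(w, features, reward, lr=0.1):
    """One policy-gradient (REINFORCE) update for a linear-softmax
    policy over the hypotheses of a single N-best list.

    w        : (n_feat,) policy parameters
    features : (n_hyp, n_feat) array, one row per hypothesis
    reward   : callable mapping the sampled index to scalar feedback
    Returns (updated weights, sampled hypothesis index).
    """
    probs = softmax(features @ w)
    a = rng.choice(len(probs), p=probs)   # sample a hypothesis
    r = reward(a)                         # light user feedback
    # Gradient of log pi(a | features) for a linear-softmax policy.
    grad_log_pi = features[a] - probs @ features
    return w + lr * r * grad_log_pi, a

# Toy run: hypothesis 0 plays the role of the correct transcription;
# feedback is +1 when the system selects it, 0 otherwise.
feats = np.array([[1.0, 0.0, 0.2],
                  [0.0, 1.0, 0.2],
                  [0.0, 0.0, 1.0]])
w = np.zeros(3)
for _ in range(200):
    w, _ = reinforce_step(w, feats, lambda a: 1.0 if a == 0 else 0.0)

probs = softmax(feats @ w)
# The policy mass should now concentrate on hypothesis 0.
```

Because the reward is zero for rejected hypotheses, only accepted selections move the parameters, which mirrors how sparse, lightweight user feedback could still drive learning in the framework described above.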
Pages: 5759-5763
Page count: 5
Related Papers
50 records
  • [1] REINFORCEMENT LEARNING BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION
    Shen, Yih-Liang
    Huang, Chao-Yuan
    Wang, Syu-Siang
    Tsao, Yu
    Wang, Hsin-Min
    Chi, Tai-Shih
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6750 - 6754
  • [2] Training a robust reinforcement learning controller for the uncertain system based on policy gradient method
    Li, Zhan
    Xue, Shengri
    Lin, Weiyang
    Tong, Mingsi
    NEUROCOMPUTING, 2018, 316 : 313 - 321
  • [3] Causal Based Action Selection Policy for Reinforcement Learning
    Feliciano-Avelino, Ivan
    Mendez-Molina, Arquimides
    Morales, Eduardo F.
    Enrique Sucar, L.
    ADVANCES IN COMPUTATIONAL INTELLIGENCE (MICAI 2021), PT I, 2021, 13067 : 213 - 227
  • [4] Policy gradient fuzzy reinforcement learning
    Wang, XN
    Xu, X
    He, HG
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 992 - 995
  • [5] KERNEL-BASED LIFELONG POLICY GRADIENT REINFORCEMENT LEARNING
    Mowakeaa, Rami
    Kim, Seung-Jun
    Emge, Darren K.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3500 - 3504
  • [6] Maximum Entropy-Based Reinforcement Learning Using a Confidence Measure in Speech Recognition for Telephone Speech
    Molina, Carlos
    Becerra Yoma, Nestor
    Huenupan, Fernando
    Garreton, Claudio
    Wuth, Jorge
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (05): : 1041 - 1052
  • [7] A policy gradient reinforcement learning algorithm with fuzzy function approximation
    Gu, DB
    Yang, EF
    IEEE ROBIO 2004: Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2004, : 936 - 940
  • [8] A closer look at reinforcement learning-based automatic speech recognition
    Yang, Fan
    Yang, Muqiao
    Li, Xiang
    Wu, Yuxuan
    Zhao, Zhiyuan
    Raj, Bhiksha
    Singh, Rita
    COMPUTER SPEECH AND LANGUAGE, 2024, 87
  • [9] Traffic Light Control with Policy Gradient-Based Reinforcement Learning
    Tas, Mehmet Bilge Han
    Ozkan, Kemal
    Saricicek, Inci
    Yazici, Ahmet
    32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
  • [10] A Collaborative Multiagent Reinforcement Learning Method Based on Policy Gradient Potential
    Zhang, Zhen
    Ong, Yew-Soon
    Wang, Dongqing
    Xue, Binqiang
    IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (02) : 1015 - 1027