REINFORCEMENT LEARNING OF SPEECH RECOGNITION SYSTEM BASED ON POLICY GRADIENT AND HYPOTHESIS SELECTION

Authors
Kato, Taku [1]
Shinozaki, Takahiro [1]
Affiliation
[1] Tokyo Inst Technol, Sch Engn, Yokohama, Kanagawa, Japan
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
reinforcement learning; policy gradient method; hypothesis selection; deep neural network; speech recognition; CONFIDENCE-MEASURE; ADAPTATION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Automatic speech recognition (ASR) systems have achieved high recognition performance for several tasks. However, the performance of such systems is dependent on the tremendously costly development work of preparing vast amounts of task-matched transcribed speech data for supervised training. The key problem here is the cost of transcribing speech data. The cost is repeatedly required to support new languages and new tasks. Assuming broad network services for transcribing speech data for many users, a system would become more self-sufficient and more useful if it possessed the ability to learn from very light feedback from the users without annoying them. In this paper, we propose a general reinforcement learning framework for ASR systems based on the policy gradient method. As a particular instance of the framework, we also propose a hypothesis selection-based reinforcement learning method. The proposed framework provides a new view for several existing training and adaptation methods. The experimental results show that the proposed method improves the recognition performance compared to unsupervised adaptation.
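Illustrative note: the abstract names the policy gradient method and hypothesis selection but gives no algorithmic detail, so the following is only a minimal sketch of a generic REINFORCE update, not the authors' method. It assumes a softmax policy over a fixed set of candidate hypotheses and a binary reward standing in for light user feedback. The function names (`softmax`, `reinforce_step`, `train`), the learning rate, and the reward scheme are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw hypothesis scores into selection probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(scores, chosen, reward, lr=0.1, baseline=0.0):
    """Apply one REINFORCE update to the scores of a softmax policy.

    For a softmax policy, d log pi(chosen) / d score_k
    equals 1[k == chosen] - pi(k).
    """
    probs = softmax(scores)
    advantage = reward - baseline
    return [
        s + lr * advantage * ((1.0 if k == chosen else 0.0) - p)
        for k, (s, p) in enumerate(zip(scores, probs))
    ]


def train(num_hyps=3, correct=1, steps=500, seed=0):
    """Toy loop: sample a hypothesis, receive feedback, update the policy.

    The reward is an assumed binary signal (1.0 if the sampled
    hypothesis is the correct one, else 0.0).
    """
    rng = random.Random(seed)
    scores = [0.0] * num_hyps
    for _ in range(steps):
        probs = softmax(scores)
        chosen = rng.choices(range(num_hyps), weights=probs)[0]
        reward = 1.0 if chosen == correct else 0.0
        scores = reinforce_step(scores, chosen, reward)
    return softmax(scores)


if __name__ == "__main__":
    print(train())
```

In this toy setting the policy only changes when the sampled hypothesis earns a reward, so probability mass shifts toward the rewarded hypothesis over time. A real ASR system would update model parameters rather than per-hypothesis scores.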
Pages: 5759-5763
Page count: 5