REINFORCEMENT LEARNING OF SPEECH RECOGNITION SYSTEM BASED ON POLICY GRADIENT AND HYPOTHESIS SELECTION

Authors
Kato, Taku [1 ]
Shinozaki, Takahiro [1 ]
Affiliations
[1] Tokyo Inst Technol, Sch Engn, Yokohama, Kanagawa, Japan
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
reinforcement learning; policy gradient method; hypothesis selection; deep neural network; speech recognition; CONFIDENCE-MEASURE; ADAPTATION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Automatic speech recognition (ASR) systems have achieved high recognition performance for several tasks. However, the performance of such systems is dependent on the tremendously costly development work of preparing vast amounts of task-matched transcribed speech data for supervised training. The key problem here is the cost of transcribing speech data. The cost is repeatedly required to support new languages and new tasks. Assuming broad network services for transcribing speech data for many users, a system would become more self-sufficient and more useful if it possessed the ability to learn from very light feedback from the users without annoying them. In this paper, we propose a general reinforcement learning framework for ASR systems based on the policy gradient method. As a particular instance of the framework, we also propose a hypothesis selection-based reinforcement learning method. The proposed framework provides a new view for several existing training and adaptation methods. The experimental results show that the proposed method improves the recognition performance compared to unsupervised adaptation.
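
The abstract does not spell out the update rule, so the following is only a generic illustration of how a policy gradient (REINFORCE-style) update could act on hypothesis selection, not the authors' method. It assumes a softmax policy over a fixed list of candidate hypotheses and a scalar reward standing in for light user feedback; the names `softmax`, `reinforce_step`, and `reward_fn` and the learning rate are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw hypothesis scores into selection probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(scores, reward_fn, lr=0.5, rng=random):
    """One REINFORCE update on the scores of a softmax selection policy.

    scores    : one score per candidate hypothesis (assumed parameterization)
    reward_fn : maps the chosen index to a scalar reward (assumed stand-in
                for user feedback)
    """
    probs = softmax(scores)
    # Sample a hypothesis index according to the current policy.
    chosen = rng.choices(range(len(scores)), weights=probs, k=1)[0]
    reward = reward_fn(chosen)
    # d/d score_i of log pi(chosen) is 1[i == chosen] - probs[i].
    new_scores = [
        s + lr * reward * ((1.0 if i == chosen else 0.0) - p)
        for i, (s, p) in enumerate(zip(scores, probs))
    ]
    return new_scores, chosen, reward


def train(scores, reward_fn, steps=200, lr=0.5, seed=0):
    """Repeat the update for a fixed number of steps with a seeded sampler."""
    rng = random.Random(seed)
    for _ in range(steps):
        scores, _, _ = reinforce_step(scores, reward_fn, lr=lr, rng=rng)
    return scores


# Toy usage: three candidate hypotheses; feedback rewards only index 2.
final_scores = train([0.0, 0.0, 0.0], lambda i: 1.0 if i == 2 else 0.0)
final_probs = softmax(final_scores)
```

In this toy setup the probability of the rewarded hypothesis grows over the updates, while selections that earn zero reward leave the scores unchanged. A real system would update acoustic or language model parameters rather than a per-hypothesis score table.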
Pages: 5759 - 5763
Number of pages: 5