Incremental Sparse Bayesian Method for Online Dialog Strategy Learning

Cited by: 8
Authors
Lee, Sungjin [1]
Eskenazi, Maxine [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
Incremental learning; reinforcement learning; sparse Bayesian modeling; statistical dialog modeling; value function approximation
DOI
10.1109/JSTSP.2012.2229963
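Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract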
This paper proposes an incremental sparse Bayesian learning method to allow continuous dialog strategy learning from interactions with real users. Since conventional reinforcement learning (RL) methods require a huge number of dialogs to reach convergence, it has been essential to use a simulated user in training dialog policies. The disadvantage of this approach is that the trained dialog policies always lag behind the optimal one for live users. To tackle this problem, a few studies applying online RL methods to dialog management have emerged and shown very promising results. However, these methods learn only the weight parameters of the basis functions online, and so need batch learning on a fixed data set or some heuristics to find appropriate values for other meta parameters such as sparsity-controlling thresholds, basis function parameters, and noise parameters. The proposed method attempts to overcome this limitation and achieve fully incremental and fast dialog strategy learning by adopting a sparse Bayesian learning method for value function approximation. To verify the proposed method, three different experimental conditions were used: artificial data, a simulated user, and real users. The experiment on the artificial data showed that the proposed method successfully learns all the parameters in an incremental manner. The experiment on training and evaluating dialog policies with a simulated user clearly demonstrated that the proposed method is much faster than conventional RL methods. A live user study showed that the dialog strategy learned from real users performed as well as the best past systems, although it slightly underperformed the one trained on simulated dialogs due to the difficulty of user feedback elicitation.
Pages: 903-916
Page count: 14