A Q-learning algorithm applied to the behavioural decision-making of affective virtual human

Cited by: 0
Authors
Zhang, Yiwei [1]
Chen, Tianhuang [1]
Affiliations
[1] Wuhan Univ Technol, Comp Sci & Technol, 1186 Heping Blvd, Wuhan, Hubei, Peoples R China
Source
2017 19TH INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATIONS TECHNOLOGY (ICACT) - OPENING NEW ERA OF SMART SOCIETY | 2017
Keywords
Reinforcement learning; Q-learning algorithm; affective virtual human; behavioural decision-making; environmental reward model; affective model;
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
The traditional Q-learning algorithm suffers from lag in data transmission, and its environmental reward model is too simple, so it cannot be applied well to reinforcement learning of behavioural decisions for an affective virtual human. Drawing an analogy with human self-reflection, this paper proposes an improved Q-learning algorithm that can be readily applied to the behavioural decision-making of an affective virtual human. Through a self-reflection reward, the algorithm strengthens behaviour strategies from better learning cycles and weakens those from worse learning cycles. It also speeds up the effect of behavioural decision feedback on state-action pairs within a learning cycle, thereby improving the convergence rate of Q-learning in the affective virtual human's behavioural decision-making. In the simulation test, the algorithm helps an affective virtual human carry out path optimization in a two-dimensional grid environment. The results show that the improved Q-learning algorithm is significantly faster than the traditional Q-learning algorithm in achieving the optimal control strategy, with an average of 43.7 learning cycles, which verifies the validity of the algorithm.
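For orientation, the sketch below shows the traditional tabular Q-learning baseline that the abstract compares against, applied to path finding in a two-dimensional grid. It is not the improved algorithm from the paper: the abstract does not specify the self-reflection reward, so it is not implemented here. The grid size, start and goal cells, reward values, hyperparameters, and the names `step`, `train` and `greedy_path` are all assumptions chosen for illustration.

```python
import random

# Hypothetical 5x5 grid (not from the paper): start top-left, goal bottom-right.
SIZE = 5
START, GOAL = (0, 0), (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply a move; walking into a wall leaves the state unchanged."""
    row = min(max(state[0] + action[0], 0), SIZE - 1)
    col = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (row, col)
    # Assumed reward model: +10 on reaching the goal, -1 for every other move.
    reward = 10.0 if nxt == GOAL else -1.0
    return nxt, reward, nxt == GOAL


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Standard one-step Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((r, c), a): 0.0
         for r in range(SIZE) for c in range(SIZE) for a in range(n)}
    for _ in range(episodes):  # one episode = one learning cycle
        state = START
        for _ in range(200):  # step cap so an episode always terminates
            if rng.random() < epsilon:
                a = rng.randrange(n)  # explore
            else:
                a = max(range(n), key=lambda i: q[(state, i)])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(n))
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START and return the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        if state == GOAL:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path
```

Calling `greedy_path(train())` returns the cells visited under the learned greedy policy. In this baseline each update propagates reward back by only one state-action pair per visit, which is one plausible reading of the slow feedback that the improved algorithm is said to address.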
Pages: 403-407
Page count: 5