Knowledge-Guided Adaptive Sequence Reinforcement Learning Model

Cited by: 0
Authors
Li Y. [1 ]
Tong X. [1 ]
Affiliations
[1] School of Computer and Control Engineering, Yantai University, Yantai
Funding
National Natural Science Foundation of China
Keywords
Adaptive Sequence; Deep Reinforcement Learning; Knowledge Graph; Recurrent Neural Network; Self-Attention Mechanism;
DOI
10.16451/j.cnki.issn1003-6059.202302002
Abstract
Sequence recommendation can be formalized as a Markov decision process and thus cast as a deep reinforcement learning problem. A key step is mining critical information from user sequences, such as preference drift and dependencies within sequences. Most current deep reinforcement learning recommender systems take a fixed sequence length as input. Inspired by knowledge graphs, a knowledge-guided adaptive sequence reinforcement learning model is proposed. First, using the entity relations of the knowledge graph, a partial sequence is intercepted from the complete user feedback sequence as a drift sequence: the item set in the drift sequence represents the user's current preference, and the sequence length represents how quickly the user's preference changes. Then, a gated recurrent unit extracts the user's preference changes and the dependencies between items, while a self-attention mechanism selectively focuses on key item information. Finally, a compound reward function combining discounted sequence rewards and knowledge graph rewards is designed to alleviate the sparse-reward problem. Experiments on four real-world datasets demonstrate that the proposed model achieves superior recommendation accuracy. © 2023 Journal of Pattern Recognition and Artificial Intelligence. All rights reserved.
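The pipeline described in the abstract can be illustrated in outline: encode the item embeddings of a drift sequence with a GRU, apply self-attention over the GRU outputs to obtain a user state, and combine discounted sequence rewards with a knowledge-graph reward term. The following is a minimal NumPy sketch under assumed shapes and weights; the dimension `d`, the mixing weight `beta`, and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gru_step(x, h, params):
    """One GRU cell step: captures sequential dependencies between items."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = 1 / (1 + np.exp(-(x @ Wz + h @ Uz)))   # update gate
    r = 1 / (1 + np.exp(-(x @ Wr + h @ Ur)))   # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde

def self_attention(H):
    """Scaled dot-product self-attention over GRU outputs,
    selectively weighting key item information."""
    d = H.shape[-1]
    scores = H @ H.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ H

def compound_reward(seq_rewards, kg_reward, gamma=0.9, beta=0.5):
    """Discounted sequence rewards plus a knowledge-graph reward term,
    mitigating reward sparsity (gamma and beta are assumed values)."""
    discounted = sum(gamma**t + 0.0 if False else gamma**t * r
                     for t, r in enumerate(seq_rewards))
    return discounted + beta * kg_reward

d = 8
params = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]
drift_seq = rng.standard_normal((5, d))  # item embeddings of one drift sequence
h = np.zeros(d)
H = []
for x in drift_seq:                      # GRU pass over the drift sequence
    h = gru_step(x, h, params)
    H.append(h)
state = self_attention(np.stack(H)).mean(axis=0)  # user state representation
r = compound_reward([1.0, 0.0, 1.0], kg_reward=0.5)  # 1.0 + 0.81 + 0.5*0.5
```

Here the drift sequence is variable-length by construction, so the same code handles sequences of any number of items, matching the adaptive-length idea in the abstract.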
Pages: 108-119 (11 pages)