A reinforcement learning recommender system using bi-clustering and Markov Decision Process

Cited by: 8
Authors
Iftikhar, Arta [1 ]
Ghazanfar, Mustansar Ali [2 ]
Ayub, Mubbashir [1 ]
Alahmari, Saad Ali [3 ]
Qazi, Nadeem [2 ]
Wall, Julie [2 ]
Affiliations
[1] Univ Engn & Technol, Dept Software Engn, Taxila, Pakistan
[2] Univ East London, Dept Comp Sci & Digital Technol, London, England
[3] AL Imam Mohammad Ibn Saud Islamic Univ, Dept Comp Sci, Riyadh, Saudi Arabia
Keywords
Reinforcement learning; Markov Decision Process; Bi-clustering; Q-learning; Policy; ALGORITHM; PERSONALIZATION; ACCURACY; IMPROVE;
DOI
10.1016/j.eswa.2023.121541
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Collaborative filtering (CF) recommender systems are static in nature and do not adapt well to changing user preferences. User preferences may change after interacting with a system or after buying a product. Conventional CF clustering algorithms identify only the global distribution of patterns and hidden correlations. Their inability to discover local patterns led to the popularization of bi-clustering algorithms, which analyze all dataset dimensions simultaneously and consequently discover local patterns that deliver a better understanding of the underlying hidden correlations. In this paper, we model the recommendation problem as a sequential decision-making problem using a Markov Decision Process (MDP). To construct the MDP state representation, we first convert the user-item rating matrix to a binary matrix. We then perform bi-clustering on this binary matrix to determine subsets of similar rows and columns. A bi-cluster merging algorithm is designed to merge similar and overlapping bi-clusters. These bi-clusters are then mapped onto a squared grid (SG). Reinforcement learning (RL) is applied to this SG to determine the best policy for recommending items to users. The start state is determined using the Improved Triangle Similarity (ITR) measure. The reward function is computed as the overlap, in terms of users and items, between the current grid state and a prospective next state. A thorough comparative analysis was conducted against a diverse array of methodologies, including RL-based, pure CF, and clustering methods. The results demonstrate that our proposed method outperforms its competitors in terms of precision, recall, and optimal policy learning.
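The core loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy bi-clusters, the `overlap_reward` helper, and the hyperparameter values are assumptions; the abstract's bi-cluster merging step and ITR-based start-state selection are replaced here by hand-made grid cells and random starts.

```python
import numpy as np

# Each grid cell holds one bi-cluster: (set of user ids, set of item ids).
# In the paper these come from bi-clustering a binarized rating matrix;
# here they are small overlapping toy sets.
grid = [
    ({0, 1}, {10, 11}),
    ({1, 2}, {11, 12}),
    ({2, 3}, {12, 13}),
    ({3, 4}, {13, 14}),
]

def overlap_reward(s, s_next):
    """Reward = user overlap + item overlap between two grid states,
    mirroring the abstract's overlap-based reward (the exact weighting
    in the paper may differ)."""
    users_s, items_s = grid[s]
    users_n, items_n = grid[s_next]
    return len(users_s & users_n) + len(items_s & items_n)

n_states = len(grid)               # action = move to a grid state
Q = np.zeros((n_states, n_states))
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # assumed hyperparameters
rng = np.random.default_rng(0)

# Tabular Q-learning over the squared grid of bi-clusters.
for _ in range(2000):
    s = int(rng.integers(n_states))     # paper: start state via ITR similarity
    for _ in range(10):
        if rng.random() < epsilon:      # epsilon-greedy exploration
            a = int(rng.integers(n_states))
        else:
            a = int(np.argmax(Q[s]))
        r = overlap_reward(s, a)
        Q[s, a] += alpha * (r + gamma * Q[a].max() - Q[s, a])
        s = a

# Greedy policy: for each grid state, the next bi-cluster whose users/items
# overlap it most; its items would be the recommendation candidates.
policy = Q.argmax(axis=1)
```

Because the reward is a symmetric set overlap, the learned policy simply favors the most overlapping neighbor; the paper's merging step would first fuse such heavily overlapping bi-clusters before learning.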
Pages: 18
Related papers
50 records in total
  • [21] A driving profile recommender system for autonomous driving using sensor data and reinforcement learning
    Chronis, Christos
    Sardianos, Christos
    Varlamis, Iraklis
    Michail, Dimitrios
    Tserpes, Konstantinos
    25TH PAN-HELLENIC CONFERENCE ON INFORMATICS WITH INTERNATIONAL PARTICIPATION (PCI2021), 2021, : 33 - 38
  • [22] Bi-clustering Gene Expression Data Using Co-similarity
    Hussain, Syed Fawad
    ADVANCED DATA MINING AND APPLICATIONS, PT I, 2011, 7120 : 190 - 200
  • [23] Reinforcement Learning for Constrained Markov Decision Processes
    Gattami, Ather
    Bai, Qinbo
    Aggarwal, Vaneet
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [24] Reinforcement Learning in Robust Markov Decision Processes
    Lim, Shiau Hong
    Xu, Huan
    Mannor, Shie
    MATHEMATICS OF OPERATIONS RESEARCH, 2016, 41 (04) : 1325 - 1353
  • [25] A Generic Markov Decision Process Model and Reinforcement Learning Method for Scheduling Agile Earth Observation Satellites
    He, Yongming
    Xing, Lining
    Chen, Yingwu
    Pedrycz, Witold
    Wang, Ling
    Wu, Guohua
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2022, 52 (03): : 1463 - 1474
  • [26] On principle of optimality for safety-constrained Markov Decision Process and p-Safe Reinforcement Learning
    Misra, Rahul
    Wisniewski, Rafal
    IFAC PAPERSONLINE, 2024, 58 (17): : 338 - 343
  • [27] DARES: An Asynchronous Distributed Recommender System Using Deep Reinforcement Learning
    Shi, Bichen
    Tragos, Elias Z.
    Ozsoy, Makbule Gulcin
    Dong, Ruihai
    Hurley, Neil
    Smyth, Barry
    Lawlor, Aonghus
    IEEE ACCESS, 2021, 9 : 83340 - 83354
  • [28] Adaptive personalized recommender system using learning automata and items clustering
    Farahani, Mansoureh Ghiasabadi
    Torkestan, Javad Akbari
    Rahmani, Mohsen
    INFORMATION SYSTEMS, 2022, 106
  • [29] Design Synthesis of Structural Systems as a Markov Decision Process Solved With Deep Reinforcement Learning
    Ororbia, Maximilian E.
    Warn, Gordon P.
    JOURNAL OF MECHANICAL DESIGN, 2023, 145 (06)
  • [30] Optimal Electric Vehicle Charging Strategy With Markov Decision Process and Reinforcement Learning Technique
    Ding, Tao
    Zeng, Ziyu
    Bai, Jiawen
    Qin, Boyu
    Yang, Yongheng
    Shahidehpour, Mohammad
    IEEE TRANSACTIONS ON INDUSTRY APPLICATIONS, 2020, 56 (05) : 5811 - 5823