Reinforcement Learning Based on Contextual Bandits for Personalized Online Learning Recommendation Systems

Cited by: 31
Authors
Intayoad, Wacharawan [1]
Kamyod, Chayapol [1]
Temdee, Punnarumol [1]
Affiliations
[1] Mae Fah Luang Univ, Comp & Commun Engn Capac Bldg Res Unit, Sch Informat Technol, Chiang Rai 57100, Thailand
Keywords
Reinforcement learning; Personalized learning; Recommendation; STYLES;
DOI
10.1007/s11277-020-07199-0
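Chinese Library Classification number
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract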
Personalized online learning has been widely adopted in recent years and has become a promising instructional strategy in online learning. One promising way to provide it is personalized recommendation, which guides students to suitable learning content at the right time. However, this is a nontrivial problem because online learning environments are highly flexible: students learn independently according to their own characteristics and situations. Existing recommendation methods do not work effectively in such environments. The objective of this study is therefore to provide personalized, dynamic, and continuous recommendation for online learning systems. We propose a method based on contextual bandits and reinforcement learning, which work effectively in dynamic environments. Moreover, we propose using past student behavior and the current student state as contextual information to create the policy with which the reinforcement learning agent makes the optimal decision. We use real data from an online learning system to evaluate the proposed method, comparing it with well-known reinforcement learning methods, namely epsilon-greedy, greedy with optimistic initial values, and upper confidence bound (UCB). The results show that the proposed method performs significantly better than these benchmark methods in our test case.
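For orientation, the three baseline strategies named in the abstract can be sketched on a toy, context-free bandit. This is only an illustrative sketch of the textbook forms of epsilon-greedy, greedy with optimistic initial values, and UCB1. It is not the authors' proposed contextual method or their experimental setup. The Bernoulli reward model, the arm means, and every function and parameter name (run_bandit, epsilon_greedy, ucb1, alpha, and so on) are assumptions made for this example.

```python
import math
import random


def greedy(values, counts, t, rng):
    """Pick the arm with the highest estimate, breaking ties at random."""
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def epsilon_greedy(epsilon):
    """Build a selector that explores uniformly with probability epsilon."""
    def select(values, counts, t, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(values))
        return greedy(values, counts, t, rng)
    return select


def ucb1(values, counts, t, rng):
    """UCB1: try every arm once, then maximise estimate plus exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [v + math.sqrt(2.0 * math.log(t) / n)
              for v, n in zip(values, counts)]
    return scores.index(max(scores))


def run_bandit(true_means, steps, select, initial_value=0.0, alpha=None, seed=0):
    """Simulate a Bernoulli bandit (hypothetical toy model); return mean reward."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)             # pulls per arm
    values = [initial_value] * len(true_means)  # reward estimate per arm
    total = 0.0
    for t in range(1, steps + 1):
        arm = select(values, counts, t, rng)
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Sample-average update by default; constant step size if alpha is set.
        step = alpha if alpha is not None else 1.0 / counts[arm]
        values[arm] += step * (reward - values[arm])
        total += reward
    return total / steps


if __name__ == "__main__":
    means = [0.2, 0.5, 0.8]  # made-up arm reward probabilities
    print("epsilon-greedy:", run_bandit(means, 2000, epsilon_greedy(0.1)))
    print("optimistic greedy:", run_bandit(means, 2000, greedy,
                                           initial_value=1.0, alpha=0.1))
    print("UCB1:", run_bandit(means, 2000, ucb1))
```

In this sketch the optimistic variant uses a constant step size so that the inflated initial estimate decays gradually instead of being overwritten on the first pull. A contextual bandit, as described in the abstract, would additionally condition the arm choice on features such as past student behavior and current student state.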
Pages: 2917-2932
Page count: 16