Where's the Reward? A Review of Reinforcement Learning for Instructional Sequencing

Cited by: 40
Authors
Doroudi, Shayan [1 ,2 ,3 ]
Aleven, Vincent [4 ]
Brunskill, Emma [2 ]
Affiliations
[1] Carnegie Mellon Univ, Comp Sci Dept, Pittsburgh, PA 15213 USA
[2] Stanford Univ, Comp Sci Dept, Stanford, CA 94305 USA
[3] Univ Calif Irvine, Sch Educ, Irvine, CA 92697 USA
[4] Carnegie Mellon Univ, Human Comp Interact Inst, Pittsburgh, PA 15213 USA
Keywords
Reinforcement learning; Instructional sequencing; Adaptive instruction; History of artificial intelligence in education; Teaching strategies; Worked examples; Knowledge; Allocation; Expertise; Game; Efficiency; Retention; Selection; Improve
DOI
10.1007/s40593-019-00187-x
Chinese Library Classification
TP39 [Computer Applications]
Subject classification codes
081203; 0835
Abstract
Since the 1960s, researchers have been trying to optimize the sequencing of instructional activities using the tools of reinforcement learning (RL) and sequential decision making under uncertainty. Many researchers have realized that reinforcement learning provides a natural framework for optimal instructional sequencing given a particular model of student learning, and excitement towards this area of research is as alive now as it was over fifty years ago. But does RL actually help students learn? If so, when and where might we expect it to be most helpful? To help answer these questions, we review the variety of attempts to use RL for instructional sequencing. First, we present a historical narrative of this research area. We identify three waves of research, which gives us a sense of the various communities of researchers that have been interested in this problem and where the field is going. Second, we review all of the empirical research that has compared RL-induced instructional policies to baseline methods of sequencing. We find that over half of the studies found that RL-induced policies significantly outperform baselines. Moreover, we identify five clusters of studies with different characteristics and varying levels of success in using RL to help students learn. We find that reinforcement learning has been most successful in cases where it has been constrained with ideas and theories from cognitive psychology and the learning sciences. However, given that our theories and models are limited, we also find that it has been useful to complement this approach with running more robust offline analyses that do not rely heavily on the assumptions of one particular model. Given that many researchers are turning to deep reinforcement learning and big data to tackle instructional sequencing, we believe keeping these best practices in mind can help guide the way to the reward in using RL for instructional sequencing.
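The abstract's framing of RL as "optimal instructional sequencing given a particular model of student learning" can be made concrete with a toy example. The sketch below is illustrative only and is not taken from the paper: it casts sequencing as a small MDP in which states are hypothetical per-skill mastery vectors, actions are "practice skill i", and the made-up constants P_LEARN and GAMMA stand in for a fitted student model and discount factor. Value iteration then induces a teaching policy, i.e., which activity to assign next in each state.

```python
# Illustrative sketch only (not the paper's method): instructional sequencing
# cast as a small MDP and solved with value iteration. States are tuples of
# per-skill mastery flags, actions are "practice skill i", and the toy student
# model says practicing an unmastered skill yields mastery with probability
# P_LEARN[i]. Reward is +1 each time a skill becomes mastered.
from itertools import product

N_SKILLS = 3
P_LEARN = [0.2, 0.4, 0.6]   # hypothetical per-skill learning probabilities
GAMMA = 0.95                # discount factor (assumed)

states = list(product([False, True], repeat=N_SKILLS))
actions = range(N_SKILLS)

def transitions(state, action):
    """Yield (probability, next_state, reward) under the toy student model."""
    if state[action]:                 # skill already mastered: nothing changes
        yield 1.0, state, 0.0
        return
    learned = list(state)
    learned[action] = True
    yield P_LEARN[action], tuple(learned), 1.0   # skill becomes mastered
    yield 1.0 - P_LEARN[action], state, 0.0      # no learning this step

# Standard value iteration over the toy MDP.
V = {s: 0.0 for s in states}
for _ in range(200):
    V = {s: max(sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))
                for a in actions)
         for s in states}

# Greedy policy: which skill to have the student practice next in each state.
policy = {s: max(actions,
                 key=lambda a: sum(p * (r + GAMMA * V[s2])
                                   for p, s2, r in transitions(s, a)))
          for s in states}

print(policy[(False, False, False)])  # recommended first activity
```

In the studies the paper reviews, the hand-coded toy model above would be replaced by a student model estimated from data (and the resulting policy compared against baseline sequencing methods); the sketch only shows the shape of the decision problem.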
Pages: 568-620
Page count: 53