Improving Reinforcement Learning Performance through a Behavioral Psychology-Inspired Variable Reward Scheme

Cited by: 0
Authors:
Rathore, Heena [1 ]
Griffith, Henry [2 ]
Affiliations:
[1] Texas State Univ, Dept Comp Sci, San Marcos, TX 78666 USA
[2] San Antonio Coll, Dept Engn, San Antonio, TX USA
Source:
2023 IEEE INTERNATIONAL CONFERENCE ON SMART COMPUTING, SMARTCOMP | 2023
Keywords:
variable reward; reinforcement learning; psychology; q-learning;
DOI:
10.1109/SMARTCOMP58114.2023.00050
Chinese Library Classification:
TP18 [Artificial Intelligence Theory];
Discipline codes:
081104; 0812; 0835; 1405;
Abstract:
Reinforcement learning (RL) algorithms typically employ a fixed-ratio reward schedule, which can lead to overfitting: the agent learns to optimize for the specific rewards it receives rather than learning the underlying task. Moreover, the agent may simply repeat actions that have worked in the past instead of exploring alternative actions and strategies to see what works best. This produces a generalization problem, where the agent struggles to apply what it has learned to new, unseen situations; this is particularly problematic in complex environments where the agent must generalize from limited data. Introducing variable reward schedules inspired by behavioral psychology can be more effective than traditional reward schemes because such schedules mimic real-world environments in which rewards are not always consistent or predictable. They can also encourage an RL agent to explore more and to adapt to changes in the environment. Simulation results show that the variable reward scheme achieves a faster learning rate than fixed rewards.
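The abstract contrasts a fixed-ratio reward schedule with a variable-ratio one in Q-learning. The sketch below illustrates that idea only; it is not the authors' implementation. The 5-state chain environment, the hyperparameters, and the variable-ratio-style schedule (reward 2.0 delivered with probability 0.5, so the expected reward matches the fixed reward of 1.0) are all illustrative assumptions.

```python
import random

def run_q_learning(variable_ratio, episodes=2000, seed=0):
    """Tabular Q-learning on a 5-state chain; entering state 4 ends the episode.

    variable_ratio=False: fixed schedule, reward 1.0 on every goal entry.
    variable_ratio=True:  variable-ratio-style schedule, reward 2.0 with
    probability 0.5 (same expected value, but unpredictable delivery).
    """
    rng = random.Random(seed)
    n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
    q = [[0.0] * n_actions for _ in range(n_states)]
    alpha, gamma, eps = 0.1, 0.9, 0.1   # step size, discount, exploration rate
    returns = []
    for _ in range(episodes):
        s, total = 0, 0.0
        for _ in range(20):             # cap episode length
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:                       # greedy with random tie-breaking
                best = max(q[s])
                a = rng.choice([i for i in range(n_actions) if q[s][i] == best])
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            if s2 == n_states - 1:      # goal reached: deliver scheduled reward
                r = (2.0 if rng.random() < 0.5 else 0.0) if variable_ratio else 1.0
            else:
                r = 0.0
            # Standard Q-learning update
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            total += r
            if s2 == n_states - 1:
                break
            s = s2
        returns.append(total)
    return q, returns
```

Keeping the expected reward equal across the two schedules isolates the effect of reward variability itself, which is the comparison the paper's behavioral-psychology framing rests on; the returns list can then be used to compare learning curves between the two conditions.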
Pages: 210-212 (3 pages)