A novel multi-step reinforcement learning method for solving reward hacking

Cited by: 12
Authors
Yuan, Yinlong [1 ]
Yu, Zhu Liang [1 ]
Gu, Zhenghui [1 ]
Deng, Xiaoyan [1 ]
Li, Yuanqing [1 ]
Institutions
[1] South China Univ Technol, Coll Automat Sci & Engn, Guangzhou 510641, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; Robotics; Reward hacking; Multi-step methods; CONVERGENCE; ALGORITHMS;
DOI
10.1007/s10489-019-01417-4
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning with an appropriately designed reward signal can be used to solve many sequential learning problems. In practice, however, reinforcement learning algorithms can fail in unexpected, counterintuitive ways. One such failure mode is reward hacking, which usually occurs when a reward function allows the agent to obtain a high return in an unintended way. This unintended behavior may subvert the designer's intentions and lead to accidents during training. In this paper, a new multi-step state-action value algorithm is proposed to solve the problem of reward hacking. Unlike traditional algorithms, the proposed method uses a new return function, which alters the discounting of future rewards and no longer stresses the immediate reward as the dominant factor when selecting the current action. The performance of the proposed method is evaluated on two games, Mappy and Mountain Car. The empirical results demonstrate that the proposed method alleviates the negative impact of reward hacking and greatly improves the performance of the reinforcement learning algorithm. Moreover, the results show that the proposed method can also be applied successfully to continuous state-space problems.
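The abstract describes a return function that down-weights the immediate reward relative to future rewards, but the paper's exact formula is not given in this record. The following is a minimal illustrative sketch, assuming a standard n-step discounted return as the baseline and a hypothetical `immediate_weight` parameter for the reweighted variant; both function names and the weighting scheme are assumptions for illustration, not the authors' actual method.

```python
def n_step_return(rewards, gamma, bootstrap_value):
    """Standard n-step return: r_0 + gamma*r_1 + ... + gamma^n * V(s_n)."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

def reweighted_return(rewards, gamma, bootstrap_value, immediate_weight=0.5):
    """Hypothetical variant (an assumption, not the paper's formula):
    the immediate reward is scaled down so that delayed rewards
    contribute relatively more to action selection."""
    g = bootstrap_value
    for r in reversed(rewards[1:]):      # accumulate future rewards first
        g = r + gamma * g
    return immediate_weight * rewards[0] + gamma * g

# A trajectory with a small immediate reward and a large delayed reward:
rewards = [1.0, 0.0, 0.0, 10.0]
standard = n_step_return(rewards, gamma=0.9, bootstrap_value=0.0)
reweighted = reweighted_return(rewards, gamma=0.9, bootstrap_value=0.0)
```

In the reweighted variant, the share of the return contributed by the immediate reward shrinks, so an agent comparing actions by this return is less drawn to reward-hacking behaviors that pay off only in the first step.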
Pages: 2874-2888
Page count: 15