Reinforcement Learning Algorithms: An Overview and Classification

Times cited: 42

Authors:

AlMahamid, Fadi [1]

Grolinger, Katarina [1]

Affiliation:

[1] Western Univ, Dept Elect & Comp Engn, London, ON, Canada

Source:

2021 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE) | 2021

DOI:

10.1109/CCECE53047.2021.9569056

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

The desire to make applications and machines more intelligent and the aspiration to enable their operation without human interaction have been driving innovations in neural networks, deep learning, and other machine learning techniques. Although reinforcement learning has been primarily used in video games, recent advancements and the development of diverse and powerful reinforcement learning algorithms have enabled the reinforcement learning community to move from playing video games to solving complex real-life problems in autonomous systems such as self-driving cars, delivery drones, and automated robotics. Understanding the environment of an application and the algorithms' limitations plays a vital role in selecting the appropriate reinforcement learning algorithm that solves the problem at hand efficiently. Consequently, in this study, we identify three main environment types and classify reinforcement learning algorithms according to those environment types. Moreover, within each category, we identify relationships between algorithms. The overview of each algorithm provides insight into the algorithms' foundations and reviews similarities and differences among algorithms. This study provides a perspective on the field and helps practitioners and researchers select the appropriate algorithm for their use case.

Pages: 7