Simultaneously Evolving Deep Reinforcement Learning Models using Multifactorial Optimization

Cited by: 7
Authors
Martinez, Aritz D. [1 ]
Osaba, Eneko [1 ]
Del Ser, Javier [1 ,2 ]
Herrera, Francisco [3 ]
Affiliations
[1] TECNALIA, Basque Res & Technol Alliance BRTA, Derio 48160, Bizkaia, Spain
[2] Univ Basque Country, Bilbao 48013, Bizkaia, Spain
[3] Univ Granada, DaSCI Andalusian Inst Data Sci & Computat Intelli, Granada 18071, Spain
Source
2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2020
Keywords
Multifactorial Optimization; Deep Reinforcement Learning; Transfer Learning; Evolutionary Algorithm
DOI
10.1109/cec48606.2020.9185667
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, Multifactorial Optimization (MFO) has gained notable momentum in the research community. MFO is known for its inherent capability to efficiently address multiple optimization tasks at the same time, while transferring information among those tasks to improve their convergence speed. On the other hand, the quantum leap made by Deep Q Learning (DQL) in the Machine Learning field has allowed tackling Reinforcement Learning (RL) problems of unprecedented complexity. Unfortunately, complex DQL models often fail to converge to optimal policies due to insufficient exploration or sparse rewards. To overcome these drawbacks, pre-trained models are widely harnessed via Transfer Learning, extrapolating knowledge acquired in a source task to the target task. Moreover, meta-heuristic optimization has been shown to alleviate the lack of exploration of DQL models. This work proposes an MFO framework capable of simultaneously evolving several DQL models towards solving interrelated RL tasks. Specifically, our proposed framework blends together the benefits of meta-heuristic optimization, Transfer Learning and DQL to automate the process of knowledge transfer and policy learning of distributed RL agents. A thorough experimentation is presented and discussed to assess the performance of the framework, its comparison to the traditional Transfer Learning methodology in terms of convergence speed and policy quality, and the inter-task relationships found and exploited over the search process.
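To make the abstract's core idea concrete, below is a minimal, self-contained sketch (not the authors' implementation) of an MFEA-style loop of the kind the paper builds on: a single population of flattened network weights is evolved across several tasks at once, with assortative mating acting as the inter-task knowledge-transfer channel. The reward surrogate, dimensions and parameter names (task_reward, RMP, etc.) are illustrative assumptions; a real instantiation would evaluate each weight vector by rolling out the corresponding DQL policy in its RL environment and using the episodic return as fitness.

import numpy as np

rng = np.random.default_rng(0)

DIM = 64    # unified search space: flattened policy-network weights (assumed size)
POP = 40    # population size
GENS = 50   # number of generations
RMP = 0.3   # random mating probability: governs how often inter-task transfer occurs

# Two synthetic tasks standing in for RL environments: each prefers weights
# close to a different target vector (hypothetical surrogate for episode return).
targets = [rng.normal(size=DIM), rng.normal(size=DIM)]

def task_reward(w, k):
    # Stand-in for the episodic return of a DQL agent with weights w on task k.
    return -np.sum((w - targets[k]) ** 2)

pop = rng.normal(size=(POP, DIM))
skill = rng.integers(0, len(targets), size=POP)  # skill factor of each individual

for gen in range(GENS):
    children, child_skill = [], []
    while len(children) < POP:
        i, j = rng.integers(0, POP, size=2)
        if skill[i] == skill[j] or rng.random() < RMP:
            # Assortative mating: crossover within a task, or across tasks with
            # probability RMP -- the knowledge-transfer channel of MFO.
            alpha = rng.random(DIM)
            child = alpha * pop[i] + (1 - alpha) * pop[j]
            # Vertical cultural transmission: inherit a parent's skill factor.
            child_skill.append(skill[i] if rng.random() < 0.5 else skill[j])
        else:
            # Otherwise mutate one parent within its own task.
            child = pop[i] + 0.1 * rng.normal(size=DIM)
            child_skill.append(skill[i])
        children.append(child)

    merged = np.vstack([pop, np.array(children)])
    merged_skill = np.concatenate([skill, np.array(child_skill)])
    # Each individual is evaluated only on its skill-factor task; elitist
    # selection then keeps the best individuals per task.
    rewards = np.array([task_reward(w, k) for w, k in zip(merged, merged_skill)])

    keep = []
    for k in range(len(targets)):
        idx = np.where(merged_skill == k)[0]
        keep.extend(idx[np.argsort(-rewards[idx])][: POP // len(targets)])
    pop, skill = merged[keep], merged_skill[keep]

for k in range(len(targets)):
    idx = np.where(merged_skill == k)[0]
    print(f"task {k}: best surrogate reward {rewards[idx].max():.3f}")

Under this sketch's assumptions, cross-task crossover is the only mechanism by which progress on one task can seed another, which is why the RMP parameter controls the degree of transfer between the co-evolved models.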
Pages: 8