Sample efficient reinforcement learning with active learning for molecular design

Cited by: 9
Authors
Dodds, Michael [1 ]
Guo, Jeff [1 ]
Loehr, Thomas [1 ]
Tibo, Alessandro [1 ]
Engkvist, Ola [1 ]
Janet, Jon Paul [1 ]
Affiliations
[1] AstraZeneca, Mol AI, Discovery Sci, R&D, S-43150 Gothenburg, Sweden
Keywords
DRUG DISCOVERY; CONFORMER GENERATION; DOCKING; OPTIMIZATION; INFORMATION; CHEMBL;
DOI
10.1039/d3sc04653b
Chinese Library Classification: O6 [Chemistry]
Discipline code: 0703
Abstract
Reinforcement learning (RL) is a powerful and flexible paradigm for searching for solutions in high-dimensional action spaces. However, bridging the gap between playing computer games with thousands of simulated episodes and solving real scientific problems with complex and involved environments (up to actual laboratory experiments) requires improvements in sample efficiency to make the most of expensive information. The discovery of new drugs is a major commercial application of RL, motivated by the vast size of chemical space and the need to perform multiparameter optimization (MPO) across different properties. In silico methods, such as virtual library screening (VS) and de novo molecular generation with RL, show great promise in accelerating this search. However, incorporating increasingly complex computational models into these workflows demands ever-greater sample efficiency. Here, we introduce an active learning system linked with an RL model (RL-AL) for molecular design, which aims to improve the sample efficiency of the optimization process. We identify and characterize unique challenges in combining RL and AL, investigate the interplay between the systems, and develop a novel AL approach to solve the MPO problem. Our approach greatly expedites the search for novel solutions relative to baseline RL for simple ligand- and structure-based oracle functions, with a 5-66-fold increase in hits generated for a fixed oracle budget and a 4-64-fold reduction in computational time to find a specific number of hits. Furthermore, compounds discovered through RL-AL display substantial enrichment of a multi-parameter scoring objective, indicating superior efficacy in curating high-scoring compounds, without a reduction in output diversity.
This significant acceleration improves the feasibility of oracle functions that have largely been overlooked in RL due to high computational costs, for example free energy perturbation methods, and in principle is applicable to any RL domain. Active learning accelerates the design of molecules during generative reinforcement learning by creating surrogate models of expensive reward functions, obtaining a 4- to 64-fold reduction in computational effort per hit.
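The core idea the abstract describes — training a cheap surrogate model of an expensive reward function and using it to decide which generated candidates are worth scoring with the true oracle — can be illustrated with a minimal, self-contained sketch. Everything below is an assumption for illustration only: the toy one-dimensional oracle, the nearest-neighbor surrogate, and the random candidate generator all stand in for the paper's actual docking/FEP oracles, learned surrogate, and RL generator.

```python
import random

random.seed(0)

def expensive_oracle(x):
    # Stand-in for a costly reward function (e.g. docking or FEP);
    # peaks at x = 0.7 and is negative elsewhere.
    return -(x - 0.7) ** 2

class NearestNeighborSurrogate:
    """Cheap proxy model: predicts the score of the closest already-queried point."""
    def __init__(self):
        self.data = []  # (x, oracle score) pairs observed so far

    def update(self, x, score):
        self.data.append((x, score))

    def predict(self, x):
        if not self.data:
            return 0.0  # no information yet; rank all candidates equally
        return min(self.data, key=lambda p: abs(p[0] - x))[1]

surrogate = NearestNeighborSurrogate()
oracle_calls = 0
best = float("-inf")

for step in range(20):
    # A "generator" proposes a batch of candidates (random here; an RL
    # agent in the paper's setting).
    candidates = [random.random() for _ in range(50)]
    # Active learning step: rank all candidates with the cheap surrogate,
    # and spend oracle budget only on the most promising few.
    top = sorted(candidates, key=surrogate.predict, reverse=True)[:3]
    for x in top:
        score = expensive_oracle(x)
        oracle_calls += 1
        surrogate.update(x, score)  # surrogate improves as data accrues
        best = max(best, score)

print(oracle_calls, round(best, 4))
```

The key efficiency lever is the ratio of candidates generated (50 per step) to oracle evaluations spent (3 per step): the surrogate filters the batch so the expensive function is only called on a small, promising subset, which is the mechanism behind the reported reduction in computational effort per hit.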
Pages: 4146-4160
Page count: 15
Related papers (50 records)
  • [1] Augmented Memory: Sample-Efficient Generative Molecular Design with Reinforcement Learning
    Guo, Jeff
    Schwaller, Philippe
    JACS AU, 2024, 4 (06): : 2160 - 2172
  • [2] Towards Sample Efficient Reinforcement Learning
    Yu, Yang
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 5739 - 5743
  • [3] Sample Efficient Reinforcement Learning with REINFORCE
    Zhang, Junzi
    Kim, Jongho
    O'Donoghue, Brendan
    Boyd, Stephen
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10887 - 10895
  • [4] Sample Efficient Reinforcement Learning with Gaussian Processes
    Grande, Robert C.
    Walsh, Thomas J.
    How, Jonathan P.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 1332 - 1340
  • [5] Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
    Scheller, Christian
    Schraner, Yanick
    Vogel, Manfred
    NEURIPS 2019 COMPETITION AND DEMONSTRATION TRACK, VOL 123, 2019, 123 : 67 - 76
  • [6] A Provably Efficient Sample Collection Strategy for Reinforcement Learning
    Tarbouriech, Jean
    Pirotta, Matteo
    Valko, Michal
    Lazaric, Alessandro
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] Sample Efficient Offline-to-Online Reinforcement Learning
    Guo, Siyuan
    Zou, Lixin
    Chen, Hechang
    Qu, Bohao
    Chi, Haotian
    Yu, Philip S.
    Chang, Yi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (03) : 1299 - 1310
  • [8] Sample Efficient Reinforcement Learning for Navigation in Complex Environments
    Moridian, Barzin
    Page, Brian R.
    Mahmoudian, Nina
    2019 IEEE INTERNATIONAL SYMPOSIUM ON SAFETY, SECURITY, AND RESCUE ROBOTICS (SSRR), 2019, : 15 - 21
  • [9] Sample Efficient Hierarchical Reinforcement Learning for the Game of Othello
    Chang, Timothy
    Neshatian, Kourosh
    Atlas, James
    PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 9, ICICT 2024, 2025, 1054 : 419 - 430
  • [10] Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
    Jin, Chi
    Kakade, Sham M.
    Krishnamurthy, Akshay
    Liu, Qinghua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33