Sample efficient reinforcement learning with active learning for molecular design

Cited by: 9
Authors
Dodds, Michael [1 ]
Guo, Jeff [1 ]
Loehr, Thomas [1 ]
Tibo, Alessandro [1 ]
Engkvist, Ola [1 ]
Janet, Jon Paul [1 ]
Affiliations
[1] AstraZeneca, Mol AI, Discovery Sci, R&D, S-43150 Gothenburg, Sweden
Keywords
DRUG DISCOVERY; CONFORMER GENERATION; DOCKING; OPTIMIZATION; INFORMATION; CHEMBL;
DOI
10.1039/d3sc04653b
Chinese Library Classification: O6 [Chemistry]
Discipline code: 0703
Abstract
Reinforcement learning (RL) is a powerful and flexible paradigm for searching for solutions in high-dimensional action spaces. However, bridging the gap between playing computer games with thousands of simulated episodes and solving real scientific problems with complex and involved environments (up to actual laboratory experiments) requires improvements in sample efficiency to make the most of expensive information. The discovery of new drugs is a major commercial application of RL, motivated by the vast size of chemical space and the need to perform multiparameter optimization (MPO) across different properties. In silico methods, such as virtual library screening (VS) and de novo molecular generation with RL, show great promise in accelerating this search. However, incorporating increasingly complex computational models into these workflows demands ever-greater sample efficiency. Here, we introduce an active learning system linked with an RL model (RL-AL) for molecular design, which aims to improve the sample efficiency of the optimization process. We identify and characterize unique challenges in combining RL and AL, investigate the interplay between the systems, and develop a novel AL approach to solve the MPO problem. Our approach greatly expedites the search for novel solutions relative to baseline RL for simple ligand- and structure-based oracle functions, with a 5-66-fold increase in hits generated for a fixed oracle budget and a 4-64-fold reduction in computational time to find a specific number of hits. Furthermore, compounds discovered through RL-AL display substantial enrichment of a multi-parameter scoring objective, indicating superior efficacy in curating high-scoring compounds, without a reduction in output diversity.
This significant acceleration improves the feasibility of oracle functions that have largely been overlooked in RL due to high computational costs, for example free energy perturbation methods, and in principle is applicable to any RL domain. Active learning accelerates the design of molecules during generative reinforcement learning by creating surrogate models of expensive reward functions, obtaining a 4- to 64-fold reduction in computational effort per hit.
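The core idea the abstract describes — training a cheap surrogate model of an expensive reward function and using it to decide which generated candidates are worth scoring with the true oracle — can be illustrated with a minimal, self-contained sketch. Everything below is an assumption for illustration only: the toy one-dimensional oracle, the nearest-neighbor surrogate, and the random candidate generator all stand in for the paper's actual docking/FEP oracles, learned surrogate, and RL generator.

```python
import random

random.seed(0)

def expensive_oracle(x):
    # Stand-in for a costly reward function (e.g. docking or FEP);
    # peaks at x = 0.7 and is negative elsewhere.
    return -(x - 0.7) ** 2

class NearestNeighborSurrogate:
    """Cheap proxy model: predicts the score of the closest already-queried point."""
    def __init__(self):
        self.data = []  # (x, oracle score) pairs observed so far

    def update(self, x, score):
        self.data.append((x, score))

    def predict(self, x):
        if not self.data:
            return 0.0  # no information yet; rank all candidates equally
        return min(self.data, key=lambda p: abs(p[0] - x))[1]

surrogate = NearestNeighborSurrogate()
oracle_calls = 0
best = float("-inf")

for step in range(20):
    # A "generator" proposes a batch of candidates (random here; an RL
    # agent in the paper's setting).
    candidates = [random.random() for _ in range(50)]
    # Active learning step: rank all candidates with the cheap surrogate,
    # and spend oracle budget only on the most promising few.
    top = sorted(candidates, key=surrogate.predict, reverse=True)[:3]
    for x in top:
        score = expensive_oracle(x)
        oracle_calls += 1
        surrogate.update(x, score)  # surrogate improves as data accrues
        best = max(best, score)

print(oracle_calls, round(best, 4))
```

The key efficiency lever is the ratio of candidates generated (50 per step) to oracle evaluations spent (3 per step): the surrogate filters the batch so the expensive function is only called on a small, promising subset, which is the mechanism behind the reported reduction in computational effort per hit.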
Pages: 4146-4160
Page count: 15
Related papers (50 records)
  • [1] Augmented Memory: Sample-Efficient Generative Molecular Design with Reinforcement Learning
    Guo, Jeff
    Schwaller, Philippe
    JACS AU, 2024, 4 (06): : 2160 - 2172
  • [2] Towards Sample Efficient Reinforcement Learning
    Yu, Yang
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 5739 - 5743
  • [3] Sample Efficient Reinforcement Learning with REINFORCE
    Zhang, Junzi
    Kim, Jongho
    O'Donoghue, Brendan
    Boyd, Stephen
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10887 - 10895
  • [4] Sample Efficient Reinforcement Learning with Gaussian Processes
    Grande, Robert C.
    Walsh, Thomas J.
    How, Jonathan P.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 1332 - 1340
  • [5] Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
    Scheller, Christian
    Schraner, Yanick
    Vogel, Manfred
    NEURIPS 2019 COMPETITION AND DEMONSTRATION TRACK, VOL 123, 2019, 123 : 67 - 76
  • [6] A Provably Efficient Sample Collection Strategy for Reinforcement Learning
    Tarbouriech, Jean
    Pirotta, Matteo
    Valko, Michal
    Lazaric, Alessandro
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] Sample Efficient Offline-to-Online Reinforcement Learning
    Guo, Siyuan
    Zou, Lixin
    Chen, Hechang
    Qu, Bohao
    Chi, Haotian
    Yu, Philip S.
    Chang, Yi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (03) : 1299 - 1310
  • [8] Sample Efficient Reinforcement Learning for Navigation in Complex Environments
    Moridian, Barzin
    Page, Brian R.
    Mahmoudian, Nina
    2019 IEEE INTERNATIONAL SYMPOSIUM ON SAFETY, SECURITY, AND RESCUE ROBOTICS (SSRR), 2019, : 15 - 21
  • [9] Sample Efficient Hierarchical Reinforcement Learning for the Game of Othello
    Chang, Timothy
    Neshatian, Kourosh
    Atlas, James
    PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 9, ICICT 2024, 2025, 1054 : 419 - 430
  • [10] Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
    Jin, Chi
    Kakade, Sham M.
    Krishnamurthy, Akshay
    Liu, Qinghua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33