De Novo Drug Design by Multi-Objective Path Consistency Learning With Beam A* Search

Cited by: 0

Authors:

Zhao, Dengwei [1]

Zhou, Jingyuan [1]

Tu, Shikui [1]

Xu, Lei [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China

[2] Guangdong Inst Intelligence Sci & Technol, Zhuhai 519031, Guangdong, Peoples R China

Source:

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS | 2024, Vol. 21, No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Beam search; multi-objective optimization; path consistency; q learning; de novo drug design; GENERATION; ALGORITHM; SMILES; SHOGI; CHESS; GO;

DOI:

10.1109/TCBB.2024.3477592

Chinese Library Classification (CLC):

Q5 [Biochemistry];

Subject classification codes:

071010 ; 081704 ;

Abstract:

Generating high-quality and drug-like molecules from scratch within the expansive chemical space presents a significant challenge in the field of drug discovery. In prior research, value-based reinforcement learning algorithms have been employed to iteratively generate molecules with multiple desired properties. The immediate reward was defined as the evaluation of the intermediate-state molecule at each step, and the learning objective was to maximize the expected cumulative evaluation score over all molecules along the generative path. However, this definition of the reward was misleading, because the optimization target should be the evaluation score of the final generated molecule only. Furthermore, previous works introduced randomness into the decision-making process, which enabled the generation of diverse molecules but abandoned the pursuit of maximum future reward. In this paper, the immediate reward is defined as the improvement achieved by each modification of the molecule, so that the objective targets exclusively the evaluation score of the final generated molecule. Originating from A* search, path consistency (PC), i.e., the property that f values along one optimal path should be identical, is employed as the objective function for updating the f value estimator, which trains a multi-objective de novo drug designer. By incorporating the f value into the decision-making process of beam search, the DrugBA* algorithm is proposed to enable the large-scale generation of molecules that exhibit both high quality and diversity. Experimental results demonstrate a substantial improvement over the state-of-the-art algorithm QADD in multiple molecular properties of the generated molecules.
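The two ideas named in the abstract, a path-consistency objective and a beam search ranked by f values, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the loss is written as the mean squared deviation of the f values on one path from their mean, the states are toy strings rather than molecules, and `toy_expand` and `toy_f` are hypothetical stand-ins for a molecule-modification step and a learned f value estimator. Higher f is treated as better, since the abstract speaks of maximizing an evaluation score.

```python
from typing import Callable, Hashable, List, Sequence


def path_consistency_loss(f_values: Sequence[float]) -> float:
    """Assumed PC penalty: mean squared deviation of the f values on one
    path from their mean. It is zero when all f values are identical."""
    mean_f = sum(f_values) / len(f_values)
    return sum((f - mean_f) ** 2 for f in f_values) / len(f_values)


def beam_search(
    start: Hashable,
    expand: Callable[[Hashable], List[Hashable]],
    f_value: Callable[[Hashable], float],
    beam_width: int,
    max_steps: int,
) -> List[Hashable]:
    """Keep the `beam_width` states with the highest f value at each step."""
    beam = [start]
    for _ in range(max_steps):
        candidates = [child for state in beam for child in expand(state)]
        if not candidates:
            break
        # Drop duplicates while preserving order, then rank by f (descending).
        unique = list(dict.fromkeys(candidates))
        beam = sorted(unique, key=f_value, reverse=True)[:beam_width]
    return beam


# Hypothetical toy problem: states are strings over the alphabet "CNO".
def toy_expand(state: str) -> List[str]:
    """Append one symbol; strings of length 3 are terminal."""
    return [state + atom for atom in "CNO"] if len(state) < 3 else []


def toy_f(state: str) -> float:
    """Hand-written stand-in for a learned f value estimator."""
    return state.count("C") + 0.5 * state.count("N")


if __name__ == "__main__":
    print(path_consistency_loss([2.0, 2.0, 2.0]))  # consistent path
    print(path_consistency_loss([1.0, 3.0]))       # inconsistent path
    print(beam_search("", toy_expand, toy_f, beam_width=2, max_steps=3))
```

A beam width larger than one keeps several high-f candidates alive at each step, which is one plausible way to obtain multiple distinct outputs without injecting randomness into the decision process.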


Pages: 2459-2470

Page count: 12

