De Novo Drug Design by Multi-Objective Path Consistency Learning With Beam A* Search

Cited by: 0

Authors:

Zhao, Dengwei [1]

Zhou, Jingyuan [1]

Tu, Shikui [1]

Xu, Lei [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China

[2] Guangdong Inst Intelligence Sci & Technol, Zhuhai 519031, Guangdong, Peoples R China

Source:

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS | 2024 / Vol. 21 / Issue 06

Funding:

National Natural Science Foundation of China;

Keywords:

Beam search; multi-objective optimization; path consistency; Q-learning; de novo drug design; GENERATION; ALGORITHM; SMILES; SHOGI; CHESS; GO;

DOI:

10.1109/TCBB.2024.3477592

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Generating high-quality and drug-like molecules from scratch within the expansive chemical space presents a significant challenge in the field of drug discovery. In prior research, value-based reinforcement learning algorithms have been employed to generate molecules with multiple desired properties iteratively. The immediate reward was defined as the evaluation of intermediate-state molecules at each step, and the learning objective would be maximizing the expected cumulative evaluation scores for all molecules along the generative path. However, this definition of the reward was misleading, as in reality, the optimization target should be the evaluation score of only the final generated molecule. Furthermore, in previous works, randomness was introduced into the decision-making process, enabling the generation of diverse molecules but no longer pursuing the maximum future rewards. In this paper, immediate reward is defined as the improvement achieved through the modification of the molecule to maximize the evaluation score of the final generated molecule exclusively. Originating from the A* search, path consistency (PC), i.e., f values on one optimal path should be identical, is employed as the objective function in the update of the f value estimator to train a multi-objective de novo drug designer. By incorporating the f value into the decision-making process of beam search, the DrugBA* algorithm is proposed to enable the large-scale generation of molecules that exhibit both high quality and diversity. Experimental results demonstrate a substantial enhancement over the state-of-the-art algorithm QADD in multiple molecular properties of the generated molecules.
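
The three ideas in the abstract can be illustrated with a toy sketch. This is not the authors' DrugBA* implementation: the function names, the string-valued states, and the specific form of the path-consistency penalty (mean squared deviation of f values from their path average) are assumptions made here for illustration only. The sketch shows (1) that improvement-based rewards sum to the final score minus the initial score, so only the final molecule's evaluation is optimized, (2) a penalty that vanishes exactly when f values along a path are identical, and (3) a beam search that keeps the candidates with the highest f value.

```python
from statistics import mean


def improvement_rewards(scores):
    """Immediate reward r_t = score(s_{t+1}) - score(s_t) along one path."""
    return [nxt - cur for cur, nxt in zip(scores, scores[1:])]


def path_consistency_loss(f_values):
    """Mean squared deviation of f values from their path average.

    Assumed form of the PC objective: it is zero exactly when all
    f values on the path are identical.
    """
    f_bar = mean(f_values)
    return mean((f - f_bar) ** 2 for f in f_values)


def beam_search(start, expand, f, beam_width, steps):
    """Keep the beam_width states with the highest f value at each step.

    expand(state) returns the successor states; f(state) is a
    hypothetical f-value estimator. Ties are broken by state order
    so that the result is deterministic.
    """
    beam = [start]
    for _ in range(steps):
        candidates = {child for state in beam for child in expand(state)}
        if not candidates:
            break
        beam = sorted(candidates, key=lambda s: (-f(s), s))[:beam_width]
    return beam


# Toy usage: states are strings, each step appends one symbol, and the
# stand-in estimator simply counts "N" symbols.
def toy_expand(state):
    return [state + symbol for symbol in "CNO"]


def toy_f(state):
    return state.count("N")


final_beam = beam_search("", toy_expand, toy_f, beam_width=2, steps=3)
```

Because the rewards telescope, their undiscounted sum over a path depends only on the first and last scores, which is why this reward definition targets the final molecule rather than every intermediate one.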


Pages: 2459-2470

Page count: 12

