Parallel tempered genetic algorithm guided by deep neural networks for inverse molecular design

Cited by: 46
Authors
Nigam, AkshatKumar [1 ,2 ,3 ]
Pollice, Robert [2 ,3 ]
Aspuru-Guzik, Alan [2 ,3 ,4 ,5 ]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA USA
[2] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
[3] Univ Toronto, Dept Chem, Toronto, ON, Canada
[4] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[5] Canadian Inst Adv Res CIFAR, 661 Univ Ave, Toronto, ON M5G 1M1, Canada
Source
DIGITAL DISCOVERY | 2022 / Vol. 1 / Issue 04
Funding
Swiss National Science Foundation;
Keywords
NOVO DRUG DESIGN; MULTIOBJECTIVE OPTIMIZATION; AUTOMATED GENERATION; SYSTEM; SMILES; CHEMBL;
DOI
10.1039/d2dd00003b
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Inverse molecular design involves algorithms that sample molecules with specific target properties from a multitude of candidates, and it can be posed as an optimization problem. High-dimensional optimization tasks in the natural sciences are commonly tackled via population-based metaheuristic optimization algorithms such as evolutionary algorithms. However, property evaluation is often unavoidably expensive, and the associated cost can become prohibitive, which limits the widespread use of such approaches. Herein, we present JANUS, a genetic algorithm inspired by parallel tempering. It propagates two populations, one for exploration and another for exploitation, which exchange members, improving optimization by reducing the number of property evaluations. JANUS is augmented by a deep neural network that approximates molecular properties and relies on active learning for enhanced molecular sampling. It uses the SELFIES representation and the STONED algorithm for the efficient generation of structures, and it outperforms other generative models in common inverse molecular design tasks, achieving state-of-the-art target metrics across multiple benchmarks. As neither most of the benchmarks nor the structure generator in JANUS account for synthesizability, a significant fraction of the proposed molecules is synthetically infeasible, demonstrating that this aspect needs to be considered when evaluating the performance of molecular generative models.
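The two-population idea in the abstract can be illustrated with a minimal toy sketch. This is not the JANUS implementation: it uses bit strings instead of SELFIES molecules, a trivial stand-in fitness (count of 1-bits) instead of an expensive property evaluation, and no neural network. The function names, the mutation sizes, and the exchange rule (best explorers replace the worst exploiters) are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def fitness(bits):
    # Toy stand-in for an expensive property evaluation: number of 1-bits.
    return sum(bits)


def mutate(bits, n_flips, rng):
    # Flip n_flips distinct positions; larger n_flips means a bigger jump.
    child = list(bits)
    for i in rng.sample(range(len(child)), n_flips):
        child[i] = 1 - child[i]
    return child


def step(population, n_flips, rng):
    # Keep the better half, refill with mutated copies of the survivors.
    size = len(population)
    survivors = sorted(population, key=fitness, reverse=True)[: size // 2]
    children = [
        mutate(rng.choice(survivors), n_flips, rng)
        for _ in range(size - len(survivors))
    ]
    return survivors + children


def exchange(explore, exploit, n_swap):
    # Assumed exchange rule: the best explorers move to the exploitative
    # population, and the worst exploiters move to the explorative one.
    explore = sorted(explore, key=fitness, reverse=True)
    exploit = sorted(exploit, key=fitness, reverse=True)
    migrants_in = explore[:n_swap]
    migrants_out = exploit[-n_swap:]
    return migrants_out + explore[n_swap:], exploit[:-n_swap] + migrants_in


def two_population_search(n_bits=32, pop_size=20, generations=60, seed=0):
    rng = random.Random(seed)

    def random_member():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    explore = [random_member() for _ in range(pop_size)]
    exploit = [random_member() for _ in range(pop_size)]
    for _ in range(generations):
        explore = step(explore, n_flips=6, rng=rng)  # large, exploratory moves
        exploit = step(exploit, n_flips=1, rng=rng)  # small, local refinements
        explore, exploit = exchange(explore, exploit, n_swap=2)
    return max(explore + exploit, key=fitness)
```

Calling `two_population_search()` returns the best bit string found across both populations. Because each step keeps the better half of a population and the exchange only moves members between populations, the best fitness in this sketch never decreases from one generation to the next.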
Pages: 390-404
Page count: 15