Leveraging language model for advanced multiproperty molecular optimization via prompt engineering

Cited by: 10
Authors
Wu, Zhenxing [1, 2]
Zhang, Odin [1, 2]
Wang, Xiaorui [3]
Fu, Li [4]
Zhao, Huifeng [1, 2]
Wang, Jike [1, 2]
Du, Hongyan [1]
Jiang, Dejun [1, 2]
Deng, Yafeng [2]
Cao, Dongsheng [4]
Hsieh, Chang-Yu [1]
Hou, Tingjun [1]
Affiliations
[1] Zhejiang Univ, Coll Pharmaceut Sci, Innovat Inst Artificial Intelligence Med, Hangzhou, Peoples R China
[2] CarbonSilicon AI Technol Co Ltd, Hangzhou, Peoples R China
[3] Macau Univ Sci & Technol, Macau Inst Appl Res Med & Hlth, Dr Nehers Biophys Lab Innovat Drug Discovery, State Key Lab Qual Res Chinese Med, Macau, Peoples R China
[4] Cent South Univ, Xiangya Sch Pharmaceut Sci, Changsha, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
PREDICTION; DISCOVERY; EFFICIENT; ACCURATE;
DOI
10.1038/s42256-024-00916-5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Optimizing a candidate molecule's physicochemical and functional properties has been a critical task in drug and material design. Although the non-trivial task of balancing multiple (potentially conflicting) optimization objectives is considered ideal for artificial intelligence, several technical challenges, such as the scarcity of multiproperty-labelled training data, have long hindered the development of a satisfactory AI solution. Prompt-MolOpt is a tool for molecular optimization; it makes use of prompt-based embeddings, as used in large language models, to improve the transformer's ability to optimize molecules for specific property adjustments. Notably, Prompt-MolOpt excels in working with limited multiproperty data (even under the zero-shot setting) by effectively generalizing causal relationships learned from single-property datasets. In comparative evaluations against established models such as JTNN, hierG2G and Modof, Prompt-MolOpt achieves over a 15% relative improvement in multiproperty optimization success rates compared with the leading Modof model. Furthermore, a variant of Prompt-MolOpt, named Prompt-MolOptP, can preserve the pharmacophores or any user-specified fragments under the structural transformation, further broadening its application scope. By constructing tailored optimization datasets, with the protocol introduced in this work, Prompt-MolOpt steers molecular optimization towards domain-relevant chemical spaces, enhancing the quality of the optimized molecules. Real-world tests, such as those involving blood-brain barrier permeability optimization, underscore its practical relevance. Prompt-MolOpt offers a versatile approach for multiproperty and multi-site molecular optimizations, suggesting its potential utility in chemistry research and drug and material discovery.

Designing molecules for drug discovery is challenging because it requires optimizing multiple, potentially competing properties. Wu and colleagues present a prompt-based molecule optimization method that can be trained from single-property data.
Pages: 1359-1369
Number of pages: 15