Knowledge graph-enhanced molecular contrastive learning with functional prompt

Times cited: 92
Authors
Fang, Yin [1, 2, 3]
Zhang, Qiang [1, 2]
Zhang, Ningyu [1, 4, 5]
Chen, Zhuo [1, 2]
Zhuang, Xiang [1, 2]
Shao, Xin [3]
Fan, Xiaohui [3, 6, 7]
Chen, Huajun [1, 2, 5, 8]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] ZJU, Hangzhou Global Sci & Technol Innovat Ctr, Hangzhou, Peoples R China
[3] Zhejiang Univ, Coll Pharmaceut Sci, Hangzhou, Peoples R China
[4] Zhejiang Univ, Sch Software Technol, Ningbo, Peoples R China
[5] Alibaba ZJU Frontier Technol Res Ctr, Hangzhou, Peoples R China
[6] Zhejiang Univ, Innovat Ctr Yangtze River Delta, Future Hlth Lab, Jiaxing, Peoples R China
[7] Natl Key Lab Modern Chinese Med Innovat & Mfg, Hangzhou, Peoples R China
[8] Donghai Lab, Zhoushan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DATABASE;
DOI
10.1038/s42256-023-00654-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Deep learning models can accurately predict molecular properties and help make the search for potential drug candidates faster and more efficient. Many existing methods are purely data driven, focusing on exploiting the intrinsic topology and construction rules of molecules without any chemical prior information. The high data dependency makes them difficult to generalize to a wider chemical space and leads to a lack of interpretability of predictions. Here, to address this issue, we introduce a chemical element-oriented knowledge graph to summarize the basic knowledge of elements and their closely related functional groups. We further propose a method for knowledge graph-enhanced molecular contrastive learning with functional prompt (KANO), exploiting external fundamental domain knowledge in both pre-training and fine-tuning. Specifically, with the element-oriented knowledge graph as a prior, we first design an element-guided graph augmentation in contrastive-based pre-training to explore microscopic atomic associations without violating molecular semantics. Then, we learn functional prompts in fine-tuning to evoke the downstream task-related knowledge acquired by the pre-trained model. Extensive experiments show that KANO outperforms state-of-the-art baselines on 14 molecular property prediction datasets and provides chemically sound explanations for its predictions. This work contributes to more efficient drug design by offering a high-quality knowledge prior, interpretable molecular representation and superior prediction performance.
Deep learning can be used to predict molecular properties, but such methods usually need a large amount of data and are hard to generalize to different chemical spaces. To provide a useful primer for deep learning models, Fang and colleagues use contrastive learning and a knowledge graph based on the Periodic Table and Wikipedia pages on chemical functional groups.
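The element-guided graph augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge graph, the relation names (`isA`, `hasPeriod`) and the function `augment_with_element_knowledge` are all hypothetical, chosen only to show one plausible way of attaching element-level knowledge to a molecular graph while leaving the original atoms and bonds untouched.

```python
# Hypothetical toy knowledge graph: element symbol -> (relation, attribute) pairs.
# The relation names and attributes are illustrative assumptions, not KANO's schema.
TOY_ELEMENT_KG = {
    "C": [("isA", "Nonmetal"), ("hasPeriod", "2")],
    "N": [("isA", "Nonmetal"), ("hasPeriod", "2")],
    "O": [("isA", "Nonmetal"), ("hasPeriod", "2")],
    "Cl": [("isA", "Halogen"), ("hasPeriod", "3")],
}


def augment_with_element_knowledge(atoms, bonds, kg=TOY_ELEMENT_KG):
    """Return (nodes, edges) for a knowledge-augmented view of a molecule.

    atoms: list of element symbols, indexed by atom id.
    bonds: list of (i, j) atom-index pairs.
    Original atoms and bonds are copied unchanged; knowledge nodes and
    atom-to-knowledge edges are only appended after them.
    """
    nodes = [("atom", symbol) for symbol in atoms]
    edges = [(i, j, "bond") for i, j in bonds]
    knowledge_index = {}  # (relation, attribute) -> node id, shared across atoms
    for atom_id, symbol in enumerate(atoms):
        for relation, attribute in kg.get(symbol, []):
            key = (relation, attribute)
            if key not in knowledge_index:
                knowledge_index[key] = len(nodes)
                nodes.append(("knowledge", attribute))
            edges.append((atom_id, knowledge_index[key], relation))
    return nodes, edges


# Heavy atoms of ethanol: C-C-O
nodes, edges = augment_with_element_knowledge(["C", "C", "O"], [(0, 1), (1, 2)])
```

In this sketch, atoms that share an attribute are linked to the same knowledge node, so the augmented view connects atoms that are not bonded to each other. In a contrastive setup, the original graph and such an augmented graph would serve as a positive pair.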
Pages: 542 / +
Page count: 14