A Chemical Domain Knowledge-Aware Framework for Multi-view Molecular Property Prediction

Cited by: 2
Authors
Hua, Rui [1]
Wang, Xinyan [1]
Cheng, Chuang [1]
Zhu, Qiang [1]
Zhou, Xuezhong [1]
Affiliations
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing 100044, Peoples R China
Source
CCKS 2022 - EVALUATION TRACK | 2022 / Vol. 1711
Keywords
Molecular property prediction; Chemical domain knowledge; Molecular representation;
DOI
10.1007/978-981-19-8300-9_1
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Molecular property prediction is becoming increasingly important in drug and material discovery, and many studies have demonstrated the potential of machine learning techniques, especially deep learning, for this task. This paper presents our solution for CCKS-2022 Task 8: a chemical domain knowledge-aware framework for multi-view molecular property prediction. The framework, a generative self-supervised approach to molecular graph representation learning, is based on Knowledge-guided Pre-training of Graph Transformer (KPGT), which adopts a graph transformer guided by molecular fingerprint and descriptor knowledge. In the fine-tuning stage, we fuse functional group information and a chemical element knowledge graph with the pre-trained representation to predict molecular properties. From the perspective of chemical structure, KPGT provides the structural information of molecular graphs, with particular emphasis on chemical bonds. We further integrate chemical domain knowledge through functional groups and a chemical element knowledge graph, which encodes the physicochemical properties of atoms. From molecular graphs to functional groups to atoms, the molecular representation is jointly enhanced by multiple views, from coarse to fine. To introduce the functional group information and the chemical element knowledge graph, we propose a novel BiLSTM-based recurrent module that accumulates domain knowledge. The framework can therefore consider the molecular graph, functional groups, and atomic physicochemical properties simultaneously when predicting molecular properties. Without using any other external knowledge, our approach reaches an AUC-ROC of 0.88587 on the test data, ranking second among 140 teams, which validates its performance.
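The abstract reports its result as AUC-ROC. As a reading aid, the following is a minimal sketch of how that metric can be computed for a binary property label, using the rank-based (Mann-Whitney) formulation. It is not the CCKS-2022 scoring script, which this record does not describe; the function name `auc_roc` and the toy labels and scores are assumptions made for illustration only.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return AUC-ROC: the probability that a random positive outranks a random negative.

    Hypothetical helper for illustration; labels are assumed to be 0 or 1.
    A tie between a positive and a negative counts as half a win.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Sort sample indices by predicted score, lowest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)

    i = 0
    while i < len(order):
        j = i
        # Extend the group while the next score ties with the current one.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        # Every member of a tie group gets the average of its 1-based ranks.
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC-ROC needs at least one positive and one negative")

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U statistic, normalised by the number of positive/negative pairs.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example (made-up data): three of the four positive/negative pairs are ordered correctly.
print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for the reported 0.88587.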
Pages: 1-11
Number of pages: 11