TrimNet: learning molecular representation from triplet messages for biomedicine

Cited by: 57
Authors
Li, Pengyong [2]
Li, Yuquan [1]
Hsieh, Chang-Yu [3]
Zhang, Shengyu [4]
Liu, Xianggen [5]
Liu, Huanxiang [6]
Song, Sen [5]
Yao, Xiaojun [7]
Affiliations
[1] Lanzhou Univ, Coll Chem & Chem Engn, Lanzhou 730000, Peoples R China
[2] Tsinghua Univ, Dept Biomed Engn, Beijing 100084, Peoples R China
[3] Tencent Quantum Lab, Shenzhen, Peoples R China
[4] Tencent, Shenzhen, Peoples R China
[5] Tsinghua Univ, Beijing, Peoples R China
[6] Lanzhou Univ, Lanzhou, Peoples R China
[7] Lanzhou Univ, Analyt Chem & Chemoinformat, Lanzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
deep learning; molecular representation; molecular property; compound-protein interaction; computational method; graph neural networks; INTERACTION PREDICTION; NETWORK; DESIGN;
DOI
10.1093/bib/bbaa266
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: Computational methods accelerate drug discovery and play an important role in biomedicine, for example in molecular property prediction and compound-protein interaction (CPI) identification. A key challenge is to learn useful molecular representations. In earlier years, molecular properties were mainly calculated by quantum mechanics or predicted by traditional machine learning methods, which require expert knowledge and are often labor-intensive. More recently, graph neural networks have received significant attention because of their powerful ability to learn representations from graph data. Nevertheless, current graph-based methods have limitations that need to be addressed, such as large parameter counts and insufficient extraction of bond information. Results: In this study, we propose a graph-based approach, named triplet message networks (TrimNet), that employs a novel triplet message mechanism to learn molecular representations efficiently. We show that TrimNet can accurately complete multiple molecular representation learning tasks with a significant reduction in parameters, including the prediction of quantum properties, bioactivity, physiology and CPI. In the experiments, TrimNet outperforms the previous state-of-the-art methods by a significant margin on various datasets. Besides its small number of parameters and high prediction accuracy, TrimNet can focus on the atoms essential to the target properties, providing a clear interpretation of the prediction tasks. These advantages establish TrimNet as a powerful and useful computational tool for the challenging problem of molecular representation learning.
Pages: 10