Machine-guided representation for accurate graph-based molecular machine learning

Cited by: 29
Authors
Na, Gyoung S. [1]
Chang, Hyunju [1]
Kim, Hyun Woo [1]
Affiliations
[1] Korea Res Inst Chem Technol KRICT, 141 Gajeong Ro, Daejeon, South Korea
Keywords
NEURAL-NETWORKS; PREDICTION; DATABASE
DOI
10.1039/d0cp02709j
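Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification codes
070304; 081704
Abstract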
In chemistry-related fields, graph-based machine learning has received significant attention as atoms and their chemical bonds in a molecule can be represented as a mathematical graph. However, many molecular properties are sensitive to changes in the molecular structure. For this reason, molecules have a mixed distribution for their molecular properties in molecular space, and it consequently makes molecular machine learning difficult. However, this problem has not been investigated in either chemistry or computer science. To tackle this problem, we propose a robust and machine-guided molecular representation based on deep metric learning (DML), which automatically generates an optimal representation for a given dataset. To this end, we first adopt DML for molecular machine learning by integrating it with graph neural networks (GNNs) and devising a new objective function for representation learning. In experimental evaluations, machine learning algorithms with the proposed method achieved better prediction accuracy than state-of-the-art GNNs. Furthermore, the proposed method was also effective on extremely small datasets, and this result is impressive because many real-world applications suffer from a lack of training data.
Pages: 18526-18535
Page count: 10