A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility

Times cited: 117
Authors
Tang, Bowen [1, 2, 3]
Kramer, Skyler T. [2, 3]
Fang, Meijuan [1]
Qiu, Yingkun [1]
Wu, Zhen [1]
Xu, Dong [2, 3]
Affiliations
[1] Xiamen Univ, Sch Pharmaceut Sci, Fujian Prov Key Lab Innovat Drug Target Res, Xiamen 361000, Peoples R China
[2] Univ Missouri, Informat Inst, Dept Elect Engn & Comp Sci, Columbia, MO 65211 USA
[3] Univ Missouri, Christopher S Bond Life Sci Ctr, Columbia, MO 65211 USA
Funding
National Institutes of Health (USA);
Keywords
Message passing network; Attention mechanism; Deep learning; Lipophilicity; Aqueous solubility; GRAPH; DESCRIPTORS;
DOI
10.1186/s13321-020-0414-z
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Efficient and accurate prediction of molecular properties, such as lipophilicity and solubility, is highly desirable for rational compound design in the chemical and pharmaceutical industries. To this end, we build and apply a graph-neural-network framework called self-attention-based message-passing neural network (SAMPN) to study the relationship between chemical properties and structures in an interpretable way. The main advantages of SAMPN are that it directly uses chemical graphs and breaks the black-box mold of many machine/deep learning methods. Specifically, its attention mechanism indicates the degree to which each atom of the molecule contributes to the property of interest, and these results are easily visualized. Further, SAMPN outperforms random forests and the deep learning framework MPN from DeepChem. In addition, another formulation of SAMPN (Multi-SAMPN) can simultaneously predict multiple chemical properties with higher accuracy and efficiency than other models that predict one specific chemical property. Moreover, SAMPN can generate chemically visible and interpretable results, which can help researchers discover new pharmaceuticals and materials. The source code of the SAMPN prediction pipeline is freely available on GitHub (https://github.com/tbwxmu/SAMPN).
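The abstract says the attention mechanism indicates how much each atom contributes to the predicted property. The sketch below illustrates that general idea with plain dot-product self-attention over per-atom hidden states. It is not taken from the SAMPN code base: the function names `softmax` and `attention_readout`, the toy atom vectors, and the choice to average attention weights into a per-atom score are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_readout(atom_states):
    """Hypothetical attention readout (an assumption, not the authors' code).

    atom_states: list of n atom vectors (each of length d), assumed to be
    the hidden states produced by some message-passing step.
    Returns a molecule-level vector and one importance score per atom.
    """
    n = len(atom_states)
    d = len(atom_states[0])

    # Pairwise dot-product scores between atoms, shape n x n.
    scores = [
        [sum(a * b for a, b in zip(h_i, h_j)) for h_j in atom_states]
        for h_i in atom_states
    ]
    # Each row is normalised so that it sums to 1.
    weights = [softmax(row) for row in scores]

    # Attention-weighted mixture of atom states for every atom.
    attended = [
        [sum(weights[i][j] * atom_states[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]
    # Mean over atoms gives a fixed-size molecule vector.
    molecule_vector = [sum(attended[i][k] for i in range(n)) / n for k in range(d)]

    # Average attention received by each atom; these scores sum to 1
    # and could be mapped onto the molecular structure as a heat map.
    atom_importance = [sum(weights[i][j] for i in range(n)) / n for j in range(n)]
    return molecule_vector, atom_importance


# Toy hidden states for a three-atom fragment (made-up numbers).
atom_states = [[0.2, -0.1, 0.4], [1.0, 0.3, -0.5], [-0.3, 0.8, 0.1]]
molecule_vector, atom_importance = attention_readout(atom_states)
```

In a full model the molecule vector would feed a regression head that predicts the property, and the per-atom scores would be what gets visualized. The published pipeline may compute and aggregate attention differently.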
Pages: 9