Learning Drug Functions from Chemical Structures with Convolutional Neural Networks and Random Forests

Cited by: 60
Authors
Meyer, Jesse G. [1 ,2 ,3 ]
Liu, Shengchao [4 ,5 ]
Miller, Ian J. [3 ]
Coon, Joshua J. [1 ,2 ,3 ,5 ,6 ]
Gitter, Anthony [4 ,5 ,7 ]
Affiliations
[1] Univ Wisconsin, Dept Chem, 1101 Univ Ave, Madison, WI 53706 USA
[2] Univ Wisconsin, Dept Biomol Chem, Madison, WI 53706 USA
[3] Univ Wisconsin, Natl Ctr Quantitat Biol Complex Syst, Madison, WI 53706 USA
[4] Univ Wisconsin, Dept Comp Sci, 1210 W Dayton St, Madison, WI 53706 USA
[5] Univ Wisconsin, Morgridge Inst Res, Madison, WI 53706 USA
[6] Univ Wisconsin, DOE Great Lakes Bioenergy Res Ctr, Madison, WI 53706 USA
[7] Univ Wisconsin, Dept Biostat & Med Informat, Madison, WI 53706 USA
Keywords
GROWTH-FACTOR RECEPTOR; APPLICABILITY DOMAIN; CLASSIFICATION; DISCOVERY; TOXICITY; MODELS; SMILES; SPACE; GENERATION; REGRESSION
DOI
10.1021/acs.jcim.9b00236
Chinese Library Classification
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Empirical testing of chemicals for drug efficacy costs many billions of dollars every year. The ability to predict the action of molecules in silico would greatly increase the speed and decrease the cost of prioritizing drug leads. Here, we asked whether drug function, defined as MeSH "therapeutic use" classes, can be predicted from only a chemical structure. We evaluated two chemical-structure-derived drug classification methods, chemical images with convolutional neural networks and molecular fingerprints with random forests, both of which outperformed previous predictions that used drug-induced transcriptomic changes as chemical representations. This suggests that the structure of a chemical contains at least as much information about its therapeutic use as the transcriptional cellular response to that chemical. Furthermore, because training data based on chemical structure is not limited to a small set of molecules for which transcriptomic measurements are available, our strategy can leverage more training data to significantly improve predictive accuracy to 83-88%. Finally, we explore use of these models for prediction of side effects and drug-repurposing opportunities and demonstrate the effectiveness of this modeling strategy for multilabel classification.
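The structure-only, multilabel setup described in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the abstract names molecular fingerprints with random forests, whereas the code below swaps in a toy hashed-substring fingerprint over SMILES strings and a nearest-neighbour lookup by Tanimoto similarity so that it runs with the Python standard library alone. The names `toy_fingerprint`, `tanimoto`, `predict_labels`, `N_BITS`, and `TRAINING_SET` are hypothetical, and the three molecules and their labels are illustrative rather than taken from the paper's MeSH "therapeutic use" data.

```python
import zlib

N_BITS = 256  # hypothetical fingerprint length, chosen only for this sketch


def toy_fingerprint(smiles, max_len=3, n_bits=N_BITS):
    """Hash every substring of length 1..max_len into a set of bit positions.

    A stand-in for a real molecular fingerprint: it sees only the text of
    the SMILES string, not the molecular graph.
    """
    bits = set()
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            chunk = smiles[start:start + size]
            bits.add(zlib.crc32(chunk.encode("ascii")) % n_bits)
    return frozenset(bits)


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Illustrative training data: (SMILES, set of use labels). Each molecule may
# carry several labels, which is what makes the task multilabel.
TRAINING_SET = [
    ("CC(=O)OC1=CC=CC=C1C(=O)O", {"analgesic", "anti-inflammatory"}),       # aspirin
    ("CC(C)CC1=CC=C(C=C1)C(C)C(=O)O", {"analgesic", "anti-inflammatory"}),  # ibuprofen
    ("CN1C=NC2=C1C(=O)N(C(=O)N2C)C", {"central nervous system stimulant"}),  # caffeine
]


def predict_labels(smiles):
    """Return the label set of the most similar training molecule.

    Nearest-neighbour lookup is an assumption of this sketch; it replaces
    the random forest named in the abstract.
    """
    query = toy_fingerprint(smiles)
    _, best_labels = max(
        TRAINING_SET,
        key=lambda item: tanimoto(query, toy_fingerprint(item[0])),
    )
    return set(best_labels)


if __name__ == "__main__":
    # Querying a molecule that is already in the training set returns its own labels.
    print(predict_labels("CC(=O)OC1=CC=CC=C1C(=O)O"))
```

In practice a cheminformatics toolkit would supply graph-based fingerprints and a trained classifier would replace the lookup; the sketch only shows how a chemical structure alone can be turned into a fixed-length representation and mapped to a set of labels.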
Pages: 4438-4449
Page count: 12