Toward Interpretable Machine Learning Models for Materials Discovery

Cited by: 26
Authors
Mikulskis, Paulius [1]
Alexander, Morgan R. [1]
Winkler, David Alan [1, 2, 3, 4]
Affiliations
[1] Univ Nottingham, Sch Pharm, Nottingham NG7 2RD, England
[2] Monash Univ, Monash Inst Pharmaceut Sci, Parkville, Vic 3052, Australia
[3] La Trobe Univ, La Trobe Inst Mol Sci, Kingsbury Dr, Bundoora, Vic 3086, Australia
[4] CSIRO Mfg, Clayton, Vic 3168, Australia
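Funding
Engineering and Physical Sciences Research Council (UK); Wellcome Trust (UK)
Keywords
interpretability; machine learning; materials designs; molecular descriptors; structure-property relationships
DOI
10.1002/aisy.201900045
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract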
Machine learning (ML) and artificial intelligence (AI) methods for modeling useful materials properties are now important technologies for the rational design and optimization of bespoke functional materials. Although these methods predict the properties of new materials well, current modeling methods use efficient but rather arcane (difficult-to-interpret) mathematical features (descriptors) to characterize materials. Data-driven ML models are considerably more useful if they are trained on more chemically interpretable descriptors, provided that the models still accurately recapitulate the properties of the materials in the training and test sets used to generate and validate them. Herein, it is described how a particular type of molecular fragment descriptor, the signature descriptor, achieves these joint aims of accuracy and interpretability. Seven different types of materials properties are modeled, and the performance of models generated from signature descriptors is compared with that of models generated from the widely used Dragon descriptors. The key descriptors in the models represent functionalities that make chemical sense. Mapping these fragments back onto exemplar materials provides a useful guide to chemists wishing to modify promising lead materials to improve their properties. This is one of the first applications of signature descriptors to the modeling of complex materials properties.
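To make the idea of a fragment-based descriptor concrete, the following is a minimal sketch of how atom-centred fragments might be counted on a toy molecular graph and then mapped back to the atoms that carry them. It is a simplified illustration, not the canonical signature algorithm or the software used in the paper. The function names (`atom_signature`, `signature_descriptor`, `atoms_with_signature`), the graph encoding, and the bracketed string format are all assumptions made for this example.

```python
from collections import Counter


def atom_signature(atoms, adjacency, root, height, parent=None):
    """Describe the neighbourhood of atom `root` out to `height` bonds.

    Simplified, hypothetical encoding: the element symbol followed by the
    sorted signatures of its neighbours in parentheses. Only immediate
    backtracking to the parent atom is excluded.
    """
    label = atoms[root]
    if height == 0:
        return label
    children = sorted(
        atom_signature(atoms, adjacency, nbr, height - 1, parent=root)
        for nbr in adjacency[root]
        if nbr != parent
    )
    if not children:
        return label
    return label + "(" + "".join(children) + ")"


def signature_descriptor(atoms, adjacency, height=1):
    """Count how often each atom-centred fragment occurs in the molecule."""
    return Counter(
        atom_signature(atoms, adjacency, i, height) for i in range(len(atoms))
    )


def atoms_with_signature(atoms, adjacency, signature, height=1):
    """Map a fragment back onto the structure: indices of atoms carrying it."""
    return [
        i
        for i in range(len(atoms))
        if atom_signature(atoms, adjacency, i, height) == signature
    ]


# Toy example (assumed for illustration): heavy atoms of 1-propanol, C-C-C-O.
propanol_atoms = ["C", "C", "C", "O"]
propanol_bonds = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

descriptor = signature_descriptor(propanol_atoms, propanol_bonds, height=1)
carbinol_atoms = atoms_with_signature(propanol_atoms, propanol_bonds, "C(CO)")
```

In this sketch each descriptor is a readable fragment string such as `C(CO)`, so a fragment that a model finds important can be traced to specific atoms with `atoms_with_signature`. That traceability is the kind of interpretability the abstract attributes to fragment descriptors.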
Pages: 16