Beyond the Scope of Free-Wilson Analysis: Building Interpretable QSAR Models with Machine Learning Algorithms

Cited by: 34
Authors
Chen, Hongming
Carlsson, Lars [1]
Eriksson, Mats
Varkonyi, Peter
Norinder, Ulf [3]
Nilsson, Ingemar [2]
Affiliations
[1] AstraZeneca R&D, Global Safety Assessment, Computat Toxicol, Gothenburg, Sweden
[2] AstraZeneca R&D, CVGI Innovat Med, Gothenburg, Sweden
[3] AstraZeneca R&D Sodertalje, CNSP Innovat Med, Sodertalje, Sweden
Keywords
BIOLOGICAL-ACTIVITY; DRUG DESIGN; PREDICTION; IMPROVE; CONSTANTS; LIBRARY; BINDING; QSPR
DOI
10.1021/ci4001376
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
A novel methodology was developed to build Free-Wilson-like local QSAR models by combining R-group signatures and the SVM algorithm. Unlike Free-Wilson analysis, this method is able to make predictions for compounds with R-groups not present in the training set. Eleven public data sets were chosen as test cases for comparing the performance of the new method with several other traditional modeling strategies, including Free-Wilson analysis. The results show that the R-group signature SVM models achieve better prediction accuracy than Free-Wilson analysis. The R-group signature models are also comparable to the models using ECFP6 fingerprints and signatures for the whole compound. Most importantly, R-group contributions to the SVM model can be obtained by calculating the gradient for R-group signatures. For most of the studied data sets, these contributions correlate significantly with those from a corresponding Free-Wilson analysis. These results suggest that the R-group contribution can be used to interpret bioactivity data and highlight that the R-group signature based SVM modeling method is as interpretable as Free-Wilson analysis. Hence the signature SVM model can be a useful modeling tool for any drug discovery project.
Pages: 1324-1336
Page count: 13
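The abstract states that R-group contributions can be obtained by calculating the gradient of the SVM model with respect to the R-group signatures. The following is a minimal sketch of that idea, not the authors' implementation. It assumes an RBF-kernel support vector regression model whose support vectors, dual coefficients, intercept, and gamma are already known; all numeric values, the descriptor layout, and the names `svr_predict`, `svr_gradient`, `rgroup_contributions`, and `rgroup_slices` are hypothetical. How per-descriptor gradients are aggregated into one value per R-group is not given in the abstract, so summing the gradient components over each R-group's descriptor block is an assumption made here for illustration.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, intercept, gamma):
    """Decision function f(x) = b + sum_i alpha_i * K(x, sv_i)."""
    return intercept + sum(
        alpha * rbf_kernel(x, sv, gamma)
        for alpha, sv in zip(dual_coefs, support_vectors)
    )


def svr_gradient(x, support_vectors, dual_coefs, gamma):
    """Analytic gradient of f with respect to each descriptor of x.

    d/dx_j K(x, sv) = -2 * gamma * (x_j - sv_j) * K(x, sv)
    """
    grad = [0.0] * len(x)
    for alpha, sv in zip(dual_coefs, support_vectors):
        k = rbf_kernel(x, sv, gamma)
        for j in range(len(x)):
            grad[j] += alpha * k * (-2.0 * gamma) * (x[j] - sv[j])
    return grad


def rgroup_contributions(grad, rgroup_slices):
    """Sum gradient components over each R-group's descriptor block.

    The aggregation rule (a plain sum) is an assumption of this sketch.
    """
    return {
        name: sum(grad[start:stop])
        for name, (start, stop) in rgroup_slices.items()
    }


# Hypothetical model: 3 support vectors over 4 signature-count descriptors.
support_vectors = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
dual_coefs = [0.8, -0.5, 0.3]
intercept = 6.0
gamma = 0.1

# Hypothetical layout: descriptors 0-1 describe R1, descriptors 2-3 describe R2.
rgroup_slices = {"R1": (0, 2), "R2": (2, 4)}

# Hypothetical query compound.
x = [1.0, 0.0, 1.0, 1.0]

grad = svr_gradient(x, support_vectors, dual_coefs, gamma)
contrib = rgroup_contributions(grad, rgroup_slices)

print("prediction:", svr_predict(x, support_vectors, dual_coefs, intercept, gamma))
print("gradient:", grad)
print("R-group contributions:", contrib)
```

Because the gradient is evaluated at the query compound, the resulting contributions are local to that compound. This differs from Free-Wilson analysis, which assigns a single global contribution to each R-group.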