AutoQSAR: an automated machine learning tool for best-practice quantitative structure-activity relationship modeling

Cited by: 117
Authors
Dixon, Steven L. [1]
Duan, Jianxin [2]
Smith, Ethan [3]
Von Bargen, Christopher D. [1]
Sherman, Woody [1]
Repasky, Matthew P. [3]
Affiliations
[1] Schrodinger Inc, 120 West 45th St, New York, NY 10036 USA
[2] Schrodinger GmbH, Dynamostr 13, D-68165 Mannheim, Baden Wurttembe, Germany
[3] Schrodinger Inc, 101 SW Main St, Portland, OR 97204 USA
Keywords
binding affinity prediction; blood-brain barrier permeability; carcinogenicity; fish bioconcentration factor; mutagenicity; QSAR; solubility; QSAR MODEL; PREDICTION; 2D; FINGERPRINTS; VALIDATION;
DOI
10.4155/fmc-2016-0093
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Aim: We introduce AutoQSAR, an automated machine-learning application to build, validate and deploy quantitative structure-activity relationship (QSAR) models. Methodology/results: The process of descriptor generation, feature selection and the creation of a large number of QSAR models has been automated into a single workflow within AutoQSAR. The models are built using a variety of machine-learning methods, and each model is scored using a novel approach. Effectiveness of the method is demonstrated through comparison with literature QSAR models using identical datasets for six endpoints: protein-ligand binding affinity, solubility, blood-brain barrier permeability, carcinogenicity, mutagenicity and bioaccumulation in fish. Conclusion: AutoQSAR demonstrates predictive performance similar to or better than published results for four of the six endpoints while requiring minimal human time and expertise.
Pages: 1825-1839
Page count: 15
Related papers
40 in total
[21]   Prediction of hydrophobic (lipophilic) properties of small organic molecules using fragmental methods: An analysis of ALOGP and CLOGP methods [J].
Ghose, AK ;
Viswanadhan, VN ;
Wendoloski, JJ .
JOURNAL OF PHYSICAL CHEMISTRY A, 1998, 102 (21) :3762-3772
[22]   ELECTROTOPOLOGICAL STATE INDEXES FOR ATOM TYPES - A NOVEL COMBINATION OF ELECTRONIC, TOPOLOGICAL, AND VALENCE STATE INFORMATION [J].
HALL, LH ;
KIER, LB .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1995, 35 (06) :1039-1045
[23]   Improved naive Bayesian modeling of numerical data for absorption, distribution, metabolism and excretion (ADME) property prediction [J].
Klon, Anthony E. ;
Lowrie, Jeffrey F. ;
Diller, David J. .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2006, 46 (05) :1945-1956
[24]   Synthesis of many different types of organic small molecules using one automated process [J].
Li, Junqi ;
Ballmer, Steven G. ;
Gillis, Eric P. ;
Fujii, Seiko ;
Schmidt, Michael J. ;
Palazzolo, Andrea M. E. ;
Lehmann, Jonathan W. ;
Morehouse, Greg F. ;
Burke, Martin D. .
SCIENCE, 2015, 347 (6227) :1221-1226
[25]   Assessment and validation of the CAESAR predictive model for bioconcentration factor (BCF) in fish [J].
Lombardo, Anna ;
Roncaglioni, Alessandra ;
Boriani, Elena ;
Milan, Chiara ;
Benfenati, Emilio .
CHEMISTRY CENTRAL JOURNAL, 2010, 4
[26]  
Piegorsch WW, 1991, MEASURING INTRAASSAY
[27]   Predictivity of Simulated ADME AutoQSAR Models over Time [J].
Rodgers, Sarah L. ;
Davis, Andrew M. ;
Tomkinson, Nick P. ;
van de Waterbeemd, Han .
MOLECULAR INFORMATICS, 2011, 30 (2-3) :256-266
[28]   Using extended-connectivity fingerprints with Laplacian-modified Bayesian analysis in high-throughput screening follow-up [J].
Rogers, D ;
Brown, RD ;
Hahn, M .
JOURNAL OF BIOMOLECULAR SCREENING, 2005, 10 (07) :682-686
[29]   Large-Scale Systematic Analysis of 2D Fingerprint Methods and Parameters to Improve Virtual Screening Enrichments [J].
Sastry, Madhavi ;
Lowrie, Jeffrey F. ;
Dixon, Steven L. ;
Sherman, Woody .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2010, 50 (05) :771-784
[30]   AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment [J].
Stalring, Jonna C. ;
Carlsson, Lars A. ;
Almeida, Pedro ;
Boyer, Scott .
JOURNAL OF CHEMINFORMATICS, 2011, 3