Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery

Cited by: 109
Authors
Bosc, Nicolas [1]
Atkinson, Francis [1]
Felix, Eloy [1]
Gaulton, Anna [1]
Hersey, Anne [1]
Leach, Andrew R. [1]
Affiliations
[1] European Bioinformat Inst EMBL EBI, Chemogen Team, Wellcome Genome Campus, Cambridge CB10 1SD, England
Funding
European Union Horizon 2020; Wellcome Trust (UK);
Keywords
QSAR; Mondrian conformal prediction; ChEMBL; Classification models; Cheminformatics; APPLICABILITY DOMAIN; CLASSIFICATION; DATABASE; CHEMICALS; DESIGN;
DOI
10.1186/s13321-018-0325-4
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Structure-activity relationship modelling is frequently used in the early stage of drug discovery to assess the activity of a compound on one or several targets, and can also be used to assess the interaction of compounds with liability targets. QSAR models have been used for these and related applications over many years, with good success. Conformal prediction is a relatively new QSAR approach that provides information on the certainty of a prediction, and so helps in decision-making. However, it is not always clear how best to make use of this additional information. In this article, we describe a case study that directly compares conformal prediction with traditional QSAR methods for large-scale predictions of target-ligand binding. The ChEMBL database was used to extract a data set comprising data from 550 human protein targets with different bioactivity profiles. For each target, a QSAR model and a conformal predictor were trained and their results compared. The models were then evaluated on new data published since the original models were built to simulate a real-world application. The comparative study highlights the similarities between the two techniques but also some differences that it is important to bear in mind when the methods are used in practical drug discovery applications.
Pages: 16