Benchmarking methods and data sets for ligand enrichment assessment in virtual screening

Cited by: 40
Authors
Xia, Jie [1, 2]
Tilahun, Ermias Lemma [2]
Reid, Terry-Elinor [2]
Zhang, Liangren [1]
Wang, Xiang Simon [2]
Affiliations
[1] Peking Univ, Sch Pharmaceut Sci, State Key Lab Nat & Biomimet Drugs, Beijing 100191, Peoples R China
[2] Howard Univ, Coll Pharm, Dist Columbia Dev Ctr AIDS Res DC D CFAR, Mol Mode, Dept Pharmaceut Sci, Lab Cheminformat & Drug Desig, Washington, DC 20059 USA
Funding
National Institutes of Health (USA);
Keywords
Benchmarking methodology; Decoy sets; Structure-based virtual screening; Ligand-based virtual screening; Artificial enrichment; Analogue bias; SCORING FUNCTIONS; MOLECULAR DOCKING; DRUG DISCOVERY; ACCURATE DOCKING; PERFORMANCE; OPTIMIZATION; STRATEGIES; SELECTION; AFFINITY; SHAPE;
DOI
10.1016/j.ymeth.2014.11.015
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate the ligand enrichment of VS approaches in prospective (i.e., real-world) efforts. However, the intrinsic differences between benchmarking sets and real screening chemical libraries can bias the assessment. Herein, we summarize the history of benchmarking methods and data sets and highlight three main types of bias found in benchmarking sets: "analogue bias", "artificial enrichment" and "false negatives". In addition, we introduce our recent algorithm for building maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its application to three important human histone deacetylase (HDAC) isoforms: HDAC1, HDAC6 and HDAC8. Leave-one-out cross-validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased as measured by property matching, ROC curves and AUCs. © 2014 Elsevier Inc. All rights reserved.
Pages: 146-157
Number of pages: 12