Developing an in silico pipeline for faster drug candidate discovery: Virtual high throughput screening with the Signature molecular descriptor using support vector machine models

Cited by: 31
Authors
Chen, Jonathan Jun Feng [1]
Visco, Donald Patrick, Jr. [2]
Affiliations
[1] Univ Akron, Dept Biol, 302 Buchtel Common, Akron, OH 44325 USA
[2] Univ Akron, Dept Chem & Biomol Engn, 302 Buchtel Common, Akron, OH 44325 USA
Keywords
Virtual high throughput screening; QSAR; Drug discovery; CAMD; Signature; EXTENDED VALENCE SEQUENCES; FACTOR XIA INHIBITORS; RECEPTOR FLEXIBILITY; FORCE-FIELD; DESIGN; SELECTION; PROTEIN; FINGERPRINTS; METHODOLOGY; PERSPECTIVE;
DOI
10.1016/j.ces.2016.02.037
Chinese Library Classification (CLC) number
TQ [Chemical Industry];
Discipline classification code
0817;
Abstract
Drug candidates make up a small portion of all possible compounds. To identify candidates, traditional drug discovery methods such as high-throughput screening test compound libraries against the target of interest. However, traditional high-throughput screening typically has low efficiency, identifying < 1% of the tested compounds as candidates, and is costly because the majority of resources are spent testing compounds that are inactive towards the target of interest. To increase high-throughput screening efficiency, virtual high-throughput screening emerged as a way to focus compound libraries by removing unpromising drug candidates before bench-top testing is started. Virtual screens are usually based on the energetics of a ligand-target complex, classification based on known ligands, or a combination of the two. We propose a new ligand-based pipeline to reduce cost and increase efficiency: given a set of experimental data, the pipeline develops QSARs in the form of predictive SVM models and applies the models to virtually screen compound databases. The models obtained are based on a fragmental descriptor called Signature, which has previously been shown to be useful in virtual high-throughput screens. As a proof of concept, we used our pipeline to identify inhibitors for Cathepsin L, a receptor implicated in viral disease pathways. Our first-pass virtual screen identified 16 compounds, 3 of which were experimentally confirmed as active, for a hit rate of 19%. Using the experimental data from the first pass, we retrained the models to refine their predictive ability. Our second-pass virtual screen identified 12 compounds, 9 of which were experimentally confirmed as active, for a hit rate of 75%. (C) 2016 Elsevier Ltd. All rights reserved.
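The hit rates quoted in the abstract follow from simple arithmetic: confirmed actives divided by compounds selected by the virtual screen. The sketch below only reproduces that calculation from the counts given in the abstract; the function name `hit_rate` is a hypothetical helper introduced here for illustration and is not part of the authors' pipeline.

```python
def hit_rate(confirmed_active: int, selected: int) -> float:
    """Return the hit rate in percent: confirmed actives / compounds selected."""
    if selected <= 0:
        raise ValueError("selected must be a positive number of compounds")
    if not 0 <= confirmed_active <= selected:
        raise ValueError("confirmed_active must lie between 0 and selected")
    return 100.0 * confirmed_active / selected


# Counts taken from the abstract.
first_pass = hit_rate(3, 16)    # 18.75, reported as 19% after rounding
second_pass = hit_rate(9, 12)   # 75.0

print(f"First pass:  {first_pass:.2f}%")
print(f"Second pass: {second_pass:.2f}%")
```

For comparison, the abstract puts the hit rate of traditional high-throughput screening at under 1%.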
Pages: 31-42
Page count: 12
Related papers
55 in total
[1]   Applying support vector machines to imbalanced datasets [J].
Akbani, R ;
Kwek, S ;
Japkowicz, N .
MACHINE LEARNING: ECML 2004, PROCEEDINGS, 2004, 3201 :39-50
[2]   Benchmarking Study of Parameter Variation When Using Signature Fingerprints Together with Support Vector Machines [J].
Alvarsson, Jonathan ;
Eklund, Martin ;
Andersson, Claes ;
Carlsson, Lars ;
Spjuth, Ola ;
Wikberg, Jarl E. S. .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2014, 54 (11) :3211-3217
[3]   Ligand-Based Target Prediction with Signature Fingerprints [J].
Alvarsson, Jonathan ;
Eklund, Martin ;
Engkvist, Ola ;
Spjuth, Ola ;
Carlsson, Lars ;
Wikberg, Jarl E. S. ;
Noeske, Tobias .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2014, 54 (10) :2647-2653
[4]  
[Anonymous], 2004, R J STAT SOFT, DOI 10.18637/JSS.V011.I09
[5]   Strategies for learning in class imbalance problems [J].
Barandela, R ;
Sánchez, JS ;
García, V ;
Rangel, E .
PATTERN RECOGNITION, 2003, 36 (03) :849-851
[6]   Molecular similarity searching using atom environments, information-based feature selection, and a naive Bayesian classifier [J].
Bender, A ;
Mussa, HY ;
Glen, RC ;
Reiling, S .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2004, 44 (01) :170-178
[7]   Hit and lead generation: Beyond high-throughput screening [J].
Bleicher, KH ;
Böhm, HJ ;
Müller, K ;
Alanine, AI .
NATURE REVIEWS DRUG DISCOVERY, 2003, 2 (05) :369-378
[8]  
Bohacek RS, 1996, MED RES REV, V16, P3, DOI 10.1002/(SICI)1098-1128(199601)16:1<3::AID-MED1>3.0.CO;2-6
[10]   A Novel Methodology for Property-Based Molecular Design Using Multiple Topological Indices [J].
Chemmangattuvalappil, Nishanth G. ;
Eden, Mario R. .
INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2013, 52 (22) :7090-7103