Developing an in silico pipeline for faster drug candidate discovery: Virtual high throughput screening with the Signature molecular descriptor using support vector machine models

Cited by: 31
Authors
Chen, Jonathan Jun Feng [1]
Visco, Donald Patrick, Jr. [2]
Affiliations
[1] Univ Akron, Dept Biol, 302 Buchtel Common, Akron, OH 44325 USA
[2] Univ Akron, Dept Chem & Biomol Engn, 302 Buchtel Common, Akron, OH 44325 USA
Keywords
Virtual high throughput screening; QSAR; Drug discovery; CAMD; Signature; EXTENDED VALENCE SEQUENCES; FACTOR XIA INHIBITORS; RECEPTOR FLEXIBILITY; FORCE-FIELD; DESIGN; SELECTION; PROTEIN; FINGERPRINTS; METHODOLOGY; PERSPECTIVE;
DOI
10.1016/j.ces.2016.02.037
Chinese Library Classification (CLC) number
TQ [Chemical industry]
Subject classification code
0817
Abstract
Drug candidates make up a small portion of all possible compounds. To identify the candidates, traditional drug discovery methods like high-throughput screening test compound libraries against the target of interest. However, traditional high-throughput screening typically has low efficiency, identifying < 1% of the tested compounds as candidates, and is costly because the majority of resources are spent testing compounds that are inactive towards the target of interest. To increase high-throughput screening efficiency, virtual high-throughput screening emerged as a way to focus compound libraries by removing unpromising drug candidates before bench-top testing starts. Virtual screens are usually based on the energetics of a ligand-target complex, classification based on known ligands, or a combination of the two. We propose a new ligand-based pipeline to reduce cost and increase efficiency: given a set of experimental data, the pipeline develops QSARs in the form of predictive SVM models and applies the models to virtually screen compound databases. The models obtained are based on a fragmental descriptor called Signature, which has previously been shown to be useful in virtual high-throughput screens. As a proof of concept, we used our pipeline to identify inhibitors for Cathepsin L, a receptor implicated in viral disease pathways. Our first-pass virtual screen identified 16 compounds, 3 of which were experimentally confirmed as active, for a hit rate of 19%. Using the experimental data from the first pass, we retrained the models to refine their predictive ability. Our second-pass virtual screen identified 12 compounds, 9 of which were experimentally confirmed as active, for a hit rate of 75%. (C) 2016 Elsevier Ltd. All rights reserved.
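The hit rates quoted in the abstract follow directly from the reported counts. The short Python sketch below reproduces that arithmetic; the function name `hit_rate` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
def hit_rate(confirmed_active: int, predicted: int) -> float:
    """Fraction of virtually selected compounds confirmed active at the bench.

    Hypothetical helper for illustration; not part of the authors' pipeline.
    """
    if predicted <= 0:
        raise ValueError("predicted must be a positive count")
    return confirmed_active / predicted


# Counts as reported in the abstract.
first_pass = hit_rate(3, 16)   # 3 of 16 compounds confirmed active
second_pass = hit_rate(9, 12)  # 9 of 12 compounds confirmed active

# Whole-percent values, as the abstract reports them.
print(f"first pass:  {round(first_pass * 100)}%")
print(f"second pass: {round(second_pass * 100)}%")
```

The first-pass value is 3/16 = 18.75%, which the abstract rounds to 19%; the second-pass value is exactly 9/12 = 75%.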
Pages: 31-42
Page count: 12