Pros and cons of virtual screening based on public "Big Data": In silico mining for new bromodomain inhibitors

Cited by: 13
Authors
Casciuc, Iuri [1]
Horvath, Dragos [1]
Gryniukova, Anastasiia [2]
Tolmachova, Kateryna A. [3, 4]
Vasylchenko, Oleksandr V. [3]
Borysko, Petro [2]
Moroz, Yurii S. [5, 6]
Bajorath, Juergen [7]
Varnek, Alexandre [1]
Affiliations
[1] Univ Strasbourg, Fac Chem, Lab Chemoinformat, 4 Blaise Pascal Str, F-67081 Strasbourg, France
[2] Bienta Enamine Ltd, Chervonotkatska St 78, UA-02094 Kiev, Ukraine
[3] Enamine Ltd, Chervonotkatska St 78, UA-02094 Kiev, Ukraine
[4] NAS Ukraine, Inst Bioorgan Chem & Petrochem, Murmanska St `, UA-02660 Kiev, Ukraine
[5] Natl Taras Shevchenko Univ Kyiv, Volodymyrska St 60, UA-01601 Kiev, Ukraine
[6] Chemspace, Ilukstes Iela 38-5, LV-1082 Riga, Latvia
[7] Univ Bonn, Unit Chem Biol & Med Chem, Limes, B IT, Bonn, Germany
Keywords
Bromodomain BRD4 binders; Generative topographic mapping; Virtual screening; Classification models; Ligand-based pharmacophores; Docking; PROTEIN-LIGAND ENTITIES; S4MPLE-SAMPLER; FRAGMENT; DOCKING; ISIDA;
DOI
10.1016/j.ejmech.2019.01.010
Chinese Library Classification number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
The Virtual Screening (VS) study described herein aimed at detecting novel Bromodomain BRD4 binders and relied on knowledge from public databases (ChEMBL, REAXYS) to establish a battery of predictive models of BRD activity for in silico selection of putative ligands. Beyond the actual discovery of new BRD ligands, this represented an opportunity to estimate in practice how useful public domain "Big Data" is for robust predictive model building. The resulting models were used to virtually screen 2 million compounds from the Enamine company collection. This industrial partner then experimentally screened a subset of 2992 molecules selected by the VS procedure for their high likelihood of being active. Twenty-nine confirmed hits were detected after experimental testing, representing 1% of the selected candidates. As a general conclusion, this study emphasizes once more that public structure-activity databases are nowadays key assets in drug discovery. Their usefulness is, however, limited by the state-of-the-art knowledge harvested so far by published studies. Target-specific structure-activity information is rarely rich enough, and its heterogeneity makes it extremely difficult to exploit in rational drug design. Furthermore, the published affinity measures used to build the models that select compounds for experimental screening may not be well correlated with the experimental hit selection criterion (in practice, often imposed by equipment constraints). Nevertheless, a robust 2.6-fold increase in hit rate with respect to an equivalent random screening campaign showed that machine learning is able to extract some real knowledge in spite of all the noise in structure-activity data. © 2019 Elsevier Masson SAS. All rights reserved.
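The hit-rate figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts (29 hits, 2992 tested) and the 2.6-fold enrichment come from the abstract, while the random-screening baseline is not stated there and is back-calculated here under the assumption that enrichment is simply the ratio of the two hit rates. The function names are hypothetical, and the authors may have derived their baseline differently.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Fraction of experimentally tested compounds confirmed as hits."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return hits / tested


def implied_baseline(rate: float, enrichment: float) -> float:
    """Baseline hit rate implied by an enrichment factor.

    Assumption (not from the abstract): enrichment = VS rate / random rate.
    """
    if enrichment <= 0:
        raise ValueError("enrichment must be positive")
    return rate / enrichment


# Values reported in the abstract.
CONFIRMED_HITS = 29
TESTED = 2992
ENRICHMENT = 2.6

vs_rate = hit_rate(CONFIRMED_HITS, TESTED)
# Back-calculated, not reported: the random hit rate consistent with 2.6-fold.
random_rate = implied_baseline(vs_rate, ENRICHMENT)
# Hits a random selection of the same size would be expected to yield.
expected_random_hits = random_rate * TESTED

print(f"VS hit rate:              {vs_rate:.2%}")
print(f"Implied random hit rate:  {random_rate:.2%}")
print(f"Expected random hits:     {expected_random_hits:.1f} of {TESTED}")
```

Under that assumption, the reported "1%" is the rounded value of 29/2992, and a random selection of 2992 compounds would be expected to yield roughly 11 hits rather than 29.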
Pages: 258-272
Number of pages: 15