Applications of Deep-Learning in Exploiting Large-Scale and Heterogeneous Compound Data in Industrial Pharmaceutical Research

Cited by: 34
Authors
David, Laurianne [1 ,2 ]
Arus-Pous, Josep [1 ,3 ]
Karlsson, Johan [4 ]
Engkvist, Ola [1 ]
Bjerrum, Esben Jannik [1 ]
Kogej, Thierry [1 ]
Kriegl, Jan M. [5 ]
Beck, Bernd [5 ]
Chen, Hongming [1 ,6 ]
Affiliations
[1] AstraZeneca, Biopharmaceut R&D, Discovery Sci, Hit Discovery, Gothenburg, Sweden
[2] Rhein Friedrich Wilhelms Univ Bonn, Dept Life Sci Informat, B IT, Bonn, Germany
[3] Univ Bern, Dept Chem & Biochem, Bern, Switzerland
[4] AstraZeneca, Biopharmaceut R&D, Discovery Sci, Quantitat Biol, Gothenburg, Sweden
[5] Boehringer Ingelheim Pharma GmbH & Co KG, Dept Med Chem, Biberach, Germany
[6] Chem & Chem Biol Ctr, Guangzhou Regenerat Med & Hlth Guangdong Lab, Guangzhou, Guangdong, Peoples R China
Funding
European Union Horizon 2020;
Keywords
Artificial intelligence; deep learning; Chemogenomics; Large-scale data; pharmaceutical industry; INTERFERENCE COMPOUNDS PAINS; HUMAN-GENOME-PROJECT; DRUG DISCOVERY; ASSAY INTERFERENCE; SCREENING LIBRARIES; TARGET PREDICTION; MICROSCOPY IMAGES; CONNECTIVITY MAP; SMALL MOLECULES; DESIGN;
DOI
10.3389/fphar.2019.01303
Chinese Library Classification number
R9 [Pharmacy];
Discipline classification code
1007 ;
Abstract
In recent years, the development of high-throughput screening (HTS) technologies and their establishment in an industrialized environment have enabled scientists to test millions of molecules and profile them against a multitude of biological targets in a short period of time, generating data at a much faster pace and with higher quality than before. Besides structure-activity data from traditional bioassays, more complex assays such as transcriptomics profiling or imaging have also been established as routine profiling experiments thanks to advances in next-generation sequencing and automated microscopy technologies. In industrial pharmaceutical research, these technologies are typically established in conjunction with automated platforms to enable efficient handling of screening collections of thousands to millions of compounds. To exploit the ever-growing amount of data generated by these approaches, computational techniques are constantly evolving. In this regard, artificial intelligence technologies such as deep learning and machine learning methods play a key role in cheminformatics and bio-image analytics, where they address activity prediction, scaffold hopping, de novo molecule design, reaction/retrosynthesis prediction, and high-content screening analysis. Herein we summarize the current state of large-scale compound data analysis in industrial pharmaceutical research and describe the impact it has had on the drug discovery process over the last two decades, with a specific focus on deep-learning technologies.
Pages: 16