FSM-DDTR: End-to-end feedback strategy for multi-objective De Novo drug design using transformers

Cited by: 7
Authors
Monteiro, Nelson R. C. [1]
Pereira, Tiago O. [1]
Machado, Ana Catarina D. [1]
Oliveira, Jose L. [2]
Abbasi, Maryam [1, 3]
Arrais, Joel P. [1]
Affiliations
[1] Univ Coimbra, Dept Informat Engn, Ctr Informat & Syst, Coimbra, Portugal
[2] Univ Aveiro, Dept Elect Telecommun & Informat, IEETA, Aveiro, Portugal
[3] Polytech Inst Coimbra, Appl Res Inst, Coimbra, Portugal
Keywords
Drug design; SMILES; Deep learning; Transformer; Multi-objective optimization; DISCOVERY;
DOI
10.1016/j.compbiomed.2023.107285
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
The design of compounds that target specific biological functions with relevant selectivity is critical in the context of drug discovery, especially due to the polypharmacological nature of most existing drug molecules. In recent years, in silico-based methods combined with deep learning have shown promising results in the de novo drug design challenge, leading to potential leads for biologically interesting targets. However, several of these methods overlook the importance of certain properties, such as validity rate and target selectivity, or simplify the generative process by neglecting the multi-objective nature of the pharmacological space. In this study, we propose a multi-objective Transformer-based architecture to generate drug candidates with desired molecular properties and increased selectivity toward a specific biological target. The framework consists of a Transformer-Decoder Generator that generates novel and valid compounds in the SMILES format notation, a Transformer-Encoder Predictor that estimates the binding affinity toward the biological target, and a feedback loop combined with a multi-objective optimization strategy to rank the generated molecules and condition the generating distribution around the targeted properties. The results demonstrate that the proposed architecture can generate novel and synthesizable small compounds with desired pharmacological properties toward a biologically relevant target. The unbiased Transformer-based Generator achieved superior performance in the novelty rate (97.38%) and comparable performance in terms of internal diversity, uniqueness, and validity against state-of-the-art baselines. The optimization of the unbiased Transformer-based Generator resulted in the generation of molecules exhibiting high binding affinity toward the Adenosine A2A Receptor (AA2AR) and possessing desirable physicochemical properties, where 99.36% of the generated molecules follow Lipinski's rule of five. 
Furthermore, the implementation of a feedback strategy, in conjunction with a multi-objective algorithm, effectively shifted the distribution of the generated molecules toward optimal values of molecular weight, molecular lipophilicity, topological polar surface area, synthetic accessibility score, and quantitative estimate of drug-likeness, without the necessity of prior training sets comprising molecules endowed with pharmacological properties of interest. Overall, this research study validates the applicability of a Transformer-based architecture in the context of drug design, capable of exploring the vast chemical representation space to generate novel molecules with improved pharmacological properties and target selectivity. The data and source code used in this study are available at: https://github.com/larngroup/FSM-DDTR.
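The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the FSM-DDTR repository: the abstract does not say which multi-objective algorithm is used, so non-dominated (Pareto) sorting is assumed here as one common choice. The function names, the descriptor keys (`mw`, `logp`, `hbd`, `hba`), and the example scores are hypothetical, and descriptors are assumed to be precomputed elsewhere.

```python
from typing import Dict, List, Sequence


def lipinski_violations(mol: Dict[str, float]) -> int:
    """Count rule-of-five violations from precomputed descriptors.

    The dictionary keys are hypothetical names chosen for this sketch.
    """
    return sum([
        mol["mw"] > 500,    # molecular weight above 500 Da
        mol["logp"] > 5,    # lipophilicity (logP) above 5
        mol["hbd"] > 5,     # more than 5 hydrogen-bond donors
        mol["hba"] > 10,    # more than 10 hydrogen-bond acceptors
    ])


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one. All objectives are assumed to be maximised."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(scores: List[Sequence[float]]) -> List[int]:
    """Assign each point a front index (0 = non-dominated) by repeatedly
    peeling off the points that nothing remaining dominates."""
    remaining = set(range(len(scores)))
    ranks = [0] * len(scores)
    front = 0
    while remaining:
        current = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in current:
            ranks[i] = front
        remaining -= current
        front += 1
    return ranks


# Hypothetical scores per molecule: (predicted affinity, drug-likeness),
# both scaled so that larger is better.
example_scores = [(0.9, 0.8), (0.5, 0.9), (0.4, 0.4)]
example_ranks = pareto_ranks(example_scores)
```

In a feedback loop of the kind the abstract outlines, molecules on the lowest-numbered fronts would be the natural candidates to feed back into generator training; how the authors actually select and reuse them is not stated in this record.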
Pages: 17