Exploring Fragment Adding Strategies to Enhance Molecule Pretraining in AI-Driven Drug Discovery

Cited by: 5
Authors
Meng, Zhaoxu [1]
Chen, Cheng [2]
Zhang, Xuan [2]
Zhao, Wei [3]
Cui, Xuefeng [2]
Affiliations
[1] Shandong Univ, Sch Life Sci, Qingdao 266237, Peoples R China
[2] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266237, Peoples R China
[3] Shandong Univ, State Key Lab Microbial Technol, Qingdao 266237, Peoples R China
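Source
BIG DATA MINING AND ANALYTICS | 2024 / Vol. 7 / Issue 03
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords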
Drugs; Task analysis; Databases; Training; Chemicals; Vocabulary; Fingerprint recognition; pretraining; information retrieval; drug discovery; virtual screening; molecule property prediction;
DOI
10.26599/BDMA.2024.9020003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The effectiveness of AI-driven drug discovery can be enhanced by pretraining on small molecules. However, the conventional masked language model pretraining techniques are not suitable for molecule pretraining due to the limited vocabulary size and the non-sequential structure of molecules. To overcome these challenges, we propose FragAdd, a strategy that involves adding a chemically implausible molecular fragment to the input molecule. This approach allows for the incorporation of rich local information and the generation of a high-quality graph representation, which is advantageous for tasks like virtual screening. Consequently, we have developed a virtual screening protocol that focuses on identifying estrogen receptor alpha binders on a nucleus receptor. Our results demonstrate a significant improvement in the binding capacity of the retrieved molecules. Additionally, we demonstrate that the FragAdd strategy can be combined with other self-supervised methods to further expedite the drug discovery process.
Pages: 565-576
Number of pages: 12