Molecular pretraining models towards molecular property prediction

Cited by: 0
Authors
Qiao, Jianbo [1]
Gao, Wenjia [1]
Jin, Junru [1]
Wang, Ding [1]
Guo, Xu [1]
Manavalan, Balachandran [2]
Wei, Leyi [3, 4]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250101, Peoples R China
[2] Sungkyunkwan Univ, Coll Biotechnol & Bioengn, Dept Integrat Biotechnol, Suwon 16419, South Korea
[3] Macao Polytech Univ, Fac Appl Sci, Macau 999078, Peoples R China
[4] Shandong Univ, Joint SDU NTU Ctr Artificial Intelligence Res C FA, Jinan 250101, Peoples R China
Funding
National Research Foundation, Singapore; National Natural Science Foundation of China
Keywords
molecular pretraining models; molecular property prediction; graph neural network (GNN); graph Transformer; PubChem; MoleculeNet; database
DOI
10.1007/s11432-024-4457-2
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Molecular property prediction plays a pivotal role in advancing our understanding of molecular representations, serving as a key driver for progress in drug discovery. Leveraging deep learning to gain comprehensive insights into molecular properties has become increasingly critical. Recent breakthroughs in molecular property prediction have been achieved through molecular pretraining models, which utilize large-scale databases of unlabeled molecules for pretraining, followed by fine-tuning for specific downstream tasks. These models enable a deeper understanding of molecular properties. In this study, we review recent advancements in molecular property prediction using molecular pretraining models. Our focus includes molecular descriptors, the impact of pretraining dataset size, molecular characterization model architectures, and the diversity of pretraining task types. Additionally, we compare the performance of existing methods and propose future directions to enhance the effectiveness of molecular pretraining models.
Pages: 19