MolBench: A Benchmark of AI Models for Molecular Property Prediction

Times cited: 0
Authors
Jiang, Xiuyu [1]
Tan, Liqin [1]
Cen, Jianhuan [1]
Zou, Qingsong [1, 2]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou 510006, Peoples R China
Source
BENCHMARKING, MEASURING, AND OPTIMIZING, BENCH 2023 | 2024 / Vol. 14521
Funding
National Natural Science Foundation of China
Keywords
Molecular Property Prediction; Metric; Bench; MoleculeNet; FREE-ENERGIES; DATABASE;
DOI
10.1007/978-981-97-0316-6_4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, demand has grown for predicting complex molecular properties in drug design, materials science, and biotechnology. Compared with traditional laboratory methods, deep learning offers many advantages, notably large savings in time and money. Deep learning has achieved revolutionary success in predicting molecular properties, and many deep-learning-based models have been developed in this field. However, reliable and multidimensional benchmarks for evaluating these artificial intelligence (AI) models are still lacking. In this paper, we develop a general method to evaluate AI models for predicting molecular properties. More precisely, we design multiple evaluation metrics based on the MoleculeNet datasets and introduce an extensible API to benchmark three types of AI models: molecular-fingerprint-based models, graph-based models, and pre-trained models. The purpose of this work is to establish a fair and reliable benchmark for future innovation in molecular property prediction, emphasizing the importance of multidimensional perspectives.
Pages: 53-70
Page count: 18