Relationship between prediction accuracy and uncertainty in compound potency prediction using deep neural networks and control models

Cited by: 2

Authors:

Roth, Jannik P. [1]

Bajorath, Juergen [1]

Affiliations:

[1] Rhein Friedrich Wilhelms Univ, Dept Life Sci Informat & Data Sci, LIMES Program Unit Chem Biol & Med Chem, B IT, Friedrich Hirzebruch Allee 5-6, D-53115 Bonn, Germany

Source:

SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01

Keywords:

Uncertainty quantification; Machine learning; Compound potency prediction; Prediction accuracy; QUANTIFICATION; RULES;

DOI:

10.1038/s41598-024-57135-6

Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

The assessment of prediction variance or uncertainty contributes to the evaluation of machine learning models. In molecular machine learning, uncertainty quantification is an evolving area of research where currently no standard approaches or general guidelines are available. We have carried out a detailed analysis of deep neural network variants and simple control models for compound potency prediction to study relationships between prediction accuracy and uncertainty. For comparably accurate predictions obtained with models of different complexity, highly variable prediction uncertainties were detected using different metrics. Furthermore, a strong dependence of prediction characteristics and uncertainties on potency levels of test compounds was observed, often leading to over- or under-confident model decisions with respect to the expected variance of predictions. Moreover, neural network models responded very differently to training set modifications. Taken together, our findings indicate that there is little, if any, correlation between compound potency prediction accuracy and uncertainty, especially for deep neural network models, when predictions are assessed on the basis of currently used metrics for uncertainty quantification.

Pages: 12

