Relationship between prediction accuracy and uncertainty in compound potency prediction using deep neural networks and control models

Cited by: 2
Authors
Roth, Jannik P. [1]
Bajorath, Juergen [1]
Affiliations
[1] Rhein Friedrich Wilhelms Univ, Dept Life Sci Informat & Data Sci, LIMES Program Unit Chem Biol & Med Chem, B IT, Friedrich Hirzebruch Allee 5-6, D-53115 Bonn, Germany
Keywords
Uncertainty quantification; Machine learning; Compound potency prediction; Prediction accuracy; QUANTIFICATION; RULES;
DOI
10.1038/s41598-024-57135-6
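Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy, Earth Sciences]; Q [Biological Sciences]; N [Natural Sciences, General];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract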
The assessment of prediction variance or uncertainty contributes to the evaluation of machine learning models. In molecular machine learning, uncertainty quantification is an evolving area of research where currently no standard approaches or general guidelines are available. We have carried out a detailed analysis of deep neural network variants and simple control models for compound potency prediction to study relationships between prediction accuracy and uncertainty. For comparably accurate predictions obtained with models of different complexity, highly variable prediction uncertainties were detected using different metrics. Furthermore, a strong dependence of prediction characteristics and uncertainties on potency levels of test compounds was observed, often leading to over- or under-confident model decisions with respect to the expected variance of predictions. Moreover, neural network models responded very differently to training set modifications. Taken together, our findings indicate that there is little, if any, correlation between compound potency prediction accuracy and uncertainty, especially for deep neural network models, when predictions are assessed on the basis of currently used metrics for uncertainty quantification.
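One way to make the accuracy-versus-uncertainty question concrete is to rank-correlate per-compound absolute errors with per-compound uncertainty estimates. The sketch below is illustrative only and is not the authors' protocol: it assumes uncertainty is taken as the standard deviation across an ensemble of models, and it uses Spearman rank correlation as the comparison measure. The function names, data layout, and toy potency values are all hypothetical.

```python
from statistics import mean, pstdev


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (both non-constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def error_uncertainty_correlation(ensemble_preds, y_true):
    """Correlate absolute error with ensemble spread.

    ensemble_preds: one list of predicted potencies per ensemble member
    (assumed layout); y_true: measured potencies in the same compound order.
    """
    per_compound = list(zip(*ensemble_preds))
    mean_pred = [mean(p) for p in per_compound]
    uncertainty = [pstdev(p) for p in per_compound]  # assumed uncertainty proxy
    abs_error = [abs(m - t) for m, t in zip(mean_pred, y_true)]
    return spearman(abs_error, uncertainty)


# Toy data (invented for illustration): two ensemble members, three compounds.
y_true = [5.0, 6.0, 7.0]
ensemble_preds = [
    [5.1, 6.4, 8.0],
    [5.1, 6.0, 6.6],
]
rho = error_uncertainty_correlation(ensemble_preds, y_true)
print(f"Spearman rho between |error| and uncertainty: {rho:.2f}")
```

A value of rho near 1 would mean that larger errors coincide with larger estimated uncertainty, while a value near 0 would correspond to the weak relationship the abstract reports. The toy data here are constructed to be monotonic and say nothing about real models.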
Pages: 12