Traditional Machine and Deep Learning for Predicting Toxicity Endpoints

Cited by: 3
Author
Norinder, Ulf [1]
Affiliation
[1] Stockholm Univ, Dept Comp & Syst Sci, S-16407 Kista, Sweden
Source
MOLECULES | 2023 / Vol. 28 / Issue 01
Keywords
CATMoS dataset; CDDD; BERT; conformal prediction; random forest; RDKit; LANGUAGE
DOI
10.3390/molecules28010217
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Molecular structure-property modeling is an increasingly important tool for predicting compounds with desired properties, owing to the expensive and resource-intensive nature of drug discovery and development and to the problem of toxicity-related attrition in its late phases. Lately, interest in applying deep learning techniques has increased considerably. This investigation compares approaches ranging from traditional physico-chemical descriptors combined with machine learning, through autoencoder-generated descriptors, to two different descriptor-free, Simplified Molecular Input Line Entry System (SMILES)-based deep learning architectures of the Bidirectional Encoder Representations from Transformers (BERT) type, using the Mondrian aggregated conformal prediction method as the overarching framework. For the binary CATMoS non-toxic and very-toxic datasets, the results show that all methods perform equally well on the former, almost equally balanced, dataset, while on the latter dataset, with an 11-fold difference between the two classes, the MolBERT model, based on a large pre-trained network, performs somewhat better than the rest, with high efficiency for both classes (0.93-0.94) as well as high values for sensitivity, specificity and balanced accuracy (0.86-0.87). The descriptor-free, SMILES-based, deep learning BERT architectures seem capable of producing well-balanced predictive models with defined applicability domains. This work also demonstrates that the class imbalance problem is handled gracefully by Mondrian conformal prediction, without over- and/or under-sampling, class weighting or cost-sensitive methods.
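The class-conditional calibration behind Mondrian conformal prediction can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy calibration scores and the choice of nonconformity score (1 minus the predicted probability of the candidate class) are assumptions made for illustration, and the aggregation over several calibration splits is not shown. The point is that each class is calibrated only against calibration compounds of that same class, so a minority class is not swamped by the majority class.

```python
from typing import Dict, List, Sequence, Set


def mondrian_p_values(
    cal_scores: Dict[int, Sequence[float]],
    test_scores: Dict[int, float],
) -> Dict[int, float]:
    """Class-conditional (Mondrian) conformal p-values for one test compound.

    cal_scores[c]  : nonconformity scores of calibration compounds whose true class is c
    test_scores[c] : nonconformity score of the test compound if it were given class c
    """
    p_values = {}
    for label, alpha in test_scores.items():
        same_class = cal_scores[label]
        # Count same-class calibration compounds that are at least as nonconforming.
        n_ge = sum(1 for s in same_class if s >= alpha)
        p_values[label] = (n_ge + 1) / (len(same_class) + 1)
    return p_values


def prediction_set(p_values: Dict[int, float], significance: float) -> Set[int]:
    """Keep every label whose p-value exceeds the chosen significance level."""
    return {label for label, p in p_values.items() if p > significance}


def efficiency(pred_sets: List[Set[int]]) -> float:
    """Fraction of predictions that contain exactly one label."""
    return sum(1 for s in pred_sets if len(s) == 1) / len(pred_sets)


# Toy, made-up calibration scores: class 0 is the majority, class 1 the minority.
calibration = {
    0: [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60],
    1: [0.30, 0.50, 0.70, 0.90],
}

# Test compound with predicted probability 0.65 for class 1,
# hence nonconformity 0.65 for class 0 and 0.35 for class 1.
p = mondrian_p_values(calibration, {0: 0.65, 1: 0.35})
labels = prediction_set(p, significance=0.2)
print(p, labels)
```

With these toy scores the p-values are 0.1 for class 0 and 0.8 for class 1, so at a significance level of 0.2 the prediction set contains only class 1. A prediction set with both labels or with no label signals a compound the model cannot classify at that confidence level, which is how the framework defines an applicability domain.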
Pages: 10
Related papers
50 in total
  • [21] Evaluation of Machine Learning Classifiers for Predicting Deep Convection
    Ukkonen, Peter
    Makela, Antti
    JOURNAL OF ADVANCES IN MODELING EARTH SYSTEMS, 2019, 11 (06) : 1784 - 1802
  • [22] A comparative study of predicting high entropy alloy phase fractions with traditional machine learning and deep neural networks
    Liu, Shusen
    Bocklund, Brandon
    Diffenderfer, James
    Chaganti, Shreya
    Kailkhura, Bhavya
    McCall, Scott K.
    Gallagher, Brian
    Perron, Aurelien
    McKeown, Joseph T.
    NPJ COMPUTATIONAL MATERIALS, 2024, 10 (01)
  • [23] Systematic approaches to machine learning models for predicting pesticide toxicity
    Anandhi, Ganesan
    Iyapparaja, M.
    HELIYON, 2024, 10 (07)
  • [24] An overview of machine learning and deep learning techniques for predicting epileptic seizures
    Zurdo-Tabernero, Marco
    Canal-Alonso, Angel
    de la Prieta, Fernando
    Rodriguez, Sara
    Prieto, Javier
    Corchado, Juan Manuel
    JOURNAL OF INTEGRATIVE BIOINFORMATICS, 2024, 20 (04)
  • [25] Predicting Dose-Range Chemical Toxicity using Novel Hybrid Deep Machine-Learning Method
    Limbu, Sarita
    Zakka, Cyril
    Dakshanamurthy, Sivanesan
    TOXICS, 2022, 10 (11)
  • [26] Computational models for predicting liver toxicity in the deep learning era
    Mostafa, Fahad
    Chen, Minjun
    FRONTIERS IN TOXICOLOGY, 2024, 5
  • [27] Electrocardiogram Monitoring and Interpretation: From Traditional Machine Learning to Deep Learning, and Their Combination
    Parvaneh, Saman
    Rubin, Jonathan
    2018 COMPUTING IN CARDIOLOGY CONFERENCE (CINC), 2018, 45
  • [28] A Comparison of Deep Learning vs Traditional Machine Learning for Electricity Price Forecasting
    O'Leary, Christian
    Lynch, Conor
    Bain, Rose
    Smith, Gary
    Grimes, Diarmuid
    2021 4TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMPUTER TECHNOLOGIES (ICICT 2021), 2021, : 6 - 12
  • [29] Feature Extraction Based on Deep Learning for Some Traditional Machine Learning Methods
    Cayir, Aykut
    Yenidogan, Isil
    Dag, Hasan
    2018 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2018, : 494 - 497
  • [30] Classifying Multilingual User Feedback using Traditional Machine Learning and Deep Learning
    Stanik, Christoph
    Haering, Marlo
    Maalej, Walid
    2019 IEEE 27TH INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE WORKSHOPS (REW 2019), 2019, : 220 - 226