Exploring Dimensionality Reduction Techniques for Deep Learning Driven QSAR Models of Mutagenicity

Cited by: 3
Authors
Kalian, Alexander D. [1 ]
Benfenati, Emilio [2 ]
Osborne, Olivia J. [3 ]
Gott, David [3 ]
Potter, Claire [3 ]
Dorne, Jean-Lou C. M. [4 ]
Guo, Miao [5 ]
Hogstrand, Christer [6 ]
Affiliations
[1] Kings Coll London, Dept Nutr Sci, Franklin Wilkins Bldg,150 Stamford St, London SE1 9NH, England
[2] Ist Ric Farmacolog Mario Negri IRCCS, Via Mario Negri 2, I-20156 Milan, Italy
[3] Food Stand Agcy, 70 Petty France, London SW1H 9EX, England
[4] European Food Safety Author EFSA, Via Carlo Magno 1A, I-43126 Parma, Italy
[5] Kings Coll London, Dept Engn, Strand Campus, London WC2R 2LS, England
[6] Kings Coll London, Dept Analyt Environm & Forens Sci, Franklin Wilkins Bldg,150 Stamford St, London SE1 9NH, England
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK;
Keywords
QSAR; dimensionality reduction; deep learning; autoencoder; principal component analysis; locally linear embedding; grid search; hyperparameter optimisation; mutagenicity; cheminformatics;
DOI
10.3390/toxics11070572
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Dimensionality reduction techniques are crucial for enabling deep learning driven quantitative structure-activity relationship (QSAR) models to navigate higher dimensional toxicological spaces; however, the choice of a specific technique is often arbitrary and poorly explored. Six dimensionality reduction techniques (both linear and non-linear) were therefore applied to a higher dimensionality mutagenicity dataset and compared in their ability to power a simple deep learning driven QSAR model, following grid searches for optimal hyperparameter values. Comparatively simple linear techniques, such as principal component analysis (PCA), were found to be sufficient for optimal QSAR model performance, which indicated that the original dataset was at least approximately linearly separable (in accordance with Cover's theorem). However, certain non-linear techniques, such as kernel PCA and autoencoders, performed at closely comparable levels while (especially in the case of autoencoders) being more widely applicable to potentially non-linearly separable datasets. Analysis of the chemical space, in terms of XLogP and molecular weight, revealed that the vast majority of testing data fell within the defined applicability domain, and that certain regions were measurably more problematic and impaired performance. There were nonetheless indications that certain dimensionality reduction techniques facilitated uniquely beneficial navigation of the chemical space.
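The workflow summarised in the abstract (reduce dimensionality, then grid-search hyperparameters for a simple neural model) can be illustrated with a minimal sketch. This is not the authors' code: the use of scikit-learn, the synthetic dataset standing in for molecular descriptors, the small `MLPClassifier` standing in for the deep learning model, and every hyperparameter value in the grid are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a high-dimensional descriptor matrix with binary
# labels (assumption: NOT the mutagenicity dataset used in the paper).
X, y = make_classification(
    n_samples=400, n_features=100, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale, reduce with PCA, then classify with a small feed-forward network.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA(random_state=0)),
    ("model", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
])

# Hypothetical grid: number of retained components and L2 penalty strength.
param_grid = {
    "reduce__n_components": [5, 10, 20],
    "model__alpha": [1e-4, 1e-2],
}

# Cross-validated grid search over the whole pipeline, so the PCA is refit
# inside each fold and no test-fold information leaks into the reduction.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Swapping the `"reduce"` step for another reducer, for example `KernelPCA` or `LocallyLinearEmbedding` from scikit-learn, would allow the same grid search to compare linear and non-linear techniques on equal terms.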
Pages: 24