Exploring Dimensionality Reduction Techniques for Deep Learning Driven QSAR Models of Mutagenicity

Cited by: 3
Authors
Kalian, Alexander D. [1 ]
Benfenati, Emilio [2 ]
Osborne, Olivia J. [3 ]
Gott, David [3 ]
Potter, Claire [3 ]
Dorne, Jean-Lou C. M. [4 ]
Guo, Miao [5 ]
Hogstrand, Christer [6 ]
Affiliations
[1] Kings Coll London, Dept Nutr Sci, Franklin Wilkins Bldg,150 Stamford St, London SE1 9NH, England
[2] Ist Ric Farmacolog Mario Negri IRCCS, Via Mario Negri 2, I-20156 Milan, Italy
[3] Food Stand Agcy, 70 Petty France, London SW1H 9EX, England
[4] European Food Safety Author EFSA, Via Carlo Magno 1A, I-43126 Parma, Italy
[5] Kings Coll London, Dept Engn, Strand Campus, London WC2R 2LS, England
[6] Kings Coll London, Dept Analyt Environm & Forens Sci, Franklin Wilkins Bldg,150 Stamford St, London SE1 9NH, England
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC);
Keywords
QSAR; dimensionality reduction; deep learning; autoencoder; principal component analysis; locally linear embedding; grid search; hyperparameter optimisation; mutagenicity; cheminformatics;
DOI
10.3390/toxics11070572
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Dimensionality reduction techniques are crucial for enabling deep learning driven quantitative structure-activity relationship (QSAR) models to navigate higher dimensional toxicological spaces; however, the choice of a specific technique is often arbitrary and poorly explored. Six dimensionality reduction techniques (both linear and non-linear) were therefore applied to a higher dimensionality mutagenicity dataset and compared in their ability to power a simple deep learning driven QSAR model, following grid searches for optimal hyperparameter values. Comparatively simple linear techniques, such as principal component analysis (PCA), were found to be sufficient for optimal QSAR model performance, which indicated that the original dataset was at least approximately linearly separable (in accordance with Cover's theorem). However, certain non-linear techniques, such as kernel PCA and autoencoders, performed at closely comparable levels while (especially in the case of autoencoders) being more widely applicable to potentially non-linearly separable datasets. Analysis of the chemical space, in terms of XLogP and molecular weight, showed that the vast majority of testing data fell within the defined applicability domain, and that certain regions were measurably more problematic and degraded performance. It was, however, indicated that certain dimensionality reduction techniques were able to facilitate uniquely beneficial navigations of the chemical space.
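
The workflow summarised in the abstract (reduce dimensionality, feed a simple neural network, grid-search the hyperparameters) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: it assumes scikit-learn, uses a synthetic dataset in place of the mutagenicity data, compares only two of the six techniques (PCA and kernel PCA), and the step names, network size, and grid values are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a high-dimensional descriptor matrix
# (assumption: NOT the mutagenicity dataset used in the paper).
X, y = make_classification(
    n_samples=300, n_features=100, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale -> reduce dimensionality -> small neural network classifier.
# Step names ("scale", "reduce", "model") are arbitrary labels.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA()),
    ("model", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                            random_state=0)),
])

# Hypothetical grid: swap the reduction technique (linear vs non-linear)
# and vary the number of retained dimensions.
param_grid = {
    "reduce": [PCA(), KernelPCA(kernel="rbf")],
    "reduce__n_components": [5, 20],
}

# 3-fold cross-validated grid search over every combination in the grid.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

best_reducer = type(search.best_params_["reduce"]).__name__
test_accuracy = search.score(X_test, y_test)
print(best_reducer, search.best_params_["reduce__n_components"],
      round(test_accuracy, 3))
```

Treating the reduction step itself as a grid-searched hyperparameter keeps the comparison between techniques on equal footing, since every candidate is scored with the same cross-validation folds and the same downstream model.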
Pages: 24