A graph convolution network with subgraph embedding for mutagenic prediction in aromatic hydrocarbons

Cited by: 8
Authors
Moon, Hyung-Jun [1]
Bu, Seok-Jun [2]
Cho, Sung-Bae [2]
Affiliations
[1] Yonsei Univ, Dept Artificial Intelligence, Seoul 03722, South Korea
[2] Yonsei Univ, Dept Comp Sci, Seoul 03722, South Korea
Keywords
Mutagenic prediction; Deep learning; Graph convolution network; Graph partitioning algorithm; CLASSIFICATION; MODELS; AMINES; DNA;
DOI
10.1016/j.neucom.2023.01.091
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An aromatic hydrocarbon is an organic compound that has a carbon ring, such as benzene, with a functional group attached to the ring. As industry develops, environmental pollution worsens, new compounds emerge, and exposure to aromatic hydrocarbons continuously increases. Predicting mutagenicity is crucial for reducing this risk because these compounds may penetrate the DNA of living things and cause mutations. Recently, the accuracy of mutation prediction has improved thanks to deep learning. However, most conventional methods do not consider the characteristics of aromatic hydrocarbon molecules, which dilutes local information and severely degrades prediction performance. In this paper, we propose a method based on subgraph convolution neural networks that extracts the local information of a graph by partitioning it, so that detailed information is preserved. To extract molecular features, we use the Girvan-Newman algorithm to partition the graph according to the carbon ring and functional group, and obtain embedding vectors for the subgraphs as well as for the original graph with a graph convolution network (GCN). The embedding vectors are combined to represent the whole-graph information and to predict mutagenicity. Experiments with MUTAG, NCI1 and NCI109, datasets for predicting the mutagenicity of molecules in graph structure, confirm that we successfully segment carbon rings and functional groups from molecular graphs and predict mutations using the partitioned graphs, leading to a performance improvement of 2 percentage points. In addition, the proposed method prevents about 15 percentage points of information dilution in GCN, and an analysis of the latent space of graphs reveals that the extracted subgraphs maintain the local information appropriately. (c) 2023 Elsevier B.V. All rights reserved.
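The pipeline described in the abstract (partition with Girvan-Newman, embed each subgraph and the whole graph with a GCN, then combine the embeddings) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The toy nitrobenzene-like graph, the single untrained GCN layer with random weights, mean pooling, one-hot atom features, concatenation as the combining step, and all function names are assumptions made only for illustration.

```python
import math
import random
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, order, queue = {s: 0}, [], deque([s])
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies from the farthest nodes to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman_split(adj, n_parts=2):
    """Remove the highest-betweenness edge until n_parts components exist."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    while len(components(adj)) < n_parts:
        score = edge_betweenness(adj)
        if not score:
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


def gcn_embed(nodes, adj, feats, weight):
    """One GCN layer, ReLU(D^-1/2 (A+I) D^-1/2 X W), then mean pooling."""
    nodes = sorted(nodes)
    node_set = set(nodes)
    # Neighbours inside the induced subgraph, plus a self-loop.
    nbrs = {v: (adj[v] & node_set) | {v} for v in nodes}
    deg = {v: len(nbrs[v]) for v in nodes}
    in_dim, out_dim = len(weight), len(weight[0])
    pooled = [0.0] * out_dim
    for v in nodes:
        agg = [0.0] * in_dim
        for u in nbrs[v]:
            norm = 1.0 / math.sqrt(deg[v] * deg[u])
            for k, x in enumerate(feats[u]):
                agg[k] += norm * x
        for j in range(out_dim):
            h = sum(agg[k] * weight[k][j] for k in range(in_dim))
            pooled[j] += max(h, 0.0) / len(nodes)
    return pooled


# Toy graph (assumption): carbon ring 0-5 with a nitro-like group 6-8 on atom 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (6, 7), (6, 8)]
adj = {v: set() for v in range(9)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
one_hot = {"C": [1.0, 0.0, 0.0], "N": [0.0, 1.0, 0.0], "O": [0.0, 0.0, 1.0]}
feats = {v: one_hot[a] for v, a in enumerate("CCCCCCNOO")}
rng = random.Random(0)  # untrained, random weights for illustration only
weight = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(3)]

parts = girvan_newman_split(adj, n_parts=2)
# Whole-graph embedding followed by one embedding per subgraph (concatenated).
embedding = gcn_embed(adj.keys(), adj, feats, weight)
for part in parts:
    embedding += gcn_embed(part, adj, feats, weight)
```

On this toy graph the bond between the ring and the side group carries the highest edge betweenness, so the first split separates the ring from the functional group. In a real setting the combined vector would feed a trained classifier rather than being used directly.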
Pages: 60-68
Page count: 9