PharmKi: A Retrieval System of Chemical Structural Formula Based on Graph Similarity

Cited by: 1
Authors
Qu, Jingwei [1]
Lu, Xiaoqing [1,2]
Zhang, Chengcui [3]
Sun, Penghui [1]
Wang, Bei [1]
Tang, Zhi [1,2]
Affiliations
[1] Peking Univ, Inst Comp Sci & Technol, Beijing, Peoples R China
[2] State Key Lab Digital Publishing Technol, Beijing, Peoples R China
[3] Univ Alabama Birmingham, Dept Comp Sci, Birmingham, AL USA
Source
IEEE 1ST CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2018) | 2018
Funding
National Natural Science Foundation of China;
Keywords
chemical structural formula; multimedia information retrieval; frequent subgraph mining; graph isomorphism; SUBSTRUCTURE; SEARCH;
DOI
10.1109/MIPR.2018.00016
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Unlike conventional media types, the chemical structural formula (CSF) is a primary search target in medical information retrieval because it serves as a unique identifier for each compound. This paper introduces PharmKi, a graph-based CSF retrieval system that accepts photos taken with smartphones and sketches drawn on tablet PCs as inputs. To establish a compact yet efficient hypergraph representation for molecules, we propose a graph-isomorphism-based algorithm for evaluating the spatial similarity among graphical CSFs, as well as for selecting dominant acyclic subgraphs on the basis of overlapping analysis. A comparative study shows that the proposed method outperforms existing methods in both accuracy and efficiency.
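The abstract does not spell out the matching algorithm, so the following is only a minimal sketch of the general idea of graph isomorphism on molecular graphs, not PharmKi's method. It assumes a molecule is given as a list of element symbols plus a list of undirected bonds; the function names `edge_set` and `is_isomorphic` and the example molecules are hypothetical, and the brute-force search is only practical for very small graphs.

```python
from itertools import permutations


def edge_set(bonds):
    """Turn a list of (i, j) bonds into a set of undirected edges."""
    return {frozenset(b) for b in bonds}


def is_isomorphic(atoms_a, bonds_a, atoms_b, bonds_b):
    """Brute-force labeled graph isomorphism (illustrative only).

    atoms_*: element symbols indexed by node id.
    bonds_*: list of (i, j) index pairs, treated as undirected.
    """
    edges_a, edges_b = edge_set(bonds_a), edge_set(bonds_b)
    # Cheap rejections: sizes and atom-label multisets must agree.
    if len(atoms_a) != len(atoms_b) or len(edges_a) != len(edges_b):
        return False
    if sorted(atoms_a) != sorted(atoms_b):
        return False
    n = len(atoms_a)
    for perm in permutations(range(n)):
        # perm[i] is the node of graph B that node i of graph A maps to.
        if any(atoms_a[i] != atoms_b[perm[i]] for i in range(n)):
            continue
        mapped = {frozenset(perm[i] for i in edge) for edge in edges_a}
        if mapped == edges_b:
            return True
    return False


# Heavy-atom skeletons: ethanol numbered two ways, and dimethyl ether.
ethanol_1 = (["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_2 = (["O", "C", "C"], [(0, 1), (1, 2)])
ether = (["C", "O", "C"], [(0, 1), (1, 2)])

print(is_isomorphic(*ethanol_1, *ethanol_2))  # same structure, renumbered
print(is_isomorphic(*ethanol_1, *ether))      # same atoms, different bonding
```

The two ethanol skeletons match despite different atom numbering, while dimethyl ether does not, even though it has the same atoms. A practical system would replace the permutation search with a pruned matcher, since the brute-force loop grows factorially with the number of atoms.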
Pages: 45-50
Page count: 6