A Method for Measuring Similarity or Distance of Molecular and Arbitrary Graphs Based on a Collection of Topological Indices

Times cited: 0
Authors
Oz, Mert Sinan [1]
Affiliations
[1] Bursa Tech Univ, Fac Engn & Nat Sci, Dept Math, Bursa, Turkiye
Keywords
graph similarity measures; Jaccard/Tanimoto indices; molecular similarity measures; topological indices; ATOM-BOND CONNECTIVITY; EDIT DISTANCE; IRREGULARITY; DESCRIPTOR; QSAR
DOI
10.1002/cem.70047
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Comparing graphs by quantitative measures of structural similarity or distance is important in many scientific disciplines. Two of these are cheminformatics and chemical graph theory, in which the structural similarity or distance between molecular graphs is analyzed by calculating the Jaccard/Tanimoto index based on molecular fingerprints. A novel method is proposed to measure the structural similarity or distance for molecular and arbitrary graphs. This method calculates the Jaccard/Tanimoto index based on a collection of topological indices stored as the entries of a vector. We statistically compare the proposed method with Jaccard/Tanimoto indices calculated from five different molecular fingerprints on alkane and cycloalkane isomers. Furthermore, to explore how the method works on non-molecular graphs, we statistically analyze it on the set of all connected graphs with seven vertices. The Jaccard/Tanimoto index values produced by the proposed method cover the value domain. In addition, the method provides a discrete, clustered similarity distribution, which makes differences clear and comparison convenient. Two outstanding features of the proposed method are its applicability to arbitrary graphs and its computational complexity, which is polynomial in the number of graphs and in the number of vertices and edges of the graphs.
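The idea of computing a Jaccard/Tanimoto index from a vector of topological indices can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not list the indices or the exact formula used: the three indices (first Zagreb, Randić, Wiener), the continuous Tanimoto form T(a, b) = a·b / (|a|² + |b|² − a·b), and all function names are chosen for illustration only and are not the paper's implementation.

```python
from collections import deque
from math import sqrt


def make_graph(edge_list):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def first_zagreb(adj):
    """Sum of squared vertex degrees."""
    return sum(len(nbrs) ** 2 for nbrs in adj.values())


def randic(adj):
    """Sum over edges uv of 1 / sqrt(deg(u) * deg(v))."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    edges = {(u, v) for u in adj for v in adj[u] if u < v}
    return sum(1.0 / sqrt(deg[u] * deg[v]) for u, v in edges)


def wiener(adj):
    """Sum of shortest-path distances over all vertex pairs (connected graph)."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2  # each unordered pair was counted twice


def index_vector(adj):
    """Assumed, illustrative collection of indices; not the paper's set."""
    return [first_zagreb(adj), randic(adj), wiener(adj)]


def tanimoto(a, b):
    """Continuous Jaccard/Tanimoto index of two non-negative real vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 1.0


# Carbon skeletons of the two C4H10 isomers (hydrogen-suppressed graphs).
n_butane = make_graph([(0, 1), (1, 2), (2, 3)])   # path on 4 vertices
isobutane = make_graph([(0, 1), (0, 2), (0, 3)])  # star on 4 vertices

similarity = tanimoto(index_vector(n_butane), index_vector(isobutane))
distance = 1.0 - similarity
```

Each index in this sketch runs in time polynomial in the number of vertices and edges, and comparing a set of graphs pairwise is quadratic in the number of graphs, which matches the kind of cost the abstract describes. Unscaled indices of large magnitude dominate the result here; whether and how the paper normalizes the entries is not stated in the abstract.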
Pages: 15