Measuring Similarity between Graphs Based on the Levenshtein Distance

被引:17
作者
Cao, Bin [1 ]
Li, Ying [1 ]
Yin, Jianwei [1 ]
机构
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
来源
APPLIED MATHEMATICS & INFORMATION SCIENCES | 2013年 / 7卷
基金
中国国家自然科学基金;
关键词
Graph matching; similarity; depth-first search (DFS); Levenshtein distance;
D O I
10.12785/amis/071L24
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Graph data has been commonly used and widely researched both in academia and industry for many applications. And measuring similarity between graphs (i.e., graph matching) is the essential step for graph searching, pattern recognition and machine vision. At present, the most widely used approach to address the graph matching problem is graph edit distance (GED). However, the computation complexity of GED is expensive and it takes unacceptable time when the graph becomes larger. Generally, graph could be canonical labeled by some sort of strings and we use the depth-first search (DFS) code as our canonical labeling system. Based on DFS codes, combining the Levenshtein distance (i.e., string edit distance, SED), we proposed a novel method for similarity measurement of graphs. Processing and calculating the distance between two DFS codes, we turned the graph matching problem into string matching, which gains great improvement on the matching performance. The experimental results prove its usefulness.
引用
收藏
页码:169 / 175
页数:7
相关论文
共 11 条
[1]   Graph structure in the Web [J].
Broder, A ;
Kumar, R ;
Maghoul, F ;
Raghavan, P ;
Rajagopalan, S ;
Stata, R ;
Tomkins, A ;
Wiener, J .
COMPUTER NETWORKS-THE INTERNATIONAL JOURNAL OF COMPUTER AND TELECOMMUNICATIONS NETWORKING, 2000, 33 (1-6) :309-320
[2]   A graph distance metric based on the maximal common subgraph [J].
Bunke, H ;
Shearer, K .
PATTERN RECOGNITION LETTERS, 1998, 19 (3-4) :255-259
[3]   On a relation between graph edit distance and maximum common subgraph [J].
Bunke, H .
PATTERN RECOGNITION LETTERS, 1997, 18 (08) :689-694
[4]  
Dijkman R., 2009, BPM
[5]  
Levenshtein V.I., 1966, Soviet Physics Doklady
[6]   RASCAL: Calculation of graph similarity using maximum common edge subgraphs [J].
Raymond, JW ;
Gardiner, EJ ;
Willett, P .
COMPUTER JOURNAL, 2002, 45 (06) :631-644
[7]   SAGA: a subgraph matching tool for biological graphs [J].
Tian, Yuanyuan ;
McEachin, Richard C. ;
Santos, Carlos ;
States, David J. ;
Patel, Jignesh M. .
BIOINFORMATICS, 2007, 23 (02) :232-239
[8]   Chemical similarity searching [J].
Willett, P ;
Barnard, JM ;
Downs, GM .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1998, 38 (06) :983-996
[9]  
Yan XF, 2002, 2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, P721, DOI 10.1109/ICDM.2002.1184038
[10]   Feature-based similarity search in graph structures [J].
Yan, Xifeng ;
Zhu, Feida ;
Yu, Philip S. ;
Han, Jiawei .
ACM TRANSACTIONS ON DATABASE SYSTEMS, 2006, 31 (04) :1418-1453