Complete mining of frequent patterns from graphs: Mining graph data

被引:149
作者
Inokuchi, A [1 ]
Washio, T [1 ]
Motoda, H [1 ]
机构
[1] Osaka Univ, Inst Sci & Ind Res, Osaka 5670047, Japan
关键词
data mining; graph data; Apriori algorithm; adjacency matrix; Web browsing analysis; chemical carcinogenesis analysis;
D O I
10.1023/A:1021726221443
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Basket Analysis, which is a standard method for data mining, derives frequent itemsets from database. However, its mining ability is limited to transaction data consisting of items. In reality, there are many applications where data are described in a more structural way, e. g. chemical compounds and Web browsing history. There are a few approaches that can discover characteristic patterns from graph-structured data in the field of machine learning. However, almost all of them are not suitable for such applications that require a complete search for all frequent subgraph patterns in the data. In this paper, we propose a novel principle and its algorithm that derive the characteristic patterns which frequently appear in graph-structured data. Our algorithm can derive all frequent induced subgraphs from both directed and undirected graph structured data having loops (including self-loops) with labeled or unlabeled nodes and links. Its performance is evaluated through the applications to Web browsing pattern analysis and chemical carcinogenesis analysis.
引用
收藏
页码:321 / 354
页数:34
相关论文
共 27 条
[1]  
AGRAWAL R, 1995, PROC INT CONF DATA, P3, DOI 10.1109/ICDE.1995.380415
[2]  
Agrawal R., 1994, P 20 INT C VER LARG, V1215, P487
[3]  
[Anonymous], 1979, Computers and Intractablity: A Guide to the Theoryof NP-Completeness
[4]   Emergence of scaling in random networks [J].
Barabási, AL ;
Albert, R .
SCIENCE, 1999, 286 (5439) :509-512
[5]  
Biggs N., 1974, ALGEBRAIC GRAPH THEO
[6]   Efficient data mining for path traversal patterns [J].
Chen, MS ;
Park, JS ;
Yu, PS .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1998, 10 (02) :209-221
[7]   Substructure Discovery Using Minimum Description Length and Background Knowledge [J].
Cook, Diane J. ;
Holder, Lawrence B. .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1993, 1 :231-255
[8]   STRUCTURE ACTIVITY RELATIONSHIP OF MUTAGENIC AROMATIC AND HETEROAROMATIC NITRO-COMPOUNDS - CORRELATION WITH MOLECULAR-ORBITAL ENERGIES AND HYDROPHOBICITY [J].
DEBNATH, AK ;
DECOMPADRE, RLL ;
DEBNATH, G ;
SHUSTERMAN, AJ ;
HANSCH, C .
JOURNAL OF MEDICINAL CHEMISTRY, 1991, 34 (02) :786-797
[9]  
Dehaspe L., 1998, Proceedings Fourth International Conference on Knowledge Discovery and Data Mining, P30
[10]  
DERAEDT L, 2001, P 17 INT JOINT C ART, V2, P853