MOSubdue: a Pareto dominance-based multiobjective Subdue algorithm for frequent subgraph mining

被引:0
作者
Prakash Shelokar
Arnaud Quirin
Óscar Cordón
机构
[1] European Centre for Soft Computing,Department of Computer Science and Artificial Intelligence (DECSAI) and the Research Centre on Information and Communication Technologies (CITIC
[2] University of Granada,UGR)
来源
Knowledge and Information Systems | 2013年 / 34卷
关键词
Graph-based data mining; Frequent subgraph mining; Subdue; Gaston; Multiobjective graph-based data mining; Pareto-based multiobjective optimization; Evolutionary multiobjective optimization;
D O I
暂无
中图分类号
学科分类号
摘要
Graph-based data mining approaches have been mainly proposed to the task popularly known as frequent subgraph mining subject to a single user preference, like frequency, size, etc. In this work, we propose to deal with the frequent subgraph mining problem from multiobjective optimization viewpoint, where a subgraph (or solution) is defined by several user-defined preferences (or objectives), which are conflicting in nature. For example, mined subgraphs with high frequency are often of small size, and vice-versa. Use of such objectives in the multiobjective subgraph mining process generates Pareto-optimal subgraphs, where no subgraph is better than another subgraph in all objectives. We have applied a Pareto dominance approach for the evaluation and search subgraphs regarding to both proximity and diversity in multiobjective sense, which has incorporated in the framework of Subdue algorithm for subgraph mining. The method is called multiobjective subgraph mining by Subdue (MOSubdue) and has several advantages: (i) generation of Pareto-optimal subgraphs in a single run (ii) selection of subgraph-seeds from the candidate subgraphs based on all objectives (iii) search in the multiobjective subgraphs lattice space, and (iv) capability to deal with different multiobjective frequent subgraph mining tasks by customizing the tackled objectives. The good performance of MOSubdue is shown by performing multiobjective subgraph mining defined by two and three objectives on two real-life datasets.
引用
收藏
页码:75 / 108
页数:33
相关论文
共 68 条
  • [1] Cook DJ(2000)Graph-based data mining IEEE Intell Syst 15 32-41
  • [2] Holder LB(1996)Scalable discovery of informative structural concepts using domain knowledge IEEE Expert Intell Syst Appl 11 59-68
  • [3] Cook DJ(2002)A fast and elitist multiobjective genetic algorithm: NSGA-II IEEE Trans Evol Comput 6 182-197
  • [4] Holder LB(2004)A new technique for building maps of large scientific domains based on the cocitation of classes and categories Scientometrics 61 129-145
  • [5] Djoko S(2005)Mining coherent dense subgraphs acrossmassive biological networks for functional discovery Bioinformatics 21 i213-i221
  • [6] Deb K(2008)Pareto-based multi-objective machine learning: an overview and case studies IEEE Trans Syst Man Cybern C 38 397-415
  • [7] Pratap A(2011)Pegasus: mining peta-scale graphs Know Inf Syst 27 303-325
  • [8] Agarwal S(2004)An efficient algorithm for discovering frequent subgraphs IEEE Trans Knowl Data Eng 16 1038-1051
  • [9] Meyarivan T(2010)A general framework for relation graph clustering Knowl Inf Syst 24 393-413
  • [10] De Moya-Anegón F(2010)Partitioning large networks without breaking communities Know Inf Syst 25 345-369