Simplification of networks by edge pruning

被引:30
作者
Zhou, Fang [1 ]
Mahler, Sébastien [1 ]
Toivonen, Hannu [1 ]
机构
[1] Department of Computer Science, HIIT, University of Helsinki
来源
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2012年 / 7250卷
关键词
Bioinformatics - Data visualization - Graphic methods - Data mining - Semantics;
D O I
10.1007/978-3-642-31830-6_13
中图分类号
学科分类号
摘要
We propose a novel problem to simplify weighted graphs by pruning least important edges from them. Simplified graphs can be used to improve visualization of a network, to extract its main structure, or as a pre-processing step for other data mining algorithms. We define a graph connectivity function based on the best paths between all pairs of nodes. Given the number of edges to be pruned, the problem is then to select a subset of edges that best maintains the overall graph connectivity. Our model is applicable to a wide range of settings, including probabilistic graphs, flow graphs and distance graphs, since the path quality function that is used to find best paths can be defined by the user. We analyze the problem, and give lower bounds for the effect of individual edge removal in the case where the path quality function has a natural recursive property. We then propose a range of algorithms and report on experimental results on real networks derived from public biological databases. The results show that a large fraction of edges can be removed quite fast and with minimal effect on the overall graph connectivity. A rough semantic analysis of the removed edges indicates that few important edges were removed, and that the proposed approach could be a valuable tool in aiding users to view or explore weighted graphs. © 2012 Springer-Verlag Berlin Heidelberg.
引用
收藏
页码:179 / 198
页数:19
相关论文
共 19 条
[1]  
Zhou F., Mahler S., Toivonen H., Network simplification with minimal loss of connectivity the 10th ieee, International Conference on Data Mining (ICDM), Sydney, Australia, pp. 659-668, (2010)
[2]  
Dubitzky W., Kotter T., Schmidt O., Berthold M.R., Towards creative information exploration based on koestler's concept of bisociation, Bisociative Knowledge Discovery. LNCS (LNAI, 7250, pp. 11-32, (2012)
[3]  
Zhou F., Mahler S., Toivonen H., Review of bisonet abstraction techniques, Bisociative Knowledge Discovery. LNCS (LNAI, 7250, pp. 166-178, (2012)
[4]  
Toivonen H., Mahler S., Zhou F., A framework for path-oriented network simplification, IDA 2010 LNCS, 6065, pp. 220-231, (2010)
[5]  
Sevon P., Eronen L., Hintsanen P., Kulovesi K., Toivonen H., Link discovery in graphs derived from biological databases, DILS 2006. LNCS (LNBI), 4075, pp. 35-49, (2006)
[6]  
Kruskal Jr. J., On the shortest spanning subtree of a graph and the traveling salesman problem, Proceedings of the American Mathematical society, 7, 1, pp. 48-50, (1956)
[7]  
Biedl T.C., Brejova B., Vinar T., Simplifying flow networks, MFCS 2000. LNCS, 1893, pp. 192-201, (2000)
[8]  
Misiolek E., Chen D.Z., Efficient algorithms for simplifying flow networks, COCOON 2005. LNCS, 3595, pp. 737-746, (2005)
[9]  
Schvaneveldt R., Durso F., Dearholt D., Network structures in proximity data, Psychology of Learning and Motivation: Advances in Research and Theory, 24, pp. 249-284, (1989)
[10]  
Quirin A., Cordon O., Santamaria J., Vargas-Quesada B., Moya-Anegon F., A new variant of the Pathfinder algorithm to generate large visual science maps in cubic time, Information Processing and Management, 44, pp. 1611-1623, (2008)