Revealing connectivity structural patterns among web objects based on co-clustering of bipartite request dependency graph

Authors
Fang, Cheng [1]
Liu, Jun [1]
Ansari, Nirwan [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China
[2] New Jersey Inst Technol, Adv Networking Lab, Elect & Comp Engn Dept, Newark, NJ 07102 USA
Keywords
Web data mining; Graph decomposition; Co-clustering; Distributed computing platform; NETWORK;
DOI
10.1007/s11276-016-1345-5
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Web objects are the entities that users retrieve from websites to compose web pages. Exploring the relationships among web objects therefore has theoretical and practical significance for many important applications, such as content recommendation, web page classification, and network security. In this paper, we propose a graph model named the Bipartite Request Dependency Graph (BRDG) to investigate the relationships among web objects. To build the BRDG from massive network traffic data, we design and implement a parallel algorithm based on the MapReduce programming model. Based on a study of a number of BRDGs derived from real wireless network traffic datasets, we find that the BRDG is large, sparse, and complex, which makes its structural characteristics very hard to derive directly. To address this, we propose a co-clustering algorithm to decompose the BRDG and extract coherent co-clusters from it. The co-clustering results on the experimental dataset reveal a number of interesting and interpretable connectivity structural patterns among web objects, which support a more comprehensive understanding of web page architecture and provide valuable data for e-commerce, social networking, search engines, and similar applications.
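The abstract does not define the BRDG's two vertex sets or the co-clustering algorithm, so the following is only an illustrative sketch under stated assumptions. It assumes each traffic record reduces to a (referrer, requested object) pair, with referrers on one side of the graph and requested objects on the other. The map and reduce steps run in-process rather than on a MapReduce cluster, and the decomposition step uses plain connected components as a crude stand-in for the paper's co-clustering. All function names and sample URLs are hypothetical.

```python
from collections import defaultdict
from itertools import groupby


def map_record(record):
    """Map step: emit ((referrer, requested), 1) for one request record."""
    referrer, requested = record
    if referrer:  # a request without a referrer carries no dependency edge
        yield (referrer, requested), 1


def reduce_edge(key, counts):
    """Reduce step: sum the occurrences of one (referrer, requested) pair."""
    return key, sum(counts)


def build_brdg(records):
    """Run map, shuffle (sort and group by key), and reduce in-process."""
    mapped = sorted(kv for rec in records for kv in map_record(rec))
    return dict(
        reduce_edge(key, (count for _, count in group))
        for key, group in groupby(mapped, key=lambda kv: kv[0])
    )


def connected_blocks(edges):
    """Split the bipartite graph into connected blocks.

    This is NOT the paper's co-clustering algorithm, only a simple
    decomposition used here for illustration.
    """
    adjacency = defaultdict(set)
    for referrer, requested in edges:
        # Tag each vertex with its side so the two vertex sets stay disjoint.
        left, right = ("L", referrer), ("R", requested)
        adjacency[left].add(right)
        adjacency[right].add(left)

    seen, blocks = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, block = [start], set()
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            block.add(node)
            stack.extend(adjacency[node] - seen)
        blocks.append(block)
    return blocks


# Hypothetical request records: (referrer, requested object).
records = [
    ("a.example/index.html", "cdn.example/app.js"),
    ("a.example/index.html", "cdn.example/app.js"),
    ("a.example/index.html", "a.example/logo.png"),
    ("b.example/home", "b.example/style.css"),
    (None, "a.example/index.html"),
]

edges = build_brdg(records)       # weighted bipartite edges
blocks = connected_blocks(edges)  # one vertex set per connected block
```

On real traffic a graph of this kind would likely collapse into a few very large components, which is one reason a finer-grained method such as co-clustering is needed; the sketch only shows the data flow from request records to weighted edges to a decomposition.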
Pages: 439-451
Page count: 13