C&C: An Effective Algorithm for Extracting Web Community Cores

Cited by: 0
Authors
Zhang, Xianchao [1]
Li, Yueting [1]
Liang, Wenxin [1]
Affiliations
[1] Dalian Univ Technol, Sch Software, Dalian, Peoples R China
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS | 2010 / Vol. 6193
Keywords
Web mining; Community core; Bipartite graph
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Communities are a significant pattern of the Web. A community is a group of pages related to a common topic. Web communities can be characterized by dense bipartite subgraphs, and each community almost surely contains at least one core, where a core is a complete bipartite graph (CBG). Focusing on the problem of extracting such community cores from the Web, this paper proposes C&C, an effective algorithm based on combination and consolidation that extracts all embedded cores in web graphs. Experiments on large real-world data collections demonstrate that C&C is efficient and effective for community core extraction because: 1) all the largest emerging cores can be identified; 2) identifying all the embedded cores of different sizes requires only a single pass of C&C; 3) the extraction process needs no user-determined parameters.
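To make the notion of a core concrete, the following minimal Python sketch checks whether two sets of pages form a complete bipartite graph. It is not the C&C algorithm, whose combination and consolidation steps are not described in this record. The names `is_core`, `links`, `fans`, and `centers` are hypothetical, and modeling hyperlinks as directed (source, target) pairs is an assumption made only for illustration.

```python
from itertools import product


def is_core(links, fans, centers):
    """Return True if every page in `fans` links to every page in `centers`.

    `links` is a set of (source, target) pairs; this directed-edge
    representation is an assumption, not taken from the paper.
    """
    if not fans or not centers:
        return False  # an empty side is not treated as a core here
    return all((f, c) in links for f, c in product(fans, centers))


# Toy web graph: f1 and f2 both link to c1 and c2; f3 links only to c1.
links = {
    ("f1", "c1"), ("f1", "c2"),
    ("f2", "c1"), ("f2", "c2"),
    ("f3", "c1"),
}

print(is_core(links, {"f1", "f2"}, {"c1", "c2"}))        # True
print(is_core(links, {"f1", "f2", "f3"}, {"c1", "c2"}))  # False
```

In this toy graph, the pages f1 and f2 together with c1 and c2 form a 2 x 2 complete bipartite graph, while adding f3 breaks completeness because the link from f3 to c2 is missing.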
Pages: 316-326
Page count: 11