Learning Web Page Block Functions using Roles of Images

Cited by: 1
Authors
Yang, Xin [1]
Shi, Yuanchun [1]
Affiliation
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Source
2008 3RD INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND APPLICATIONS, VOLS 1 AND 2 | 2008
Keywords
Web Page Block; Block Function; Role of Image; Machine Learning
DOI
10.1109/ICPCA.2008.4783565
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Making use of block information in Web IR and data mining tasks calls for a good understanding of the function of each block. Existing work on classifying block functions and judging block importance has not made full use of images: only simple image features have been considered. We regard images as strong indicators of the functions of Web page blocks and propose to learn block functions using the roles of images as part of the block features. Blocks are generated by Web page segmentation, and the roles of images are decided automatically by image classification. We experiment on 140 Web pages and demonstrate that using the roles of images can significantly improve classification quality when learning Web page block functions. We also measure the usefulness of different image roles and evaluate the effect of two page segmentation methods.
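The idea of using roles of images as part of block features can be illustrated with a minimal sketch. It is not the authors' implementation: the role labels in `IMAGE_ROLES`, the dict layout of a block, and the three baseline features are all assumptions made for illustration, since the abstract does not enumerate them.

```python
from collections import Counter

# Hypothetical role labels; the abstract does not list the actual roles.
IMAGE_ROLES = ("content", "navigation", "advertisement", "decoration")


def block_features(block):
    """Build a numeric feature vector for one Web page block.

    `block` is an assumed dict with the keys:
      'text'        - visible text of the block
      'num_links'   - number of hyperlinks in the block
      'image_roles' - one role label per image, as produced by
                      some upstream image classifier
    """
    role_counts = Counter(block["image_roles"])

    # Simple baseline features that ignore what the images are for.
    baseline = [
        len(block["text"].split()),   # word count
        block["num_links"],           # link count
        len(block["image_roles"]),    # total image count
    ]

    # One count per known role; roles outside IMAGE_ROLES are ignored here.
    role_part = [role_counts[role] for role in IMAGE_ROLES]

    return baseline + role_part


example = {
    "text": "Home News Sports",
    "num_links": 3,
    "image_roles": ["navigation", "navigation", "decoration"],
}
features = block_features(example)
```

The resulting vector could be passed to any standard classifier; the abstract names only machine learning in general, so the choice of learner is left open here.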
Pages: 151-156
Page count: 6