Improvement in TF-IDF scheme for web pages based on the contents of their hyperlinked neighboring pages

Cited by: 0
Authors
Sugiyama, Kazunari [1, 3, 4, 5, 6, 7]
Hatano, Kenji [1, 3, 5, 8]
Yoshikawa, Masatoshi [2, 3, 5, 8, 9]
Uemura, Shunsuke [1, 3, 5, 7, 10]
Affiliations
[1] Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, 630-0192, Japan
[2] Information Technology Center, Nagoya University, Nagoya, 464-8601, Japan
[3] Information Processing Society of Japan
[4] Japanese Society for Artificial Intelligence
[5] Association for Computing Machinery
[6] American Association for Artificial Intelligence
[7] IEEE
[8] IEEE Computer Society
[9] Information Technology Center, Nagoya University
[10] Graduate School of Information Science, Nara Institute of Science and Technology
Source
Systems and Computers in Japan | 2005 / Vol. 36 / No. 14
Keywords
Information retrieval; Mathematical models; Vectors
DOI
Not available
Abstract
The TF-IDF scheme is widely used to characterize documents in an information retrieval (IR) system based on the vector space model. However, for documents having a hyperlink structure such as Web pages, the Web page contents can be characterized more accurately by using the contents of hyperlinked neighboring pages. Therefore, in this paper, we propose several techniques for using the contents of hyperlinked neighboring pages to improve the TF-IDF scheme for Web pages and then verify the effectiveness of our techniques. © 2005 Wiley Periodicals, Inc.
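
The general idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe the paper's actual techniques, so everything here is an assumption for illustration only: the functions `tf_idf_vectors`, `blend_with_neighbors`, and `cosine`, the mixing weight `alpha`, the averaging of neighbor vectors, and the toy pages are all hypothetical. The sketch builds plain TF-IDF vectors, then adds a scaled mean of the vectors of hyperlinked neighbors to each page vector.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def blend_with_neighbors(vectors, links, alpha=0.5):
    """Add the alpha-scaled mean of neighbor vectors to each page vector.

    `links` maps a page index to the indices of its hyperlinked neighbors.
    `alpha` is a hypothetical mixing weight, not a value from the paper.
    """
    blended = []
    for i, vec in enumerate(vectors):
        out = dict(vec)
        neighbors = links.get(i, [])
        for j in neighbors:
            for term, weight in vectors[j].items():
                out[term] = out.get(term, 0.0) + alpha * weight / len(neighbors)
        blended.append(out)
    return blended


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (hypothetical): page 0 is ambiguous and links to page 1.
pages = [
    "jaguar speed".split(),
    "jaguar cat habitat".split(),
    "car engine speed".split(),
]
links = {0: [1]}

vectors = tf_idf_vectors(pages)
blended = blend_with_neighbors(vectors, links)
```

In this toy setup, page 0 never mentions "cat", so its plain TF-IDF vector has no weight for that term. After blending, it inherits a positive weight from its neighbor, which is the kind of enrichment the abstract refers to. Pages without outgoing links are left unchanged.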
Pages: 56-68
Related papers
50 in total
  • [1] Improvement of TF-IDF Algorithm Based on Knowledge Graph
    Wang, Yanpeng
    Zhang, Dehai
    Yuan, Ye
    Liu, Qing
    Yang, Yun
    2018 IEEE/ACIS 16TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING RESEARCH, MANAGEMENT AND APPLICATION (SERA), 2018, : 19 - 24
  • [2] Enhance Web Pages Genre Identification Using Neighboring Pages
    Zhu, Jia
    Zhou, Xiaofang
    Fung, Gabriel
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2011, 2011, 6997 : 282 - +
  • [3] An improvement to TF-IDF: Term distribution based term weight algorithm
    Xia T.
    Chai Y.
    Journal of Software, 2011, 6 (03) : 413 - 420
  • [4] A Novel TF-IDF Weighting Scheme for Effective Ranking
    Paik, Jiaul H.
    SIGIR'13: THE PROCEEDINGS OF THE 36TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH & DEVELOPMENT IN INFORMATION RETRIEVAL, 2013, : 343 - 352
  • [5] A Code Classification Method Based on TF-IDF
    Wang, Ke
    Jiang, Jian-Hong
    Ma, Rui-Yun
    2018 INTERNATIONAL CONFERENCE ON E-COMMERCE AND CONTEMPORARY ECONOMIC DEVELOPMENT (ECED 2018), 2018, : 13 - 17
  • [6] Research on Chinese Classification Based on TF-IDF
    Xiao, Liang
    Yao, Nianmin
    2021 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, INFORMATION AND COMMUNICATION ENGINEERING, 2021, 11933
  • [7] Discovering Informative Contents of Web Pages
    Fan, Qifeng
    Yan, Chunwei
    Huang, Lifu
    Huang, Lian'en
    WEB-AGE INFORMATION MANAGEMENT, WAIM 2014, 2014, 8485 : 180 - 191
  • [8] Improvement and Application of TF-IDF Algorithm in Text Orientation Analysis
    Wang, Wei
    Tang, Yongxin
    PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON ADVANCED MATERIALS SCIENCE AND ENVIRONMENTAL ENGINEERING, 2016, 52 : 230 - 233
  • [9] A Fragile Watermarking Scheme Based On SVD for Web Pages
    Long, Xianzhong
    Peng, Hong
    Zhang, Changle
    2009 5TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-8, 2009, : 5248 - 5251
  • [10] Research on case reasoning method based on TF-IDF
    Lin Zhang
    International Journal of System Assurance Engineering and Management, 2021, 12 : 608 - 615