On extracting link information of relationship instances from a web site

Cited by: 0
Authors
Naing, MM [1]
Lim, EP [1]
Goh, DHL [1]
Affiliation
[1] Nanyang Technol Univ, Sch Comp Engn, Ctr Adv Informat Syst, Singapore 639798, Singapore
Source
WEB SERVICES - ICWS-EUROPE 2003, PROCEEDINGS | 2003 / Vol. 2853
Keywords
ontology; information extraction; hyperlink structure
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Web pages from a web site can often be associated with concepts in an ontology, and pairs of web pages can also be associated with relationships between concepts. With such associations, web pages can be searched, browsed or even reorganized based on their concept and relationship labels. In this paper, we investigate the problem of extracting link information of relationship instances from a web site. We define the notion of link chain and formulate the link chain extraction problem. An extraction method based on sequential covering has been proposed to solve the problem. This paper presents the proposed method and the experiments to evaluate its performance. We have applied the method to extract link chain information from the Yahoo! Movie Web Site with very promising results.
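The abstract names sequential covering as the basis of the extraction method but does not give the rule representation. The following is a minimal sketch of generic sequential covering, not the authors' algorithm. It assumes each candidate link chain is described by a set of string features and labelled as a positive or negative relationship instance. The feature names (`a:cast`, `depth2`, and so on) and all function names are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# (features of a candidate link chain, True if it is a positive instance)
Example = Tuple[FrozenSet[str], bool]
# A rule is a conjunction: it covers a chain if all its features are present.
Rule = FrozenSet[str]


def covers(rule: Rule, features: FrozenSet[str]) -> bool:
    return rule <= features


def learn_one_rule(examples: List[Example]) -> Rule:
    """Greedily add the feature that gives the best precision on covered examples."""
    rule: Rule = frozenset()
    covered = list(examples)
    # Specialize the rule until it covers no negative example.
    while any(not label for _, label in covered):
        candidates = sorted(
            {f for feats, label in covered if label for f in feats} - rule
        )
        best, best_score = None, None
        for f in candidates:
            subset = [(x, y) for x, y in covered if f in x]
            positives = sum(1 for _, y in subset if y)
            if positives == 0:
                continue
            score = (positives / len(subset), positives)
            if best_score is None or score > best_score:
                best, best_score = f, score
        if best is None:
            break  # no feature left to add; accept an imperfect rule
        rule = rule | {best}
        covered = [(x, y) for x, y in covered if best in x]
    return rule


def sequential_covering(examples: List[Example]) -> List[Rule]:
    """Learn rules one at a time, removing the positives each rule covers."""
    rules: List[Rule] = []
    remaining = list(examples)
    while any(label for _, label in remaining):
        rule = learn_one_rule(remaining)
        rules.append(rule)
        remaining = [
            (x, y) for x, y in remaining if not (y and covers(rule, x))
        ]
    return rules


# Toy data with made-up features, for illustration only.
examples: List[Example] = [
    (frozenset({"a:cast", "depth2"}), True),
    (frozenset({"a:cast", "depth3"}), True),
    (frozenset({"a:review", "depth2"}), False),
    (frozenset({"a:trailer", "depth3"}), False),
]
rules = sequential_covering(examples)
```

Each learned rule always covers at least one remaining positive example, so the outer loop terminates. Negative examples stay in the pool so that later rules are still checked against them.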
Pages: 213 - 226
Number of pages: 14