An Ontology-based Web Crawling Approach for the Retrieval of Materials in the Educational Domain

Cited by: 1
Authors
Ibrahim, Mohammed [1]
Yang, Yanyan [2]
Affiliations
[1] Univ Portsmouth, Sch Engn, Anglesea Rd, Portsmouth PO1 3DJ, Hants, England
[2] Univ Portsmouth, Sch Comp, Anglesea Rd, Portsmouth PO1 3DJ, Hants, England
Keywords
Web Crawling; Ontology; Education Domain
DOI
10.5220/0007692009000906
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
As the web continues to be a huge source of information for many domains, the amount of information available is increasing rapidly. Most of this information is stored in unstructured form, so searching for pertinent information within a specific domain is complex and time-consuming and is likely to retrieve irrelevant results. Crawling and downloading only the pages related to a user's enquiry is a tedious activity. Crawlers, in particular, focus on converting unstructured data and sorting it into a structured database. Among the various kinds of crawling, this paper focuses on techniques that extract the content of a web page based on the relations between ontology concepts. Ontology is a promising technique for accessing and crawling only the related data within specific web pages or a domain. The proposed methodology is a Web Crawler approach based on Ontology (WCO), which defines several relevance computation strategies with increased efficiency, thereby reducing both the number of extracted items and the crawling time. It seeks to select and retrieve web pages in the education domain that match the user's requirements. In WCO, data is structured according to the hierarchical relationships between the concepts adapted in the domain ontology. The approach can be applied flexibly to crawl items in different domains by adapting the user requirements used to define the relevance computation strategies, with promising results.
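The abstract does not spell out how WCO computes relevance, so the following is only a minimal sketch of one way an ontology-driven relevance check could look, not the authors' method. The toy ontology, the depth-based weighting, the function names, and the threshold value are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical toy ontology for the education domain: concept -> parent concept.
# The root concept has no parent (None).
ONTOLOGY = {
    "education": None,
    "course": "education",
    "lecture": "course",
    "assignment": "course",
    "university": "education",
}


def depth(concept):
    """Return the number of edges from the concept up to the ontology root."""
    d = 0
    while ONTOLOGY[concept] is not None:
        concept = ONTOLOGY[concept]
        d += 1
    return d


def relevance(text):
    """Return the weighted share of tokens that match ontology concepts.

    Assumption: more specific (deeper) concepts weigh more, with weight
    depth + 1, and the total is normalised by the number of tokens.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    score = sum(counts[c] * (depth(c) + 1) for c in ONTOLOGY)
    return score / len(tokens)


def is_relevant(text, threshold=0.2):
    """Decide whether a page is worth keeping; the threshold is illustrative."""
    return relevance(text) >= threshold
```

In a crawler built along these lines, `is_relevant` would be called on the text of each fetched page, and only pages that pass would be stored or have their links followed, which is one way the number of extracted items and the crawling time could be reduced.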
Pages: 900-906
Number of pages: 7