Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents

Cited by: 0
Authors
Thiam, Mouhamadou [1, 3]
Bennacer, Nacera [2 ]
Pernelle, Nathalie [1 ]
Lo, Moussa [3 ]
Affiliations
[1] Univ Paris 11, LRI, INRIA Saclay Ile France, 2-4 Rue Jacques Monod, F-91893 Orsay, France
[2] SUPELEC, F-91192 Gif Sur Yvette, France
[3] Univ Gaston Berger, UFR SAT, LANI, St Louis, France
Source
DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS | 2009 / Vol. 5690
Keywords
Information Extraction; Semantic Annotation; Alignment; Ontology; Semi-structured documents; OWL; RDF/RDFS; WEB;
DOI
Not available
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
SHIRI is an ontology-based system for the integration of semi-structured documents related to a specific domain. The system's purpose is to give users access to the relevant parts of documents as answers to their queries. SHIRI uses RDF/OWL to represent resources and SPARQL to query them. It relies on an automatic, unsupervised and ontology-driven approach for extraction, alignment and semantic annotation of tagged elements of documents. In this paper, we focus on the Extract-Align algorithm, which exploits a set of named entity and term patterns to extract term candidates to be aligned with the ontology. It proceeds in an incremental manner in order to populate the ontology with terms describing instances of the domain and to reduce access to external resources such as the Web. We evaluate it on an HTML corpus related to calls for papers in computer science, and the results that we obtain are very promising. These results show how the incremental behaviour of the algorithm enriches the ontology and how the number of terms (or named entities) aligned directly with the ontology increases.
Pages: 611 / +
Page count: 2
Related papers
15 in total
  • [1] Arppe Antti, 1995, NORD C COMP LING NOD
  • [2] Baumgartner R., 2001, Proceedings of the 27th International Conference on Very Large Data Bases, P119
  • [3] CAFARELLA MJ, 2008, P WEBDB CAN
  • [4] Cimiano P., 2005, WWW C
  • [5] Cohen W.W., 2003, IIWeb, V3, P73
  • [6] Crescenzi V., 2001, VER LARG DAT BAS C V
  • [7] Davulcu H., 2005, International Journal of Web and Grid Services, V1, P196, DOI 10.1504/IJWGS.2005.008320
  • [8] Drouin P., 2003, Terminology, V9, P99, DOI 10.1075/term.9.1.06dro
  • [9] Etzioni O., Cafarella M., Downey D., Popescu A.M., Shaked T., Soderland S., Weld D.S., Yates A., 2005, Unsupervised named-entity extraction from the Web: An experimental study, Artificial Intelligence, V165 (1), P91-134
  • [10] HAMDI F, 2008, ONTOLOGY ALIGNMENT E