A hybrid ontology-based information extraction system

Cited by: 15
Authors
Gutierrez, Fernando [1]
Dou, Dejing [1]
Fickas, Stephen [1]
Wimalasuriya, Daya [2]
Zong, Hui [3]
Affiliations
[1] Univ Oregon, Eugene, OR 97403 USA
[2] Univ Moratuwa, Moratuwa, Sri Lanka
[3] Univ Virginia, Charlottesville, VA 22903 USA
Funding
National Science Foundation (USA);
Keywords
Ensemble learning; error detection; information extraction; machine learning; ontology; RETRIEVAL; WEB;
DOI
10.1177/0165551515610989
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Information Extraction is the process of automatically obtaining knowledge from plain text. Because of the ambiguity of written natural language, Information Extraction is a difficult task. Ontology-based Information Extraction (OBIE) reduces this complexity by including contextual information in the form of a domain ontology. The ontology provides guidance to the extraction process by providing concepts and relationships about the domain. However, OBIE systems have not been widely adopted because of the difficulties in deployment and maintenance. The Ontology-based Components for Information Extraction (OBCIE) architecture has been proposed as a way to encourage the adoption of OBIE by promoting reusability through modularity. In this paper, we propose two orthogonal extensions to OBCIE that allow the construction of hybrid OBIE systems with higher extraction accuracy and a new functionality. The first extension utilizes OBCIE modularity to integrate different types of implementation into one extraction system, producing a more accurate extraction. For each concept or relationship in the ontology, we can select the best implementation for extraction, or we can combine both implementations under an ensemble learning schema. The second extension is a novel ontology-based error detection mechanism. Following a heuristic approach, we can identify sentences that are logically inconsistent with the domain ontology. Because the implementation strategy for the extraction of a concept is independent of the functionality of the extraction, we can design a hybrid OBIE system with concepts utilizing different implementation strategies for extracting correct or incorrect sentences. Our evaluation shows that, in the implementation extension, our proposed method is more accurate in terms of correctness and completeness of the extraction. Moreover, our error detection method can identify incorrect statements with high accuracy.
Pages: 798-820
Page count: 23
Related papers
50 in total
  • [41] Ontology-based Semantic Retrieval for Management Information System
    Shen Jinxing
    ADVANCES IN MECHATRONICS AND CONTROL ENGINEERING, PTS 1-3, 2013, 278-280 : 2069 - 2072
  • [42] Ontology-based sequence labelling for automated information extraction for supporting bridge data analytics
    Liu, Kaijian
    El-Gohary, Nora
    ICSDEC 2016 - INTEGRATING DATA SCIENCE, CONSTRUCTION AND SUSTAINABILITY, 2016, 145 : 504 - 510
  • [43] Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents
    Thiam, Mouhamadou
    Bennacer, Nacera
    Pernelle, Nathalie
    Lo, Moussa
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2009, 5690 : 611 - +
  • [44] Ontology-based information extraction and integration from heterogeneous data sources
    Buitelaar, Paul
    Cimiano, Philipp
    Frank, Anette
    Hartung, Matthias
    Racloppa, Stefania
    INTERNATIONAL JOURNAL OF HUMAN-COMPUTER STUDIES, 2008, 66 (11) : 759 - 788
  • [45] ARKIVO Dataset: A Benchmark for Ontology-based Extraction Tools
    Pandolfo, Laura
    Pulina, Luca
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES (WEBIST), 2021, : 341 - 345
  • [46] Ontology-Based Answer Extraction Method
    Baazaoui-Zghal, Hajer
    Besbes, Ghada
    Web Reasoning and Rule Systems, RR 2014, 2014, 8741 : 239 - 240
  • [47] A Hybrid Ontology-Based Recommendation System in e-Commerce
    Guia, Marcio
    Silva, Rodrigo Rocha
    Bernardino, Jorge
    ALGORITHMS, 2019, 12 (11)
  • [48] Ontology-based knowledge management approach for information system development
    Klarin, Karmen
    Celar, Stipo
    2013 21ST TELECOMMUNICATIONS FORUM (TELFOR), 2013, : 805 - +
  • [49] Visualizations for the Spyglass Ontology-Based Information Analysis and Retrieval System
    Lin, Hong
    Rushing, John
    Berendes, Todd
    Stein, Cara
    Graves, Sara
    PROCEEDINGS OF THE 48TH ANNUAL SOUTHEAST REGIONAL CONFERENCE (ACM SE 10), 2010, : 202 - 207
  • [50] Research on Domain Ontology-based Intelligent Information Retrieval System
    Zhang Shudong
    Chen Yan
    COMPONENTS, PACKAGING AND MANUFACTURING TECHNOLOGY, 2011, 460-461 : 300 - +