A hybrid ontology-based information extraction system

Cited by: 15
Authors
Gutierrez, Fernando [1]
Dou, Dejing [1]
Fickas, Stephen [1]
Wimalasuriya, Daya [2]
Zong, Hui [3]
Affiliations
[1] Univ Oregon, Eugene, OR 97403 USA
[2] Univ Moratuwa, Moratuwa, Sri Lanka
[3] Univ Virginia, Charlottesville, VA 22903 USA
Funding
National Science Foundation (USA)
Keywords
Ensemble learning; error detection; information extraction; machine learning; ontology; RETRIEVAL; WEB;
DOI
10.1177/0165551515610989
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Information Extraction is the process of automatically obtaining knowledge from plain text. Because of the ambiguity of written natural language, Information Extraction is a difficult task. Ontology-based Information Extraction (OBIE) reduces this complexity by including contextual information in the form of a domain ontology. The ontology guides the extraction process by providing concepts and relationships about the domain. However, OBIE systems have not been widely adopted because of the difficulties in deployment and maintenance. The Ontology-based Components for Information Extraction (OBCIE) architecture has been proposed as a way to encourage the adoption of OBIE by promoting reusability through modularity. In this paper, we propose two orthogonal extensions to OBCIE that allow the construction of hybrid OBIE systems with higher extraction accuracy and a new functionality. The first extension utilizes OBCIE modularity to integrate different types of implementation into one extraction system, producing a more accurate extraction. For each concept or relationship in the ontology, we can select the best implementation for extraction, or we can combine both implementations under an ensemble learning scheme. The second extension is a novel ontology-based error detection mechanism. Following a heuristic approach, we can identify sentences that are logically inconsistent with the domain ontology. Because the implementation strategy for the extraction of a concept is independent of the functionality of the extraction, we can design a hybrid OBIE system with concepts utilizing different implementation strategies for extracting correct or incorrect sentences. Our evaluation shows that, in the implementation extension, our proposed method is more accurate in terms of correctness and completeness of the extraction. Moreover, our error detection method can identify incorrect statements with high accuracy.
Pages: 798-820
Page count: 23