Information extraction meets the Semantic Web: A survey

Cited by: 100
Authors
Martinez-Rodriguez, Jose L. [1]
Hogan, Aidan [2, 3]
Lopez-Arevalo, Ivan [1]
Affiliations
[1] Cinvestav Tamaulipas, Ciudad Victoria, Tamaulipas, Mexico
[2] IMFD Chile, Santiago, Chile
[3] Univ Chile, Dept Comp Sci, Santiago, Chile
Keywords
Information Extraction; Entity Linking; Keyword Extraction; Topic Modeling; Relation Extraction; Semantic Web; LEARNING CONCEPT HIERARCHIES; NAMED ENTITY DISAMBIGUATION; KNOWLEDGE-BASE; TERM EXTRACTION; LARGE-SCALE; ONTOLOGY; TEXT; RECOGNITION; TABLES; LINKING;
DOI
10.3233/SW-180333
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We provide a comprehensive survey of the research literature that applies Information Extraction techniques in a Semantic Web setting. Works in the intersection of these two areas can be seen from two overlapping perspectives: using Semantic Web resources (languages/ontologies/knowledge-bases/tools) to improve Information Extraction, and/or using Information Extraction to populate the Semantic Web. In more detail, we focus on the extraction and linking of three elements: entities, concepts and relations. Extraction involves identifying (textual) mentions referring to such elements in a given unstructured or semi-structured input source. Linking involves associating each such mention with an appropriate disambiguated identifier referring to the same element in a Semantic Web knowledge-base (or ontology), in some cases creating a new identifier where necessary. With respect to entities, works involving (Named) Entity Recognition, Entity Disambiguation, Entity Linking, etc. in the context of the Semantic Web are considered. With respect to concepts, works involving Terminology Extraction, Keyword Extraction, Topic Modeling, Topic Labeling, etc., in the context of the Semantic Web are considered. Finally, with respect to relations, works involving Relation Extraction in the context of the Semantic Web are considered. The focus of the majority of the survey is on works applied to unstructured sources (text in natural language); however, we also provide an overview of works that develop custom techniques adapted for semi-structured inputs, namely markup documents and web tables.
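The two steps the abstract distinguishes, extraction (finding mentions in text) and linking (mapping each mention to a knowledge-base identifier, creating a new one where necessary), can be illustrated with a minimal sketch. This is a toy illustration, not a method from the survey: the `TOY_KB` dictionary, the `ex:` identifiers, and the capitalized-token heuristic are all assumptions made for this example.

```python
import re

# Hypothetical toy knowledge base (assumption): lowercased surface form -> identifier.
TOY_KB = {
    "santiago": "ex:Santiago",
    "chile": "ex:Chile",
}


def extract_mentions(text):
    """Extraction: return (surface, start, end) for runs of capitalized words.

    A naive heuristic standing in for a real entity recognizer.
    """
    pattern = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
    return [(m.group(0), m.start(), m.end()) for m in re.finditer(pattern, text)]


def link_mentions(mentions, kb):
    """Linking: map each mention to a KB identifier.

    If the mention is unknown, mint a new (hypothetical) identifier and
    add it to the given kb dictionary.
    """
    links = []
    for surface, _start, _end in mentions:
        key = surface.lower()
        if key not in kb:
            kb[key] = "ex:new/" + key.replace(" ", "_")
        links.append((surface, kb[key]))
    return links
```

Calling `link_mentions(extract_mentions("Aidan works in Santiago, Chile."), dict(TOY_KB))` links the two known mentions to their existing identifiers and mints a new one for the unknown mention. Real systems replace both steps with far more robust recognition and disambiguation, for example using context to choose among several candidate identifiers.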
Pages: 255-335
Page count: 81
Related papers
50 in total
  • [41] WebOMSIE: An Ontology-Based Multi Source Web Information Extraction
    Younsi, Zineb
    Quafafou, Mohamed
    Ouzegane, Redouane
    Tari, Abdelkamel
    NEW TRENDS IN DATABASES AND INFORMATION SYSTEMS, 2013, 185 : 199 - +
  • [42] A survey of Semantic Web Services formalisms
    Wang, Hai H.
    Gibbins, Nick
    Payne, Terry
    Patelli, Alina
    Wang, Yangang
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (15): : 4053 - 4072
  • [43] Knowledge Extraction from Question and Answer Platforms on the Semantic Web: A systematic review of technologies available for information extraction
    Weerakoon, Shayne
    Poravi, Guhanathan
    2018 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, MODELLING AND SIMULATION (ISMS), 2018, : 105 - 109
  • [44] Information Extraction Using Distant Supervision and Semantic Similarities
    Park, Youngmin
    Kang, Sangwoo
    Seo, Jungyun
    ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, 2016, 16 (01) : 11 - 18
  • [45] Information visualization and retrieval in the semantic web
    Morato, Jorge
    Sanchez-Cuadrado, Sonia
    Ruiz-Robles, Alejandro
    Moreiro-Gonzalez, Jose-Antonio
    PROFESIONAL DE LA INFORMACION, 2014, 23 (03): : 319 - 329
  • [46] An Information Retrieval Model for the Semantic Web
    Silva, Fabio
    Girardi, Rosario
    Drumond, Lucas
    PROCEEDINGS OF THE 2009 SIXTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: NEW GENERATIONS, VOLS 1-3, 2009, : 143 - 148
  • [47] Information Architecture Automatization for the Semantic Web
    Maria Brunetti, Josep
    Garcia, Roberto
    HUMAN-COMPUTER INTERACTION - INTERACT 2011, PT IV, 2011, 6949 : 410 - 413
  • [48] The Application of Semantic Web on Agricultural Domain - A State of Art Survey
    Mohanraj, I
    Naren, J.
    PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 1596 - 1599
  • [49] Information architectures for semantic web applications
    Salmenjoki, K
    Tsaruk, Y
    Arumugam, G
    Industrial Applications of Semantic Web, 2005, 188 : 247 - 260
  • [50] Information Extraction Meets Crowdsourcing: A Promising Couple
    Christoph Lofi
    Joachim Selke
    Wolf-Tilo Balke
    Datenbank-Spektrum, 2012, 12 (2) : 109 - 120