Using Suffix Arrays for Efficiently Recognition of Named Entities in Large Scale

Cited by: 0
Authors
Adrian, Benjamin [1]
Schwarz, Sven [1]
Affiliations
[1] DFKI GmbH, Knowledge Management Dept, Kaiserslautern, Germany
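Source
KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT II: 15TH INTERNATIONAL CONFERENCE, KES 2011 | 2011 / Vol. 6882
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract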
In this paper, we present an efficient method for comparing text with RDF data in order to recognize named entities. Here, a named entity is a text sequence that refers to a URI reference within an RDF graph. We present suffix arrays as the representation format for text and a relational database schema to represent Semantic Web data. With these representations, named entity recognition runs in linear time and does not require holding the names of existing entities in memory. Both properties are needed to implement named entity recognition at the scale of, for instance, the DBpedia database.
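The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes whitespace tokenization, case-insensitive matching, and a naive sort-based suffix array construction (linear-time construction algorithms exist but are omitted here for brevity). The function names, the `(uri, label)` pair format, and the `example.org` URIs are hypothetical. The entity names are consumed one at a time from an iterable, which stands in for a database cursor, so only the text's suffix array is held in memory.

```python
from typing import Iterable, Iterator, List, Tuple


def build_suffix_array(tokens: List[str]) -> List[int]:
    """Return the start positions of all token suffixes in lexicographic order.

    Naive construction by sorting suffixes; simple, but not linear time.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def find_occurrences(
    tokens: List[str], suffix_array: List[int], name_tokens: List[str]
) -> List[int]:
    """Binary-search the suffix array for suffixes starting with name_tokens."""
    m = len(name_tokens)

    # Lower bound: first suffix whose m-token prefix is >= name_tokens.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if tokens[start:start + m] < name_tokens:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose m-token prefix is > name_tokens.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if tokens[start:start + m] <= name_tokens:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in [first, lo) begin with the name; report text order.
    return sorted(suffix_array[first:lo])


def recognize(
    text: str, entities: Iterable[Tuple[str, str]]
) -> Iterator[Tuple[int, str, str]]:
    """Yield (token position, label, uri) for each entity label found in text.

    `entities` yields hypothetical (uri, label) pairs, e.g. rows streamed
    from a database cursor; labels are never collected into a dictionary.
    """
    tokens = text.lower().split()
    suffix_array = build_suffix_array(tokens)
    for uri, label in entities:
        name_tokens = label.lower().split()
        if not name_tokens:
            continue  # an empty label would trivially match every position
        for position in find_occurrences(tokens, suffix_array, name_tokens):
            yield position, label, uri


if __name__ == "__main__":
    sample_text = "Benjamin Adrian works at DFKI in Kaiserslautern"
    sample_entities = [
        ("http://example.org/dfki", "DFKI"),  # hypothetical URIs
        ("http://example.org/benjamin-adrian", "Benjamin Adrian"),
    ]
    for match in recognize(sample_text, sample_entities):
        print(match)
```

Each label lookup costs a number of comparisons logarithmic in the text length, and the set of entity names can be arbitrarily large because it is only iterated over. A real system would also need proper tokenization and punctuation handling, which this sketch leaves out.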
Pages: 420-429
Number of pages: 10