Nested Named Entity Recognition: A Survey

Cited by: 34
Authors
Wang, Yu [1]
Tong, Hanghang [2]
Zhu, Ziye [1]
Li, Yun [1]
Affiliations
[1] Nanjing Univ Posts & Telecommun, 9 Wenyuan Rd, Nanjing 210023, Jiangsu, Peoples R China
[2] Univ Illinois, 201 North Goodwin Ave, Urbana, IL 61801 USA
Funding
National Natural Science Foundation of China; National Science Foundation (USA)
Keywords
Nested named entity recognition; named entity recognition; information extraction; natural language processing; text mining; RECOGNIZING NAMES; BIOMEDICAL TEXTS; CLASSIFICATION; MODEL
DOI
10.1145/3522593
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid development of text mining, many studies observe that text generally contains a variety of implicit information, and it is important to develop techniques for extracting such information. Named Entity Recognition (NER), the first step of information extraction, mainly identifies names of persons, locations, and organizations in text. Although existing neural NER approaches achieve great success in many language domains, most of them ignore the nested nature of named entities. Recently, diverse studies have focused on the nested NER problem and yielded state-of-the-art performance. This survey attempts to provide a comprehensive review of existing approaches for nested NER from the perspectives of model architecture and model properties, which may help readers better understand the current research status and ideas. In this survey, we first introduce the background of nested NER, especially the differences between nested NER and traditional (i.e., flat) NER. We then review the existing nested NER approaches from 2002 to 2020 and mainly classify them into five categories according to model architecture: early rule-based, layered-based, region-based, hypergraph-based, and transition-based approaches. We also explore in greater depth the impact of key properties unique to nested NER approaches from the model property perspective, namely entity dependency, stage framework, error propagation, and tag scheme. Finally, we summarize the open challenges and point out a few possible future directions in this area. This survey should be useful for three kinds of readers: (i) newcomers to the field who want to learn about NER, especially nested NER; (ii) researchers who want to clarify the relationship between flat NER and nested NER and their relative advantages; and (iii) practitioners who need to determine which NER technique (i.e., nested or flat) works best in their applications.
Pages: 29