Named Entity Recognition in User-Generated Text: A Systematic Literature Review

Cited by: 1
Authors
Esmaail, Naji [1, 2]
Omar, Nazlia [1]
Mohd, Masnizah [1]
Fauzi, Fariza [1]
Mansur, Zainab [1, 2]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Ctr Artificial Intelligence Technol, Bangi 43600, Malaysia
[2] Omar Al Mukhtar Univ, Fac Sci, Dept Comp Sci, Al Bayda, Libya
Keywords
Social networking (online); Blogs; Reviews; Systematics; Surveys; Named entity recognition; Databases; Information retrieval; Natural language processing; NER; user-generated text; WNUT; X; systematic literature review; SLR; information extraction; social media; linking; trends
DOI
10.1109/ACCESS.2024.3427714
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Named Entity Recognition (NER) in social media has received considerable research attention in natural language processing (NLP) and information extraction, and work on the topic has grown dramatically in recent years. One objective of this systematic literature review (SLR) is therefore to outline the techniques, approaches, and methods used to handle NER on X, based on the English datasets prepared for WNUT (Workshop on User-generated Text). The findings could be used to develop more accurate models in the future. The SLR covers articles published from July 2015 to the end of 2023, a span of about eight years. Of the 316 articles published during that period, 67 met the selection criteria and were included. Based on an analysis of the selected articles, challenges were identified and discussed. The review aims to provide a better understanding of current viewpoints and to highlight research opportunities for NER in user-generated text, specifically for English usage on X. Such research can aid in identifying named entities, such as names, locations, companies, and groups, within an informal social media context like X. This is the first systematic review to emphasize the dearth of research on NER on X based on the English datasets prepared for WNUT. Its main contribution is a comprehensive study of NER in X messages, covering its challenges and opportunities. New possible research directions are also suggested.
Pages: 136330-136353
Page count: 24