RETRACTED: Arabic named entity recognition in social media based on BiLSTM-CRF using an attention mechanism (Retracted Article)

Cited by: 3
Authors
Benali, B. Ait [1]
Mihi, S. [1]
Mlouk, A. Ait [2]
El Bazi, I. [3]
Laachfoubi, N. [1]
Affiliations
[1] Hassan First Univ Settat, Fac Sci & Tech, IR2M Lab, Settat, Morocco
[2] Uppsala Univ, Dept Informat Technol, Div Sci Comp, Uppsala, Sweden
[3] Sultan Moulay Slimane Univ, Natl Sch Business & Management, Beni Mellal, Morocco
Keywords
Arabic named entity recognition (ANER); natural language processing (NLP); multi-head self-attention; BiLSTM; CRF; dialect Arabic; social media
DOI
10.3233/JIFS-211944
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Named Entity Recognition (NER) is a core task in Natural Language Processing (NLP) that aims to find named entities in natural-language text and classify them into predefined categories such as persons (PER), locations (LOC), and organizations (ORG). For Arabic, current deep-learning NER approaches mainly take either word embeddings or character-level embeddings as input. However, a single-granularity representation suffers from out-of-vocabulary (OOV) words, word-embedding errors, and relatively shallow semantic content. This paper presents a multi-head self-attention mechanism integrated into a BiLSTM-CRF neural network to recognize Arabic named entities on social media using two embeddings. Unlike other state-of-the-art approaches, this approach combines character and word embeddings at the embedding layer, and the attention mechanism computes similarity over the entire sequence of characters and captures local context information. The proposed approach recognizes named entities (NEs) in Dialectal Arabic more accurately, reaching an F1 score of 74.15% on Darwish's dataset (a publicly available Arabic NER benchmark for social media). To the best of our knowledge, these results outperform the current state-of-the-art models for Arabic named entity recognition on social media.
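The two ideas the abstract names, combining word and character embeddings at the embedding layer and applying multi-head self-attention across the sequence, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The helper names (`combine_embeddings`, `attention_weights`, `multi_head_self_attention`) and the toy vectors are hypothetical. Concatenation is assumed as the way the two embeddings are combined, identity projections stand in for the learned query/key/value matrices, and attention is computed over token positions. The BiLSTM and CRF layers are omitted, and the sketch takes no position on where the attention layer sits relative to them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_embeddings(word_vecs, char_vecs):
    """Concatenate each token's word-level and character-level vectors.

    Assumption: concatenation; the abstract only says the two are combined.
    """
    return [w + c for w, c in zip(word_vecs, char_vecs)]


def attention_weights(vecs):
    """Scaled dot-product similarity of every position with every position."""
    d = len(vecs[0])
    weights = []
    for q in vecs:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vecs]
        weights.append(softmax(scores))
    return weights


def multi_head_self_attention(seq, num_heads):
    """Slice each vector into num_heads parts, attend per part, re-concatenate.

    Assumption: identity projections replace the learned Q/K/V matrices.
    """
    dim = len(seq[0])
    assert dim % num_heads == 0, "dimension must divide evenly into heads"
    head_dim = dim // num_heads
    out = [[] for _ in seq]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        head = [v[lo:hi] for v in seq]
        for i, row in enumerate(attention_weights(head)):
            # Context vector: attention-weighted average of this head's slices.
            ctx = [sum(w * v[j] for w, v in zip(row, head)) for j in range(head_dim)]
            out[i].extend(ctx)
    return out


# Toy inputs (hypothetical values): three tokens, 2-d word and 2-d char vectors.
word_vecs = [[0.1, 0.3], [0.7, -0.2], [0.0, 0.5]]
char_vecs = [[0.4, 0.0], [-0.1, 0.6], [0.2, 0.2]]

tokens = combine_embeddings(word_vecs, char_vecs)        # 3 tokens x 4 dims
contextual = multi_head_self_attention(tokens, num_heads=2)
```

Each row of `contextual` has the same width as the combined input vector, so in a full model it could be passed on to a sequence encoder and tagger in place of the raw embeddings.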
Pages: 5427-5436
Page count: 10