Learning to Identify Narrators in Classical Arabic Texts

Cited by: 2
Authors
Alkaoud, Mohamed
Syed, Mairaj
Source
AI IN COMPUTATIONAL LINGUISTICS | 2021 / Vol. 189
Keywords
NLP; Classical Arabic; Entity linking; Named-entity recognition; Digital humanities; Hadith science;
DOI
10.1016/j.procs.2021.05.109
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One widespread historical method of transmitting and recording information about important events and people in the Middle East is the narration-based method. In this method, each saying about a person or event is transmitted from person to person until a systematic collector records and compiles such sayings in a stable collection. At each stage of transmission, the narrator transmits not only the saying but also the identity of the person he received it from, going back to the earliest narrator. Identifying each narrator in these collections is important for better measuring the accuracy of the narrations and for identifying the dates and geographies of their circulation. In this work, we propose a natural language processing technique to automate the identification of narrators in classical Arabic texts. Our proposed technique consists of two models: 1) a model for detecting the narrators in the text, and 2) a model for linking narrators to their biographies. We train our two models on a large collection of annotated classical Arabic texts and achieve F1-scores of 96.15% and 95.74% for narration detection and linking, respectively. © 2021 The Authors. Published by Elsevier B.V.
Pages: 335-342
Number of pages: 8