Developing and Pre-Processing a Dataset using a Rhetorical Relation to Build a Question-Answering System based on an Unsupervised Learning Approach

Cited by: 0
Authors
Dutta, Ashit Kumar [1]
Sait, Abdul Rahaman Wahab [2]
Keshta, Ismail Mohamed [1]
Elhalles, Abheer [1]
Affiliations
[1] AlMaarefa Univ, Dept Comp Sci & Informat Syst, Coll Appl Sci, Riyadh 13713, Saudi Arabia
[2] King Faisal Univ, Ctr Documents & Arch, Al Hasa, Saudi Arabia
Source
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY | 2021 / Vol. 21 / No. 11
Keywords
Arabic dataset; rhetorical relation; discourse relation; rhetorical structure theory; Question-Answering system; natural language processing
DOI
10.22937/IJCSNS.2021.21.11.28
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Rhetorical relations between two text fragments carry essential information and help natural language processing applications, such as Question-Answering (QA) systems and automatic text summarization, produce effective output. A QA system lets users retrieve a meaningful response to a query. Developing such a system, which must interpret and respond to user requests, requires rhetorical-relation-based datasets. Few datasets exist for developing an Arabic QA system, and consequently effective QA systems for Arabic are lacking. Recent research indicates that unsupervised learning can help a QA system reply to user queries. In this study, the researchers develop a rhetorical-relation-based dataset for implementing unsupervised learning applications. A web crawler is developed to collect Arabic content from the web. A discourse-annotated corpus is generated using Rhetorical Structure Theory. A Naive Bayes based QA system is developed to evaluate the performance of the dataset. The results show that the QA system performs better with the proposed dataset and can answer user queries with appropriate responses. In addition, the results on fine-grained and coarse-grained relations indicate that the dataset is highly reliable.
Pages: 199-206
Page count: 8