XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-Based Textual Knowledge Source

Cited by: 1

Authors:

Kiet Van Nguyen [1,2]

Phong Nguyen-Thuan Do [2]

Nhat Duy Nguyen [2]

Tin Van Huynh [1,2]

Anh Gia-Tuan Nguyen [1,2]

Ngan Luu-Thuy Nguyen [1,2]

Affiliations:

[1] Univ Informat Technol, Fac Informat Sci & Engn, Ho Chi Minh City, Vietnam

[2] Vietnam Natl Univ, Ho Chi Minh City, Vietnam

Source:

INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2022, PT I | 2022, Vol. 13757

Keywords:

Question answering; Transformer; BERT; XLM-R; Transfer learning; Machine reading comprehension

DOI:

10.1007/978-3-031-21743-2_30

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction. It has attracted much attention from the computational linguistics and artificial intelligence research communities in recent years because of the strong development of models based on machine reading comprehension (MRC). A reader-based QA system is a high-level search engine that uses MRC techniques to find correct answers to queries or questions in open-domain or domain-specific texts. Most advances in data resources and machine-learning approaches for MRC and QA systems have been made in two resource-rich languages, English and Chinese, while research on QA systems for a low-resource language like Vietnamese remains scarce. This paper presents XLMRQA, the first Vietnamese QA system using a supervised transformer-based reader on a Wikipedia-based textual knowledge source (using the UIT-ViQuAD corpus), outperforming two robust QA systems using deep neural network models, DrQA and BERTserini, by 24.46% and 6.28%, respectively. From the results obtained on the three systems, we analyze the influence of question types on the performance of the QA systems.

Pages: 377-389

Page count: 13