XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-Based Textual Knowledge Source

Cited by: 1
Authors
Kiet Van Nguyen [1, 2]
Phong Nguyen-Thuan Do [2]
Nhat Duy Nguyen [2]
Tin Van Huynh [1, 2]
Anh Gia-Tuan Nguyen [1, 2]
Ngan Luu-Thuy Nguyen [1, 2]
Affiliations
[1] University of Information Technology, Faculty of Information Science and Engineering, Ho Chi Minh City, Vietnam
[2] Vietnam National University, Ho Chi Minh City, Vietnam
Keywords
Question answering; Transformer; BERT; XLM-R; Transfer learning; Machine reading comprehension
DOI
10.1007/978-3-031-21743-2_30
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction. It has attracted much attention from the computational linguistics and artificial intelligence research communities in recent years because of the strong development of machine reading comprehension-based models. A reader-based QA system is a high-level search engine that can find correct answers to queries or questions in open-domain or domain-specific texts using machine reading comprehension (MRC) techniques. Most advances in data resources and machine-learning approaches for MRC and QA systems have been made in two resource-rich languages, English and Chinese. Research on QA systems for a low-resource language like Vietnamese remains scarce. This paper presents XLMRQA, the first Vietnamese QA system using a supervised transformer-based reader on a Wikipedia-based textual knowledge source (using the UIT-ViQuAD corpus). It outperforms two robust QA systems based on deep neural network models, DrQA and BERTserini, by 24.46% and 6.28%, respectively. Based on the results obtained from the three systems, we analyze the influence of question types on the performance of the QA systems.
Pages: 377-389
Number of pages: 13