XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-Based Textual Knowledge Source

Cited by: 1
Authors
Kiet Van Nguyen [1, 2]
Phong Nguyen-Thuan Do [2]
Nhat Duy Nguyen [2]
Tin Van Huynh [1, 2]
Anh Gia-Tuan Nguyen [1, 2]
Ngan Luu-Thuy Nguyen [1, 2]
Affiliations
[1] University of Information Technology, Faculty of Information Science and Engineering, Ho Chi Minh City, Vietnam
[2] Vietnam National University, Ho Chi Minh City, Vietnam
Keywords
Question answering; Transformer; BERT; XLM-R; Transfer learning; Machine reading comprehension
DOI
10.1007/978-3-031-21743-2_30
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Question answering (QA) is a natural language understanding task at the intersection of information retrieval and information extraction. It has attracted considerable attention from the computational linguistics and artificial intelligence research communities in recent years, driven by the rapid development of models based on machine reading comprehension (MRC). A reader-based QA system is a high-level search engine that uses MRC techniques to find correct answers to queries or questions in open-domain or domain-specific texts. Most advances in data resources and machine-learning approaches for MRC and QA systems have been made in two resource-rich languages, English and Chinese, whereas research on QA systems for a low-resource language like Vietnamese remains scarce. This paper presents XLMRQA, the first Vietnamese QA system using a supervised transformer-based reader on a Wikipedia-based textual knowledge source (using the UIT-ViQuAD corpus). It outperforms two robust QA systems based on deep neural network models, DrQA and BERTserini, by 24.46% and 6.28%, respectively. Based on the results obtained with the three systems, we analyze the influence of question types on QA system performance.
Pages: 377-389
Number of pages: 13