Improving question answering performance using knowledge distillation and active learning

Cited by: 5
Authors
Boreshban, Yasaman [1 ]
Mirbostani, Seyed Morteza [2 ]
Ghassem-Sani, Gholamreza [1 ]
Mirroshandel, Seyed Abolghasem [2 ]
Amiriparian, Shahin [3 ]
Affiliations
[1] Sharif Univ Technol, Comp Engn Dept, Tehran, Iran
[2] Univ Guilan, Dept Comp Engn, Rasht, Iran
[3] Univ Augsburg, Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
Keywords
Natural language processing; Question answering; Deep learning; Knowledge distillation; Active learning; Performance
DOI
10.1016/j.engappai.2023.106137
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Contemporary question answering (QA) systems, including Transformer-based architectures, suffer from growing computational and model complexity, which renders them inefficient for real-world applications with limited resources. Furthermore, training or even fine-tuning such models requires a vast amount of labeled data, which is often unavailable for the task at hand. In this manuscript, we conduct a comprehensive analysis of these challenges and introduce suitable countermeasures. We propose a novel knowledge distillation (KD) approach to reduce the parameter count and model complexity of a pre-trained bidirectional encoder representations from Transformers (BERT) system, and we employ multiple active learning (AL) strategies to substantially reduce annotation effort. We demonstrate the efficacy of our approach by comparing it with four state-of-the-art (SOTA) Transformer-based systems, namely KroneckerBERT, EfficientBERT, TinyBERT, and DistilBERT. Specifically, we outperform KroneckerBERT_21 and EfficientBERT_TINY by 4.5 and 0.4 percentage points in exact match (EM), despite having 75.0% and 86.2% fewer parameters, respectively. Additionally, our approach achieves performance comparable to 6-layer TinyBERT and DistilBERT while using only 2% of their total trainable parameters. Moreover, by integrating our AL strategies into the BERT framework, we show that SOTA results on the QA datasets can be achieved with only 40% of the training data. Overall, the results demonstrate the effectiveness of our approaches in achieving SOTA performance while drastically reducing both the number of parameters and the labeling effort. Finally, we make our code publicly available at https://github.com/mirbostani/QA-KD-AL.
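The abstract combines two techniques: soft-target knowledge distillation from a larger BERT teacher into a smaller student, and active learning to cut annotation effort. The sketch below is a minimal illustration of both ideas under standard formulations (temperature-scaled KL distillation and least-confidence sampling), assuming a PyTorch setup; the function names distillation_loss and select_uncertain are illustrative and are not taken from the authors' QA-KD-AL repository.

```python
# Minimal sketch: soft-target KD loss and least-confidence active learning
# for extractive QA. Names and defaults are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KD term with the usual hard-label span loss.

    student_logits / teacher_logits: (batch, seq_len) start- or end-position logits.
    gold_positions: (batch,) gold start or end token indices.
    """
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: cross-entropy against the annotated span position.
    hard = F.cross_entropy(student_logits, gold_positions)
    return alpha * soft + (1.0 - alpha) * hard

def select_uncertain(start_logits, k):
    """Least-confidence active learning: pick the k unlabeled examples whose
    predicted start position has the lowest maximum probability."""
    probs = F.softmax(start_logits, dim=-1)    # (num_unlabeled, seq_len)
    confidence, _ = probs.max(dim=-1)          # (num_unlabeled,)
    return torch.topk(-confidence, k).indices  # indices to send for annotation
```

In a training loop, the selected indices would be annotated and moved from the unlabeled pool into the training set, while distillation_loss replaces the plain cross-entropy objective when a teacher model is available.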
Pages: 14