Improving question answering performance using knowledge distillation and active learning

Cited by: 5
Authors
Boreshban, Yasaman [1 ]
Mirbostani, Seyed Morteza [2 ]
Ghassem-Sani, Gholamreza [1 ]
Mirroshandel, Seyed Abolghasem [2 ]
Amiriparian, Shahin [3 ]
Affiliations
[1] Sharif Univ Technol, Comp Engn Dept, Tehran, Iran
[2] Univ Guilan, Dept Comp Engn, Rasht, Iran
[3] Univ Augsburg, Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
Keywords
Natural language processing; Question answering; Deep learning; Knowledge distillation; Active learning; Performance;
DOI
10.1016/j.engappai.2023.106137
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Contemporary question answering (QA) systems, including Transformer-based architectures, suffer from ever-increasing computational and model complexity, which renders them inefficient for real-world applications with limited resources. Furthermore, training or even fine-tuning such models requires vast amounts of labeled data, which are often unavailable for the task at hand. In this manuscript, we conduct a comprehensive analysis of these challenges and introduce suitable countermeasures. We propose a novel knowledge distillation (KD) approach to reduce the parameter count and model complexity of a pre-trained Bidirectional Encoder Representations from Transformers (BERT) system, and we apply multiple active learning (AL) strategies to drastically reduce the annotation effort. We show the efficacy of our approach by comparing it with four state-of-the-art (SOTA) Transformer-based systems: KroneckerBERT, EfficientBERT, TinyBERT, and DistilBERT. Specifically, we outperform KroneckerBERT_21 and EfficientBERT_TINY by 4.5 and 0.4 percentage points in exact match (EM) score, despite having 75.0% and 86.2% fewer parameters, respectively. Additionally, our approach achieves performance comparable to the 6-layer TinyBERT and DistilBERT while using only 2% of their total trainable parameters. Moreover, by integrating our AL approaches into the BERT framework, we show that SOTA results on the QA datasets can be achieved with only 40% of the training data. Overall, these results demonstrate the effectiveness of our approaches in achieving SOTA performance while drastically reducing the number of parameters and the labeling effort. Finally, we make our code publicly available at https://github.com/mirbostani/QA-KD-AL.
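To make the two techniques in the abstract concrete, below is a minimal, illustrative sketch (in PyTorch) of the standard ingredients they build on: a knowledge-distillation loss that trains a small student QA model on the temperature-softened start/end logits of a large BERT teacher, and a least-confidence active-learning acquisition step that selects the unlabeled examples the model is least sure about for annotation. This is not the authors' released implementation (that is available at the GitHub link above); all names and hyperparameters (temperature, alpha, budget) are assumptions for illustration.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with soft-label KL divergence.

    student_logits / teacher_logits: (batch, seq_len) scores for the
    answer-span start (or end) position; gold_positions: (batch,) gold indices.
    Illustrative sketch only, not the authors' code.
    """
    # Hard loss: supervised cross-entropy against the annotated span position.
    hard = F.cross_entropy(student_logits, gold_positions)
    # Soft loss: match the teacher's temperature-smoothed distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # standard gradient-scale correction
    return alpha * hard + (1.0 - alpha) * soft

def least_confidence_sample(start_logits, end_logits, budget):
    """Pick the `budget` unlabeled examples with the lowest best-span confidence.

    start_logits / end_logits: (num_examples, seq_len) over the unlabeled pool.
    """
    conf = (F.softmax(start_logits, dim=-1).max(dim=-1).values
            * F.softmax(end_logits, dim=-1).max(dim=-1).values)
    return torch.topk(-conf, k=budget).indices  # least confident first

# Example: a batch of 8 questions over 384-token contexts.
student = torch.randn(8, 384, requires_grad=True)
teacher = torch.randn(8, 384)
gold = torch.randint(0, 384, (8,))
distillation_loss(student, teacher, gold).backward()
picks = least_confidence_sample(torch.randn(100, 384), torch.randn(100, 384), budget=10)

In practice this loss would be computed for the start and end logits separately and summed, and the acquisition step would be repeated over several annotation rounds; the abstract's 40%-of-training-data result refers to such an iterative AL loop.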
Pages: 14