Improving question answering performance using knowledge distillation and active learning

Cited by: 5
Authors
Boreshban, Yasaman [1 ]
Mirbostani, Seyed Morteza [2 ]
Ghassem-Sani, Gholamreza [1 ]
Mirroshandel, Seyed Abolghasem [2 ]
Amiriparian, Shahin [3 ]
Affiliations
[1] Sharif Univ Technol, Comp Engn Dept, Tehran, Iran
[2] Univ Guilan, Dept Comp Engn, Rasht, Iran
[3] Univ Augsburg, Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
Keywords
Natural language processing; Question answering; Deep learning; Knowledge distillation; Active learning; Performance
DOI
10.1016/j.engappai.2023.106137
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Contemporary question answering (QA) systems, including Transformer-based architectures, suffer from growing computational and model complexity, which renders them inefficient for real-world applications with limited resources. Furthermore, training or even fine-tuning such models requires a vast amount of labeled data, which is often unavailable for the task at hand. In this manuscript, we conduct a comprehensive analysis of these challenges and introduce suitable countermeasures. We propose a novel knowledge distillation (KD) approach to reduce the parameter count and model complexity of a pre-trained bidirectional encoder representations from Transformers (BERT) system, and we employ multiple active learning (AL) strategies to substantially reduce annotation effort. We demonstrate the efficacy of our approach by comparing it with four state-of-the-art (SOTA) Transformer-based systems, namely KroneckerBERT, EfficientBERT, TinyBERT, and DistilBERT. Specifically, we outperform KroneckerBERT_21 and EfficientBERT_TINY by 4.5 and 0.4 percentage points in exact match (EM), despite having 75.0% and 86.2% fewer parameters, respectively. Additionally, our approach achieves performance comparable to 6-layer TinyBERT and DistilBERT while using only 2% of their total trainable parameters. Moreover, by integrating our AL strategies into the BERT framework, we show that SOTA results on the QA datasets can be achieved with only 40% of the training data. Overall, the results demonstrate the effectiveness of our approaches in achieving SOTA performance while drastically reducing both the number of parameters and the labeling effort. Finally, we make our code publicly available at https://github.com/mirbostani/QA-KD-AL.
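The abstract combines two techniques: soft-target knowledge distillation from a larger BERT teacher into a smaller student, and active learning to cut annotation effort. The sketch below is a minimal illustration of both ideas under standard formulations (temperature-scaled KL distillation and least-confidence sampling), assuming a PyTorch setup; the function names distillation_loss and select_uncertain are illustrative and are not taken from the authors' QA-KD-AL repository.

```python
# Minimal sketch: soft-target KD loss and least-confidence active learning
# for extractive QA. Names and defaults are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KD term with the usual hard-label span loss.

    student_logits / teacher_logits: (batch, seq_len) start- or end-position logits.
    gold_positions: (batch,) gold start or end token indices.
    """
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: cross-entropy against the annotated span position.
    hard = F.cross_entropy(student_logits, gold_positions)
    return alpha * soft + (1.0 - alpha) * hard

def select_uncertain(start_logits, k):
    """Least-confidence active learning: pick the k unlabeled examples whose
    predicted start position has the lowest maximum probability."""
    probs = F.softmax(start_logits, dim=-1)    # (num_unlabeled, seq_len)
    confidence, _ = probs.max(dim=-1)          # (num_unlabeled,)
    return torch.topk(-confidence, k).indices  # indices to send for annotation
```

In a training loop, the selected indices would be annotated and moved from the unlabeled pool into the training set, while distillation_loss replaces the plain cross-entropy objective when a teacher model is available.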
Pages: 14