Improving question answering performance using knowledge distillation and active learning

Cited by: 5
Authors
Boreshban, Yasaman [1 ]
Mirbostani, Seyed Morteza [2 ]
Ghassem-Sani, Gholamreza [1 ]
Mirroshandel, Seyed Abolghasem [2 ]
Amiriparian, Shahin [3 ]
Affiliations
[1] Sharif Univ Technol, Comp Engn Dept, Tehran, Iran
[2] Univ Guilan, Dept Comp Engn, Rasht, Iran
[3] Univ Augsburg, Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
Keywords
Natural language processing; Question answering; Deep learning; Knowledge distillation; Active learning; Performance;
DOI
10.1016/j.engappai.2023.106137
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Contemporary question answering (QA) systems, including Transformer-based architectures, suffer from ever-increasing computational and model complexity, which renders them inefficient for real-world applications with limited resources. Furthermore, training or even fine-tuning such models requires vast amounts of labeled data, which are often unavailable for the task at hand. In this manuscript, we conduct a comprehensive analysis of these challenges and introduce suitable countermeasures. We propose a novel knowledge distillation (KD) approach to reduce the parameter count and model complexity of a pre-trained Bidirectional Encoder Representations from Transformers (BERT) system, and we apply multiple active learning (AL) strategies to drastically reduce the annotation effort. We show the efficacy of our approach by comparing it with four state-of-the-art (SOTA) Transformer-based systems: KroneckerBERT, EfficientBERT, TinyBERT, and DistilBERT. Specifically, we outperform KroneckerBERT_21 and EfficientBERT_TINY by 4.5 and 0.4 percentage points in exact match (EM) score, despite having 75.0% and 86.2% fewer parameters, respectively. Additionally, our approach achieves performance comparable to the 6-layer TinyBERT and DistilBERT while using only 2% of their total trainable parameters. Moreover, by integrating our AL approaches into the BERT framework, we show that SOTA results on the QA datasets can be achieved with only 40% of the training data. Overall, these results demonstrate the effectiveness of our approaches in achieving SOTA performance while drastically reducing the number of parameters and the labeling effort. Finally, we make our code publicly available at https://github.com/mirbostani/QA-KD-AL.
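To make the two techniques in the abstract concrete, below is a minimal, illustrative sketch (in PyTorch) of the standard ingredients they build on: a knowledge-distillation loss that trains a small student QA model on the temperature-softened start/end logits of a large BERT teacher, and a least-confidence active-learning acquisition step that selects the unlabeled examples the model is least sure about for annotation. This is not the authors' released implementation (that is available at the GitHub link above); all names and hyperparameters (temperature, alpha, budget) are assumptions for illustration.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with soft-label KL divergence.

    student_logits / teacher_logits: (batch, seq_len) scores for the
    answer-span start (or end) position; gold_positions: (batch,) gold indices.
    Illustrative sketch only, not the authors' code.
    """
    # Hard loss: supervised cross-entropy against the annotated span position.
    hard = F.cross_entropy(student_logits, gold_positions)
    # Soft loss: match the teacher's temperature-smoothed distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # standard gradient-scale correction
    return alpha * hard + (1.0 - alpha) * soft

def least_confidence_sample(start_logits, end_logits, budget):
    """Pick the `budget` unlabeled examples with the lowest best-span confidence.

    start_logits / end_logits: (num_examples, seq_len) over the unlabeled pool.
    """
    conf = (F.softmax(start_logits, dim=-1).max(dim=-1).values
            * F.softmax(end_logits, dim=-1).max(dim=-1).values)
    return torch.topk(-conf, k=budget).indices  # least confident first

# Example: a batch of 8 questions over 384-token contexts.
student = torch.randn(8, 384, requires_grad=True)
teacher = torch.randn(8, 384)
gold = torch.randint(0, 384, (8,))
distillation_loss(student, teacher, gold).backward()
picks = least_confidence_sample(torch.randn(100, 384), torch.randn(100, 384), budget=10)

In practice this loss would be computed for the start and end logits separately and summed, and the acquisition step would be repeated over several annotation rounds; the abstract's 40%-of-training-data result refers to such an iterative AL loop.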
Pages: 14