PAL-BERT: An Improved Question Answering Model

Cited by: 66

Authors:

Zheng, Wenfeng [1]

Lu, Siyu [1]

Cai, Zhuohang [1]

Wang, Ruiyang [1]

Wang, Lei [2]

Yin, Lirong [2]

Affiliations:

[1] Univ Elect Sci & Technol China, Sch Automat, Chengdu 610054, Peoples R China

[2] Louisiana State Univ, Dept Geog & Anthropol, Baton Rouge, LA 70803 USA

Source:

CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES | 2024 / Vol. 139 / No. 03

Keywords:

PAL-BERT; question answering model; pretraining language models; ALBERT; pruning model; network pruning; TextCNN; BiLSTM

DOI:

10.32604/cmes.2023.046692

Chinese Library Classification:

T [Industrial Technology]

Discipline classification code:

08

Abstract:

In natural language processing (NLP), many pre-trained language models have emerged in recent years, and question answering systems have attracted significant attention. However, as algorithms, data, and computing power advance, models have grown larger and their parameter counts have increased, making training more costly and less efficient. To improve training efficiency and accuracy while reducing model size, this paper proposes PAL-BERT, a first-order pruning model based on ALBERT and tailored to the characteristics of question-answering (QA) systems and language models. First, a first-order network pruning method based on the ALBERT model is designed, yielding the PAL-BERT model. Then, a parameter optimization strategy for PAL-BERT is formulated, and the Mish function is used as the activation function in place of ReLU to improve performance. Finally, comparison experiments with the traditional deep learning models TextCNN and BiLSTM confirm that PAL-BERT, as a pruning-based model compression method, can significantly reduce training time and improve training efficiency. Compared with these traditional models, PAL-BERT significantly improves performance on the NLP task.
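
The abstract notes that Mish replaces ReLU as the activation function. As a minimal sketch, and not the authors' implementation, the standard definition Mish(x) = x * tanh(softplus(x)) can be written in plain Python as shown here. The function names `softplus`, `mish`, and `relu` are illustrative and do not come from the paper.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    """ReLU activation, shown only for comparison."""
    return max(x, 0.0)


if __name__ == "__main__":
    # Print both activations side by side for a few sample inputs.
    for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"x={value:+.1f}  relu={relu(value):.4f}  mish={mish(value):.4f}")
```

Unlike ReLU, Mish is smooth and returns small negative values for negative inputs rather than clamping them to zero. For large positive inputs the two functions nearly coincide.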

Citation

Pages: 2729-2745

Page count: 17
