PAL-BERT: An Improved Question Answering Model

Cited by: 66
Authors
Zheng, Wenfeng [1]
Lu, Siyu [1]
Cai, Zhuohang [1]
Wang, Ruiyang [1]
Wang, Lei [2]
Yin, Lirong [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Automat, Chengdu 610054, Peoples R China
[2] Louisiana State Univ, Dept Geog & Anthropol, Baton Rouge, LA 70803 USA
Source
CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES | 2024 / Vol. 139 / No. 03
Keywords
PAL-BERT; question answering model; pretraining language models; ALBERT; pruning model; network pruning; TextCNN; BiLSTM;
DOI
10.32604/cmes.2023.046692
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08;
Abstract
In natural language processing (NLP), many pre-trained language models have appeared in recent years, and question answering systems have attracted significant attention. However, as algorithms, data, and computing power advance, models have grown steadily larger and their parameter counts have increased. Consequently, model training has become more costly and less efficient. To improve the efficiency and accuracy of training while reducing model size, this paper proposes PAL-BERT, a first-order pruning model based on ALBERT and designed around the characteristics of question answering (QA) systems and language models. First, a first-order network pruning method based on the ALBERT model is designed, which yields the PAL-BERT model. Then, a parameter optimization strategy for PAL-BERT is formulated, and the Mish function is used as the activation function instead of ReLU to improve performance. Finally, comparison experiments against the traditional deep learning models TextCNN and BiLSTM confirm that PAL-BERT, as a pruning-based model compression method, can significantly reduce training time and improve training efficiency. Compared with these traditional models, PAL-BERT also significantly improves performance on the NLP task.
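As a rough illustration of the two ingredients named in the abstract, the sketch below implements the Mish activation (x · tanh(softplus(x))) and one common first-order pruning criterion, |w · ∂L/∂w|. The abstract does not state the exact pruning criterion or schedule used in PAL-BERT, so `first_order_scores`, `prune_mask`, and the `keep_ratio` parameter are hypothetical names chosen for this example, not the authors' implementation.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish activation: x * tanh(softplus(x)). Unlike ReLU, it is smooth
    # and lets small negative values pass through.
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    # ReLU, shown only for comparison with Mish.
    return max(x, 0.0)


def first_order_scores(weights, grads):
    # Assumed criterion: first-order Taylor estimate of the loss change
    # caused by zeroing a weight, |w * dL/dw|.
    return [abs(w * g) for w, g in zip(weights, grads)]


def prune_mask(scores, keep_ratio):
    # Keep the top `keep_ratio` fraction of weights by score.
    # Ties at the threshold are all kept, so slightly more may survive.
    k = max(1, round(keep_ratio * len(scores)))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [s >= threshold for s in scores]


if __name__ == "__main__":
    weights = [0.5, -2.0, 0.1, 1.0]
    grads = [0.2, 0.01, -3.0, 0.0]  # illustrative values, not real data
    scores = first_order_scores(weights, grads)
    print(prune_mask(scores, keep_ratio=0.5))
    print(mish(-1.0), relu(-1.0))
```

Note that a magnitude-only criterion would keep the weight -2.0 in this toy input, whereas the first-order score drops it because its gradient is small; that difference is the motivation usually given for first-order criteria.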
Pages: 2729-2745
Number of pages: 17