Knowledge enhancement BERT based on domain dictionary mask

Cited by: 0
Authors
Cao, Xianglin [1]
Xiao, Hong [1]
Jiang, Wenchao [1]
Affiliations
[1] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou, Peoples R China
Keywords
Intelligent customer service; dictionary mask; BERT; data preprocessing
DOI
10.3233/JHS-222013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Semantic matching is one of the critical technologies for intelligent customer service. Since Bidirectional Encoder Representations from Transformers (BERT) was proposed, fine-tuning a large-scale pre-trained language model has become a general method for implementing text semantic matching. However, in practical applications, the accuracy of the BERT model is limited by the quantity of pre-training corpus and by proper nouns in the target domain. To solve this problem, a knowledge enhancement method that masks the input based on a domain dictionary is proposed. Firstly, for model input, we use keyword matching to recognize and mask domain words. Secondly, we use self-supervised learning to inject knowledge of the target domain into the BERT model. Thirdly, we fine-tune the BERT model with the public datasets LCQMC and BQboost. Finally, we test the model's performance on a financial company's user data. The experimental results show that, with our method and BQboost, accuracy increases by 12.12% on average in practical applications.
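The first step of the abstract (keyword matching against a domain dictionary, then masking the matched words) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_domain_terms`, the greedy longest-match strategy, the character-level tokenization, and the sample dictionary are all assumptions made for illustration only.

```python
from typing import List, Optional, Set, Tuple

MASK = "[MASK]"  # BERT's conventional mask token


def mask_domain_terms(
    text: str, domain_terms: Set[str]
) -> Tuple[List[str], List[Optional[str]]]:
    """Mask every dictionary term found in `text` (hypothetical helper).

    Assumes character-level tokens, as is common for Chinese BERT models.
    Returns (tokens, labels): masked positions hold MASK in `tokens` and the
    original character in `labels`; other positions have label None.
    """
    max_len = max((len(term) for term in domain_terms), default=0)
    tokens: List[str] = []
    labels: List[Optional[str]] = []
    i = 0
    while i < len(text):
        # Greedy longest match: try the longest candidate span first.
        match_len = 0
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in domain_terms:
                match_len = n
                break
        if match_len:
            # Mask the whole matched term, one token per character.
            for ch in text[i:i + match_len]:
                tokens.append(MASK)
                labels.append(ch)
            i += match_len
        else:
            tokens.append(text[i])
            labels.append(None)
            i += 1
    return tokens, labels


# Illustrative dictionary and query (made up, not from the paper's data).
tokens, labels = mask_domain_terms("如何开通信用卡", {"信用卡", "信用"})
```

The `(tokens, labels)` pairs produced this way could then serve as inputs and targets for a masked-language-model objective, which is one plausible reading of the self-supervised knowledge injection described in the second step.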
Pages: 121-128
Number of pages: 8