Language Model Pre-training Method in Machine Translation Based on Named Entity Recognition

Cited by: 8
Authors
Li, Zhen [1 ]
Qu, Dan [1 ]
Xie, Chaojie [2 ]
Zhang, Wenlin [1 ]
Li, Yanxia [3 ]
Affiliations
[1] PLA Strateg Support Force Informat Engn Univ, Informat Syst Engn Coll, 93 Hightech Zone, Zhengzhou 450000, Peoples R China
[2] Zhengzhou Xinda Inst Adv Technol, 93 Hightech Zone, Zhengzhou 450000, Peoples R China
[3] PLA Strateg Support Force Informat Engn Univ, Foreign Languages Coll, 93 Hightech Zone, Zhengzhou 450000, Peoples R China
Keywords
Unsupervised machine translation; language model; named entity recognition
DOI
10.1142/S0218213020400217
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Neural Machine Translation (NMT) has become the mainstream technology in machine translation. A supervised NMT model is trained on abundant sentence-level parallel corpora, but for low-resource languages or dialects with no such corpora available it is difficult to achieve good performance. Researchers have therefore turned to unsupervised neural machine translation (UNMT), which uses only monolingual corpora as training data. UNMT needs to construct a language model (LM) that learns semantic information from the monolingual corpus. This paper focuses on the pre-training of the LM in unsupervised machine translation and proposes a pre-training method, NER-MLM (named entity recognition masked language model). By performing NER, the proposed method obtains better semantic information and better-trained language model parameters. In the unsupervised machine translation task, the BLEU scores on the WMT'16 English-French and English-German data sets are 35.30 and 27.30, respectively. To the best of our knowledge, these are the highest results reported in the field of UNMT so far.
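The abstract does not spell out the masking procedure, so the Python sketch below only illustrates the general idea of NER-guided masking for masked-language-model pre-training; it is not the authors' implementation. The masking rates (15% for ordinary tokens, 50% for entity tokens), the ner_guided_mask helper, and the toy sentence are assumptions made purely for illustration.

    # Minimal sketch (assumption, not the paper's code): tokens inside NER spans
    # are masked with a higher probability than ordinary tokens, so the language
    # model is pushed to learn entity semantics during pre-training.
    import random

    MASK = "[MASK]"

    def ner_guided_mask(tokens, entity_spans, p_plain=0.15, p_entity=0.5, seed=0):
        """Return (masked_tokens, labels); labels keep the original token at
        masked positions and None elsewhere (standard MLM target layout)."""
        rng = random.Random(seed)
        in_entity = set()
        for start, end in entity_spans:          # spans are [start, end) token indices
            in_entity.update(range(start, end))

        masked, labels = [], []
        for i, tok in enumerate(tokens):
            p = p_entity if i in in_entity else p_plain
            if rng.random() < p:
                masked.append(MASK)
                labels.append(tok)               # model must predict the original token
            else:
                masked.append(tok)
                labels.append(None)              # position not scored by the MLM loss
        return masked, labels

    if __name__ == "__main__":
        toks = "Angela Merkel visited Paris last spring".split()
        spans = [(0, 2), (3, 4)]                 # e.g. from an off-the-shelf NER tagger
        print(ner_guided_mask(toks, spans))

Under these assumed rates, entity tokens such as "Angela Merkel" and "Paris" are masked far more often than function words, which is one plausible way an NER-aware MLM objective could bias pre-training toward entity semantics.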
Pages: 10