Neural Chinese word segmentation with dictionary

Cited by: 28
Authors
Liu, Junxin [1]
Wu, Fangzhao [2]
Wu, Chuhan [1]
Huang, Yongfeng [1]
Xie, Xing [2]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Chinese word segmentation; Dictionary; Neural network;
DOI
10.1016/j.neucom.2019.01.085
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Chinese word segmentation (CWS) is an important task for Chinese NLP. Recently, many neural network based methods have been proposed for Chinese word segmentation. However, these methods require a large number of labeled sentences for model training, and usually cannot utilize the useful information in Chinese dictionary. In this paper, we propose two methods to exploit the dictionary information for CWS. The first one is based on pseudo labeled data generation, and the second one is based on multi-task learning. The experimental results on two benchmark datasets validate that our approach can effectively improve the performance of Chinese word segmentation, especially when training data is insufficient. (C) 2019 Elsevier B.V. All rights reserved.
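The abstract names pseudo labeled data generation as the first method but does not describe the procedure. The sketch below is one plausible reading, not the authors' implementation: it assumes pseudo sentences are built by randomly sampling dictionary words and concatenating them, and that characters are labeled with the common BMES tagging scheme. The function names, the toy dictionary, and the sampling strategy are all hypothetical.

```python
import random


def bmes_tags(word):
    """Return BMES tags for one word (assumed tagging scheme)."""
    if len(word) == 1:
        return ["S"]  # single-character word
    # Begin, zero or more Middle tags, End
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]


def make_pseudo_sentence(dictionary, num_words, rng):
    """Sample words from the dictionary and join them into one
    character sequence with aligned tags (assumed procedure)."""
    words = [rng.choice(dictionary) for _ in range(num_words)]
    chars, tags = [], []
    for word in words:
        chars.extend(word)            # split the word into characters
        tags.extend(bmes_tags(word))  # one tag per character
    return chars, tags


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = ["北京", "大学", "人工智能", "的", "研究"]
rng = random.Random(0)
chars, tags = make_pseudo_sentence(toy_dictionary, 4, rng)
print("".join(chars))
print(" ".join(tags))
```

Pairs produced this way could be mixed with human-labeled sentences when training a character-level tagger. How the paper weights or filters such data is not stated in the abstract.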
Pages: 46-54
Page count: 9
Related papers
50 in total
  • [1] Neural Chinese Word Segmentation with Dictionary Knowledge
    Liu, Junxin
    Wu, Fangzhao
    Wu, Chuhan
    Huang, Yongfeng
    Xie, Xing
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2018, PT II, 2018, 11109 : 80 - 91
  • [2] Research on Dictionary for Personalized Chinese Word Segmentation
    Jiang, Huanjun
    Ren, Xiang
    Liu, Ke
    MODERN TECHNOLOGIES IN MATERIALS, MECHANICS AND INTELLIGENT SYSTEMS, 2014, 1049 : 1868 - +
  • [3] An Optimization Algorithm of Chinese Word Segmentation Based on Dictionary
    Tang, Jun
    Wu, Qing
    Li, Yinghong
    2015 INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC), 2015, : 259 - 262
  • [4] Neural Word Segmentation Learning for Chinese
    Cai, Deng
    Zhao, Hai
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 409 - 420
  • [5] An ambiguity discovery algorithm on Chinese word segmentation based dictionary
    Sun, Tieli
    Liu, Yanji
    Yang, Lehua
    Li, Zhiying
    Liu, Zhenghong
    PROCEEDINGS OF THE 2009 SECOND PACIFIC-ASIA CONFERENCE ON WEB MINING AND WEB-BASED APPLICATION, 2009, : 39 - 42
  • [6] Neural Domain Adaptation or Chinese Word Segmentation
    Bao, Zuyi
    Li, Si
    Xu, Weiran
    Gao, Sheng
    2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 131 - 134
  • [7] Fast and Accurate Neural Word Segmentation for Chinese
    Cai, Deng
    Zhao, Hai
    Zhang, Zhisong
    Xin, Yuan
    Wu, Yongjian
    Huang, Feiyue
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 2, 2017, : 608 - 615
  • [8] A kind of dictionary mechanism based on the two-word-bitmap for Chinese word segmentation
    College of Computer and Communication, Hunan Univ., Changsha 410082, China
    Hunan Daxue Xuebao, 2006, 1 (121-123):
  • [9] Neural Chinese Word Segmentation as Sequence to Sequence Translation
    Shi, Xuewen
    Huang, Heyan
    Jian, Ping
    Guo, Yuhang
    Wei, Xiaochi
    Tang, Yi-Kun
    SOCIAL MEDIA PROCESSING, SMP 2017, 2017, 774 : 91 - 103
  • [10] Neural Networks Incorporating Dictionaries for Chinese Word Segmentation
    Zhang, Qi
    Liu, Xiaoyu
    Fu, Jinlan
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5682 - 5689