A Cognitive Model of Chinese Word Segmentation for Machine Translation

Authors
Wu, Zhijie [1, 2]
Affiliations
[1] Nanjing Univ Sci & Technol, Nanjing, Jiangsu, Peoples R China
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
Keywords
Chinese word segmentation; machine translation; pragmatically-oriented language; contextual information; cognitive model
DOI
10.7202/1008337ar
Chinese Library Classification number
H [Language and writing]
Discipline classification code
05
Abstract
The Chinese language, unlike English, is written without marked word boundaries, and Chinese word segmentation is often referred to as the bottleneck for Chinese-English machine translation. The current word-segmentation systems in machine translation are either linguistically-oriented or statistically-oriented. Chinese, however, is a pragmatically-oriented language, which explains why the existing Chinese word segmentation systems in machine translation are not successful in dealing with the language. Based on a language investigation consisting of two surveys and eight interviews, and its findings concerning how Chinese people segment a Chinese sentence into words in their reading, we have developed a new word-segmentation model, aiming to address the word-segmentation problem in machine translation from a cognitive perspective.
Pages: 631-644
Page count: 14