Applying class triggers in Chinese POS tagging based on maximum entropy model

Cited by: 0
Authors
Zhao, Y [1]
Wang, XL [1]
Liu, BQ [1]
Guan, Y [1]
Affiliation
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
Source
PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2004
Keywords
Chinese POS tagging; trigger; average mutual information; maximum entropy;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a method for applying class triggers to Chinese POS tagging based on the Maximum Entropy model. First, the feature template "word->word/tag" is used to extract triggers from the corpus, and the extracted triggers are added to the Maximum Entropy model as a new kind of feature. Then, average mutual information is applied for feature selection, and a semantic lexicon is used to build class triggers to overcome the data sparseness problem. In addition, an experience-based solution to the over-fitting problem in model training is presented. Finally, the performance of the system is evaluated on a manually annotated POS-tagged corpus. The experiment shows that the method increases POS tagging accuracy from 94% to 96% relative to an HMM model smoothed by absolute smoothing.
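The feature-selection step can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common four-cell definition of average mutual information between two binary events: A (the trigger word occurs in the context) and B (the triggered word/tag occurs). The function names, the count-based interface, and the top-k cutoff are all hypothetical and are not taken from the paper.

```python
import math


def average_mutual_information(n_ab, n_a, n_b, n):
    """Average mutual information (in bits) between binary events A and B.

    Assumed interface (hypothetical, not from the paper):
      n_ab: number of contexts where both A and B occur
      n_a:  number of contexts where A occurs
      n_b:  number of contexts where B occurs
      n:    total number of contexts
    """
    # Each cell of the 2x2 contingency table: (joint count, row total, column total).
    cells = [
        (n_ab, n_a, n_b),                          # A and B
        (n_a - n_ab, n_a, n - n_b),                # A and not B
        (n_b - n_ab, n - n_a, n_b),                # not A and B
        (n - n_a - n_b + n_ab, n - n_a, n - n_b),  # not A and not B
    ]
    ami = 0.0
    for joint, row, col in cells:
        if joint > 0:  # an empty cell contributes nothing to the sum
            # P(x, y) * log2( P(x, y) / (P(x) * P(y)) )
            ami += (joint / n) * math.log2(joint * n / (row * col))
    return ami


def select_triggers(counts, n, top_k):
    """Keep the top_k candidate trigger pairs ranked by average mutual information.

    counts maps a candidate pair, e.g. ("word", "word/tag"), to (n_ab, n_a, n_b).
    The top_k cutoff is an assumption; the paper may use a different criterion.
    """
    scored = [
        (average_mutual_information(n_ab, n_a, n_b, n), pair)
        for pair, (n_ab, n_a, n_b) in counts.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:top_k]]
```

Under this definition, a pair whose events are statistically independent scores zero, so ranking by the score favors triggers that carry information about the triggered word/tag.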
Pages: 1641-1645
Page count: 5