Using data mining techniques and rough set theory for language modeling

Cited by: 0
Authors
Chen, Yong [1, 3]
Chan, Kwok-Ping [2, 4]
Affiliations
[1] University of Hong Kong, 54th Research Institute of CTE, China, Fudan University
[2] Department of Computer Science, University of Hong Kong, Hong Kong, Pokfulam Road
[3] Department of Computer Science, University of Hong Kong
Source
ACM Transactions on Asian Language Information Processing | 2007 / Vol. 6 / No. 1
Keywords
Chinese character recognizer; Postprocessing
DOI
10.1145/1227850.1227852
Abstract
In this article, we propose a new postprocessing strategy, word suggestion, based on a multiple word trigger-pair language model for Chinese character recognizers. With the word suggestion strategy, Chinese character recognizers may even achieve a recognition rate greater than the top-n candidate recognition rate. To construct the multiple word trigger-pair model, data mining techniques are used to alleviate the intensive computation problem. Furthermore, rough set theory is first used in the study to discover negatively correlated relationships between words in order to prevent introducing wrong words in the process of word suggestion. © 2007 ACM.