Lao Named Entity Recognition based on semi-supervised cascaded Conditional Random Fields with generalized expectation criteria

Cited by: 0
Authors
Yang, Mengjie [1 ,2 ]
Zhou, Lanjiang [1 ,2 ]
Yu, Zhengtao [1 ,2 ]
Wang, Hongbin [1 ,2 ]
Affiliations
[1] School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming
[2] Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming
Source
Journal of Computational Information Systems | 2015 / Vol. 11 / No. 20
Keywords
Conditional Random Fields; Generalized expectation criteria; Named entity recognition; Semi-supervised learning
DOI
10.12733/jcis16050
Abstract
Named Entity Recognition (NER) is an important task in Natural Language Processing (NLP). NER research is relatively mature for languages such as English and Chinese, where named entities are recognized with methods such as Conditional Random Fields (CRFs) and the Maximum Entropy Model (MEM), but it is still at an early stage for Lao because of the complexity of the language's features. For example, unlike English, Lao has no initial capitalization and does not use spaces or special characters as delimiters between words. Moreover, labeled Lao corpora are difficult to obtain. A new semi-supervised method is therefore proposed, based on Cascaded Conditional Random Fields (CCRF) and using Generalized Expectation Criteria (GE Criteria), which can express preferences on the parameter settings of the CCRF. The effectiveness of the proposed method is demonstrated through several experiments with different feature settings and different training data. Copyright © 2015 Binary Information Press.
Pages: 7595-7606
Page count: 11