Part-of-speech tagging using genetic algorithms

Cited by: 0
Authors
Department of Computer Science and Engineering, Lovely Professional University, Jalandhar, Punjab, India [1]
Affiliations
[1] Department of Computer Science and Engineering, Lovely Professional University, Jalandhar, Punjab
Source
Int. J. Simul. Syst. Sci. Technol., vol. 16, no. 6
Keywords
Genetic algorithm; Natural language processing; Part of speech; Punjabi
DOI
10.5013/IJSSST.a.16.06.11
Abstract
To the best of our knowledge, genetic algorithms have not previously been used to predict part-of-speech (POS) tags for the Punjabi language. In this paper, a classic genetic algorithm (GA) with fixed gene length is proposed for sentence-level Punjabi tagging. The proposed work uses a fixed individual size, value-type encoding, roulette wheel selection, adaptive two-point crossover (TPC), and a varying mutation rate as operators. Focusing on the relationship between tags in context, we present the technique as both an algorithm and a software prototype. A dataset of 26,000 hand-tagged words is used, and an accuracy of 90.63% is achieved. © 2015, UK Simulation Society. All rights reserved.
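The operators named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tagset, the `LEXICON` and `BIGRAM` scores, the placeholder tokens, the fitness function, and the linear mutation schedule are all hypothetical stand-ins chosen for illustration, and the crossover shown is plain two-point crossover without the adaptive component the paper mentions.

```python
import random

# Hypothetical toy data: the paper's tagset and corpus statistics are not given here.
TAGS = ["N", "V", "ADJ", "PSP"]
# Assumed lexical scores: how plausible each tag is for each placeholder token.
LEXICON = {
    "w1": {"ADJ": 2.0, "N": 1.0},
    "w2": {"N": 3.0},
    "w3": {"PSP": 3.0},
    "w4": {"N": 2.0, "V": 1.0},
    "w5": {"V": 3.0},
}
# Assumed context scores: how well the second tag follows the first.
BIGRAM = {("ADJ", "N"): 3.0, ("N", "PSP"): 3.0, ("PSP", "N"): 2.0, ("N", "V"): 3.0}


def fitness(individual, words):
    """Score a tag sequence from lexical fit plus adjacent-tag context."""
    lexical = sum(LEXICON.get(w, {}).get(t, 0.0) for w, t in zip(words, individual))
    context = sum(BIGRAM.get(pair, 0.0) for pair in zip(individual, individual[1:]))
    # A small positive floor keeps every roulette-wheel weight above zero.
    return 0.1 + lexical + context


def roulette_select(population, scores, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=scores, k=1)[0]


def two_point_crossover(a, b, rng):
    """Swap the segment between two cut points; both children keep the fixed length."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate(individual, rate, rng):
    """Value encoding: each gene is a tag, replaced at random with probability `rate`."""
    return [rng.choice(TAGS) if rng.random() < rate else gene for gene in individual]


def evolve(words, pop_size=30, generations=40, seed=0):
    """Evolve one fixed-length tag sequence for the given sentence."""
    rng = random.Random(seed)
    population = [[rng.choice(TAGS) for _ in words] for _ in range(pop_size)]
    best = max(population, key=lambda ind: fitness(ind, words))
    for gen in range(generations):
        # Varying mutation rate: linear decay from 0.20 to 0.02 (an assumed schedule).
        rate = 0.20 - 0.18 * gen / max(1, generations - 1)
        scores = [fitness(ind, words) for ind in population]
        children = []
        while len(children) < pop_size:
            p1 = roulette_select(population, scores, rng)
            p2 = roulette_select(population, scores, rng)
            c1, c2 = two_point_crossover(p1, p2, rng)
            children += [mutate(c1, rate, rng), mutate(c2, rate, rng)]
        population = children[:pop_size]
        # Keep the best individual seen so far across generations.
        best = max(population + [best], key=lambda ind: fitness(ind, words))
    return best
```

Calling `evolve(["w1", "w2", "w3", "w4", "w5"])` returns one tag per token. In a real tagger the lexical and context scores would presumably be estimated from the hand-tagged corpus rather than written by hand as they are here.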
Pages: 11.1-11.7