BaNeP: An End-to-End Neural Network Based Model for Bangla Parts-of-Speech Tagging

Cited by: 1
Authors
Ovi, Jesan Ahammed [1]
Islam, Md Ashraful [1]
Karim, Md Rezaul [1]
Affiliations
[1] Univ Dhaka, Dept Comp Sci & Engn, Dhaka 1000, Bangladesh
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Tagging; Sequential analysis; Feature extraction; Labeling; Hidden Markov models; Machine translation; Natural language processing; Speech processing; Neural networks; Information retrieval; Recurrent neural networks; Text recognition; Bangla; POS tagging; RNN; sequence labeling;
DOI
10.1109/ACCESS.2022.3208269
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In Natural Language Processing, Parts-of-Speech (POS) tagging is a vital component that significantly impacts applications such as machine translation, spell checking, information retrieval, and speech processing. In languages such as English and Dutch, POS tagging is considered a solved problem (accuracy: 97%). For low-resource languages like Bangla, however, challenges remain. In this article, we propose a novel RNN-based network named BaNeP to determine the parts of speech of Bangla words. The proposed network extracts structural features through a bidirectional LSTM-based sub-network, and identifies intricate contextual relations among the words of a sentence through an elaborate weighted context extraction procedure. These features are then combined to generate the final Parts-of-Speech prediction. Training the model requires only an annotated dataset, eliminating the need for any hand-crafted features. Experimental results on the LDC2010T16 dataset show a significant accuracy improvement over existing Bangla POS taggers.
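The abstract names two feature sources (structural features and weighted context) that are combined before prediction, but it does not specify how the weighting is computed. The sketch below is only an illustration of the general idea, under loudly stated assumptions: the context weights are taken to be a softmax over dot-product scores, the combination is taken to be plain concatenation, and the raw word vector stands in for the output of the bidirectional LSTM sub-network. The function names `weighted_context` and `combine_features` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_context(word_vectors, position):
    """Weighted average of all word vectors in a sentence.

    ASSUMPTION: weights are a softmax over dot products with the target
    word. The paper's actual weighting procedure is not given in the
    abstract and may differ.
    """
    target = word_vectors[position]
    weights = softmax([dot(target, v) for v in word_vectors])
    dim = len(target)
    return [
        sum(w * v[i] for w, v in zip(weights, word_vectors))
        for i in range(dim)
    ]


def combine_features(structural, context):
    """ASSUMPTION: the two feature vectors are combined by concatenation."""
    return list(structural) + list(context)


# Toy sentence of three 2-dimensional word vectors (made-up values).
sentence_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Placeholder: the raw vector stands in for BiLSTM structural features.
structural = sentence_vectors[0]
context = weighted_context(sentence_vectors, 0)
combined = combine_features(structural, context)
```

In a full tagger, a vector like `combined` would be passed to a classification layer that outputs one POS tag per word; that layer is omitted here because the abstract gives no detail about it.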
Pages: 102753 - 102769
Number of pages: 17