Deep Learning Based Part-of-Speech Tagging for Malayalam Twitter Data (Special Issue: Deep Learning Techniques for Natural Language Processing)

Cited by: 14
Authors
Kumar, S. [1]
Kumar, M. Anand [1]
Soman, K. P. [1]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Ctr Computat Engn & Networking CEN, Coimbatore, Tamil Nadu, India
Keywords
Part-of-speech tagging; deep learning; recurrent neural network; long short-term memory; gated recurrent unit; bidirectional LSTM
DOI
10.1515/jisys-2017-0520
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The paper addresses the problem of part-of-speech (POS) tagging for Malayalam tweets. The conversational style of social media posts makes it difficult to apply a general POS tagset to such text. For this work, a tagset containing 17 coarse tags was designed, and 9915 tweets were tagged manually for experimentation and evaluation. The tagged data were evaluated using sequential deep learning methods: recurrent neural network (RNN), gated recurrent unit (GRU), long short-term memory (LSTM), and bidirectional LSTM (BLSTM). The models were trained on the tagged tweets at both word level and character level, and were evaluated using precision, recall, F1-measure, and accuracy. At word level, the GRU-based model gave the highest F1-measure, 0.9254; at character level, the BLSTM-based model gave the highest F1-measure, 0.8739. To choose a suitable number of hidden states, the number was varied over 4, 16, 32, and 64, with training performed for each setting. Increasing the number of hidden states improved the tagger model. This is an initial work on POS tagging of Malayalam Twitter data using deep learning sequential models.
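The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes token-level comparison of gold and predicted tags and macro-averaging over tags; the abstract does not say which averaging scheme the authors used. The function name `evaluate_tags` and the three example tags are hypothetical and are not taken from the paper's 17-tag tagset.

```python
from collections import Counter


def evaluate_tags(gold, pred):
    """Accuracy and macro-averaged precision, recall, F1 for flat tag lists.

    Assumption: macro-averaging over all tags seen in gold or pred.
    The paper's own averaging scheme is not stated in the abstract.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for tag g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true tag but was missed

    tags = sorted(set(gold) | set(pred))
    precisions, recalls, f1s = [], [], []
    for t in tags:
        p_den = tp[t] + fp[t]
        r_den = tp[t] + fn[t]
        prec = tp[t] / p_den if p_den else 0.0
        rec = tp[t] / r_den if r_den else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(tags)
    return {
        "accuracy": sum(tp.values()) / len(gold) if gold else 0.0,
        "precision": sum(precisions) / n if n else 0.0,
        "recall": sum(recalls) / n if n else 0.0,
        "f1": sum(f1s) / n if n else 0.0,
    }


# Hypothetical tags, for illustration only.
gold = ["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "VERB"]
pred = ["NOUN", "VERB", "ADJ", "ADJ", "NOUN", "NOUN"]
scores = evaluate_tags(gold, pred)
print(scores)
```

With micro-averaging instead, precision, recall, and F1 would all equal accuracy for single-label tagging, so reporting them alongside accuracy suggests that a per-tag or weighted average was used.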
Pages: 423-435
Number of pages: 13