NAMED-ENTITY RECOGNITION FOR HINDI LANGUAGE USING CONTEXT PATTERN-BASED MAXIMUM ENTROPY

Cited by: 2
Authors
Jain, Arti [1]
Yadav, Divakar [2]
Arora, Anuja [1]
Tayal, Devendra K. [3]
Affiliations
[1] Jaypee Inst Informat Technol, Noida, Uttar Pradesh, India
[2] NIT Hamirpur, Hamirpur, Himachal Pradesh, India
[3] Indira Gandhi Delhi Tech Univ Women, New Delhi, India
Source
COMPUTER SCIENCE-AGH | 2022 / Vol. 23 / No. 01
Keywords
context patterns; gazetteer lists; Hindi language; Kaggle dataset; maximum entropy; named-entity recognition; feature extension; HYBRID APPROACH; SYSTEM
DOI
10.7494/csci.2022.23.1.3977
CLC number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper describes a named-entity recognition (NER) system for the Hindi language that uses two methodologies: an existing baseline maximum entropy-based named-entity (BL-MENE) model, and the proposed context pattern-based MENE (CP-MENE) framework. BL-MENE utilizes several baseline features for the NER task but suffers from inaccurate named-entity (NE) boundary detection, misclassification errors, and the partial recognition of NEs because certain essential features are missing. The CP-MENE-based NER task, by contrast, incorporates extensive features and patterns that are designed to overcome these problems. CP-MENE's features include right-boundary, left-boundary, part-of-speech, synonym, gazetteer, and relative pronoun features. CP-MENE formulates a kind of recursive relationship for extracting highly ranked NE patterns, which are generated through regular expressions in Python code. Since Hindi-language web content is growing (especially in health care applications), this work is conducted on the Hindi health data (HHD) corpus, which is readily available as a Kaggle dataset. Our experiments were conducted on four NE categories: Person (PER), Disease (DIS), Consumable (CNS), and Symptom (SMP).
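The boundary and gazetteer features named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `context_features`, the feature keys, the window size, the padding symbols, and the sample sentence and gazetteer entries are all assumptions made for illustration. It only shows how left-boundary, right-boundary, and gazetteer-membership features might be collected for one token before being passed to a maximum entropy classifier.

```python
from typing import Dict, List, Set


def context_features(tokens: List[str], i: int, gazetteer: Set[str],
                     window: int = 2) -> Dict[str, str]:
    """Build a feature dict for tokens[i] (feature names are hypothetical)."""
    feats = {
        "word": tokens[i],
        # Gazetteer feature: is the token listed in a known-entity list?
        "in_gazetteer": str(tokens[i] in gazetteer),
    }
    for k in range(1, window + 1):
        # Left-boundary context: k-th token before position i, padded at sentence start.
        feats[f"left_{k}"] = tokens[i - k] if i - k >= 0 else "<s>"
        # Right-boundary context: k-th token after position i, padded at sentence end.
        feats[f"right_{k}"] = tokens[i + k] if i + k < len(tokens) else "</s>"
    return feats


# Illustrative sentence and gazetteer entries (not taken from the HHD corpus).
tokens = ["मुझे", "बुखार", "और", "खांसी", "है"]
gazetteer = {"बुखार", "खांसी"}
features = context_features(tokens, 1, gazetteer)
```

A maximum entropy model is equivalent to multinomial logistic regression, so feature dictionaries of this kind could be vectorized and fed to any such classifier; part-of-speech, synonym, and relative pronoun features would need additional language resources and are omitted here.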
Pages: 81-115
Number of pages: 35