Information Extraction: Evaluating Named Entity Recognition from Classical Malay Documents

Cited by: 0
Authors
Sazali, Siti Syakirah [1]
Rahman, Nurazzah Abdul [1]
Abu Bakar, Zainab [2]
Affiliations
[1] Univ Teknol MARA, Fac Comp & Math Sci, Shah Alam, Selangor, Malaysia
[2] Al Madinah Int Univ, Fac Comp & Informat Technol, Shah Alam, Selangor, Malaysia
Source
2016 THIRD INTERNATIONAL CONFERENCE ON INFORMATION RETRIEVAL AND KNOWLEDGE MANAGEMENT (CAMP) | 2016
Keywords
component; bahasa melayu; information extraction; malay language; named entity recognition; natural language processing; nouns; nouns extraction;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Natural Language Processing (NLP) is an important field of research in Computer Science. NLP is the process of analyzing text based on a set of theories and technologies, and recent studies have focused more on Information Extraction (IE). Information Extraction involves several steps, commonly known as tasks: named entity recognition, relation detection and classification, temporal and event processing, and template filling. Recent research on the Malay language has mainly focused on newspaper articles; since this study works with classical documents, there is a need to identify which existing method is best for extracting nouns. This paper proposes a study on extracting nouns from classical Malay documents. The results show that Noun Extraction using Morphological Rules (Verb, Adjective and Noun Affixes) has a 77.61% chance of identifying a noun that can contribute to the existing Malay noun list. As there is no complete Malay noun list or dictionary that can be used as a guide, the extracted results still need to be judged by language experts.
Pages: 48-53
Page count: 6