Information Extraction: Evaluating Named Entity Recognition from Classical Malay Documents

Authors
Sazali, Siti Syakirah [1]
Rahman, Nurazzah Abdul [1]
Abu Bakar, Zainab [2]
Affiliations
[1] Univ Teknol MARA, Fac Comp & Math Sci, Shah Alam, Selangor, Malaysia
[2] Al Madinah Int Univ, Fac Comp & Informat Technol, Shah Alam, Selangor, Malaysia
Source
2016 THIRD INTERNATIONAL CONFERENCE ON INFORMATION RETRIEVAL AND KNOWLEDGE MANAGEMENT (CAMP) | 2016
Keywords
component; bahasa melayu; information extraction; malay language; named entity recognition; natural language processing; nouns; nouns extraction;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Natural Language Processing (NLP) is an important field of research in Computer Science. NLP is the process of analyzing texts based on a set of theories and technologies, and recent studies have focused more on Information Extraction (IE). Information Extraction involves several steps, commonly known as tasks: named entity recognition, relation detection and classification, temporal and event processing, and template filling. Recent research on the Malay language has mainly focused on newspaper articles; since this experiment works on classical documents, there is a need to identify which of the existing methods is best for extracting nouns. This paper proposes research on extracting nouns from classical Malay documents. The results show that the experiment using Noun Extraction using Morphological Rules (Verb, Adjective and Noun Affixes) has a 77.61% chance of identifying a noun that can contribute to the existing Malay noun list. As there is no complete Malay noun list or dictionary that can be used as a guide, the extracted results still need to be judged by language experts.
Pages: 48-53
Page count: 6