Named Entity Recognition for Hungarian Using Various Machine Learning Algorithms

Authors
Farkas, Richard [1]
Szarvas, Gyorgy [2]
Kocsor, Andras [1]
Affiliations
[1] MTA SZTE Res Grp Artificial Intelligence, Aradi Vertanuk Tere 1, H-6720 Szeged, Hungary
[2] Univ Szeged, Dept Informat, H-6720 Szeged, Hungary
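Source
ACTA CYBERNETICA | 2006 / Vol. 17 / No. 3
Keywords
named entity recognition; statistical models; machine learning
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract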
In this paper we introduce a statistical Named Entity Recognition (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns (Artificial Neural Network, Support Vector Machine, C4.5 Decision Tree), as well as their combinations and the effects of dimensionality reduction. For training and validation we used a segment of the Szeged Corpus [5], which consists of short business news articles collected from MTI (Hungarian News Agency, www.mti.hu). Our results were presented at the Second Conference on Hungarian Computational Linguistics [7]. Our system makes use of both language-dependent features (describing the orthography of proper nouns in Hungarian) and other, language-independent information such as capitalization. Since we avoided the inclusion of large gazetteers of pre-classified entities, the system remains portable across languages without requiring any major modification, as long as the few specialized orthographical and syntactic characteristics are collected for a new target language. The best performing model achieved an F-measure of 91.95%.
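The ingredients named in the abstract (surface features such as capitalization, a combination of several classifiers, and an F-measure score) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the feature names, the label set (`ORG`, `PER`, `LOC`, `O`), the use of simple majority voting as the combination scheme, and token-level rather than phrase-level scoring.

```python
from collections import Counter


def token_features(token, is_sentence_start):
    """Surface cues for one token (hypothetical feature set, for illustration)."""
    return {
        "init_cap": token[:1].isupper(),
        "all_caps": token.isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "has_hyphen": "-" in token,
        "sentence_start": is_sentence_start,
    }


def majority_vote(labels):
    """Combine one label per classifier; ties go to the earliest-listed label.

    Majority voting is an assumed combination scheme, not necessarily
    the one used in the paper.
    """
    counts = Counter(labels)
    best = max(counts.values())
    return next(label for label in labels if counts[label] == best)


def f_measure(gold, predicted, outside="O"):
    """Token-level F1 over entity labels (tokens tagged `outside` are ignored)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == p and g != outside)
    n_pred = sum(1 for p in predicted if p != outside)
    n_gold = sum(1 for g in gold if g != outside)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_features("MTI", is_sentence_start=False))
    # Three hypothetical classifier outputs for the same token.
    print(majority_vote(["ORG", "ORG", "PER"]))
    print(f_measure(["ORG", "O", "PER", "O"], ["ORG", "O", "O", "LOC"]))
```

In this sketch the three voted labels stand in for the outputs of the neural network, SVM, and decision tree models; a real system would feed the feature dictionaries to trained classifiers rather than hard-coding their predictions.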
Pages: 633-646
Page count: 14
Related papers
  • [1] Named Entity Recognition in Crime Using Machine Learning Approach
    Shabat, Hafedh
    Omar, Nazlia
    Rahem, Khmael
    INFORMATION RETRIEVAL TECHNOLOGY, AIRS 2014, 2014, 8870 : 280 - 288
  • [2] Resolving Ambiguities in Named Entity Recognition Using Machine Learning
    Bhandari, Nitin
    Chowdri, Ritika
    Singh, Harmeet
    Qureshi, Salim Raza
    2017 INTERNATIONAL CONFERENCE ON NEXT GENERATION COMPUTING AND INFORMATION SYSTEMS (ICNGCIS), 2017, : 159 - 163
  • [3] Named entity recognition using hybrid machine learning approach
    Chiong, Raymond
    Wei, Wang
    PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006, : 578 - 583
  • [4] Named entity recognition in crime using machine learning approach
    Shabat, Hafedh (h2005_ali@yahoo.com), 1600, Springer Verlag (8870):
  • [5] Named Entity Recognition using Machine Learning Techniques for Telugu language
    Khanam, M. Humera
    Khudhus, Md A.
    Babu, M. S. Prasad
    PROCEEDINGS OF 2016 IEEE 7TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2016), 2016, : 940 - 944
  • [6] Investigation of Data Representation Methods with Machine Learning Algorithms for Biomedical Named Entity Recognition
    Abd, Maan Tareq
    Mohd, Masnizah
    Abd, Mustafa Tareq
    2018 FOURTH INTERNATIONAL CONFERENCE ON INFORMATION RETRIEVAL AND KNOWLEDGE MANAGEMENT (CAMP), 2018, : 54 - 59
  • [7] Named entity recognition based on a machine learning model
    Wang, Jing
    Liu, Zhijing
    Zhao, Hui
    Research Journal of Applied Sciences, Engineering and Technology, 2012, 4 (20) : 3973 - 3980
  • [8] Active Machine Learning Technique For Named Entity Recognition
    Ekbal, Asif
    Saha, Sriparna
    Singh, Dhirendra
    PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI'12), 2012, : 180 - 186
  • [9] A Comparative Study of Named Entity Recognition for Hindi Using Sequential Learning Algorithms
    Krishnarao, Awaghad Ashish
    Gahlot, Himanshu
    Srinet, Amit
    Kushwaha, D. S.
    2009 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE, VOLS 1-3, 2009, : 1163 - 1168
  • [10] An effective undersampling method for biomedical named entity recognition using machine learning
    Archana, S. M.
    Prakash, Jay
    EVOLVING SYSTEMS, 2024, 15 (04) : 1541 - 1549