Named Entity Recognition for Hungarian Using Various Machine Learning Algorithms

Cited by: 0
Authors
Farkas, Richard [1 ]
Szarvas, Gyorgy [2 ]
Kocsor, Andras [1 ]
Affiliations
[1] MTA SZTE Res Grp Artificial Intelligence, Aradi Vertanuk Tere 1, H-6720 Szeged, Hungary
[2] Univ Szeged, Dept Informat, H-6720 Szeged, Hungary
Source
ACTA CYBERNETICA | 2006 / Vol. 17 / Issue 03
Keywords
named entity recognition; statistical models; machine learning
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we introduce a statistical Named Entity Recognition (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns (Artificial Neural Network, Support Vector Machine, and C4.5 Decision Tree), their combinations, and the effects of dimensionality reduction. For training and validation we used a segment of the Szeged Corpus [5] that consists of short business news articles collected from MTI (Hungarian News Agency, www.mti.hu). Our results were presented at the Second Conference on Hungarian Computational Linguistics [7]. Our system makes use of both language-dependent features (describing the orthography of proper nouns in Hungarian) and other, language-independent information such as capitalization. Since we avoided the inclusion of large gazetteers of pre-classified entities, the system remains portable across languages without requiring any major modification, as long as the few specialized orthographical and syntactic characteristics are collected for a new target language. The best-performing model achieved an F-measure of 91.95%.
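The abstract mentions combining three classifiers and reporting an F-measure, but it does not say how the outputs were combined or at what granularity the score was computed. The following is only a minimal sketch under assumptions: it uses plain majority voting over per-token labels (one common combination scheme, not necessarily the authors'), breaks ties in favor of a chosen classifier, and computes a token-level F-measure. The function names, the label set (`ORG`, `PER`, `LOC`, `O`), and the toy predictions are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions, fallback_index=0):
    """Combine per-token labels from several classifiers by majority vote.

    predictions: list of label sequences (one per classifier, equal length).
    On a tie, the label of the classifier at fallback_index is used
    (an assumed tie-breaking rule, not taken from the paper).
    """
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top_count = max(counts.values())
        tied = [label for label, count in counts.items() if count == top_count]
        combined.append(tied[0] if len(tied) == 1 else labels[fallback_index])
    return combined


def f_measure(gold, predicted, outside="O"):
    """Token-level F-measure over all labels except the 'outside' label."""
    true_pos = sum(1 for g, p in zip(gold, predicted) if p != outside and g == p)
    n_pred = sum(1 for p in predicted if p != outside)
    n_gold = sum(1 for g in gold if g != outside)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: gold labels and three classifiers' outputs.
gold = ["ORG", "O", "PER", "O", "LOC"]
ann = ["ORG", "O", "PER", "O", "O"]
svm = ["ORG", "O", "O", "O", "LOC"]
tree = ["O", "O", "PER", "LOC", "LOC"]

voted = majority_vote([ann, svm, tree])
print(voted)
print(f_measure(gold, ann), f_measure(gold, voted))
```

In this toy example each classifier makes a different mistake, so the vote recovers the gold labels; real gains depend on how correlated the individual models' errors are.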
Pages: 633-646
Page count: 14
Related papers
50 in total
  • [31] Named entity recognition in Bengali and Hindi using support vector machine
    Ekbal, Asif
    Bandyopadhyay, Sivaji
    LINGUISTICAE INVESTIGATIONES, 2011, 34 (01): 35-67
  • [32] Named Entity Recognition Using a New Fuzzy Support Vector Machine
    Mansouri, Alireza
    Affendey, Lilly Suriani
    Mamat, Ali
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2008, 8 (02): 320-325
  • [33] Quantitative Analysis of Art Market Using Ontologies, Named Entity Recognition and Machine Learning: A Case Study
    Filipiak, Dominik
    Agt-Rickauer, Henning
    Hentschel, Christian
    Filipowska, Agata
    Sack, Harald
    BUSINESS INFORMATION SYSTEMS (BIS 2016), 2016, 255: 79-90
  • [34] Using machine learning to maintain rule-based named-entity recognition and classification systems
    Petasis, G
    Vichot, F
    Wolinski, F
    Paliouras, G
    Karkaletsis, V
    Spyropoulos, CD
    39TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 2001: 418-425
  • [35] Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models
    Xu, Qian
    Zhou, Yue
    Liao, Bolin
    Xin, Zirui
    Xie, Wenzhao
    Hu, Chao
    Luo, Aijing
    BIOENGINEERING-BASEL, 2023, 10 (06)
  • [36] Learning In-context Learning for Named Entity Recognition
    Chen, Jiawei
    Lu, Yaojie
    Lin, Hongyu
    Lou, Jie
    Jia, Wei
    Dai, Dai
    Wu, Hua
    Cao, Boxi
    Han, Xianpei
    Sun, Le
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023: 13661-13675
  • [37] Various criteria in the evaluation of biomedical named entity recognition
    Tsai, RTH
    Wu, SH
    Chou, WC
    Lin, YC
    He, D
    Hsiang, J
    Sung, TY
    Hsu, WL
    BMC BIOINFORMATICS, 2006, 7 (1): 1-8
  • [38] A Survey on Deep Learning for Named Entity Recognition
    Li, Jing
    Sun, Aixin
    Han, Jianglei
    Li, Chenliang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (01): 50-70
  • [39] Various criteria in the evaluation of biomedical named entity recognition
    Richard Tzong-Han Tsai
    Shih-Hung Wu
    Wen-Chi Chou
    Yu-Chun Lin
    Ding He
    Jieh Hsiang
    Ting-Yi Sung
    Wen-Lian Hsu
    BMC Bioinformatics, 7
  • [40] A Comparison of Performance of Sequential Learning Algorithms on the Task of Named Entity Recognition for Indian Languages
    Krishnarao, Awaghad Ashish
    Gahlot, Himanshu
    Srinet, Amit
    Kushwaha, D. S.
    COMPUTATIONAL SCIENCE - ICCS 2009, PART I, 2009, 5544: 123-132