Named-Entity Techniques for Terrorism Event Extraction and Classification

Cited by: 7
Authors
Inyaem, Uraiwan [1]
Meesad, Phayung [2]
Haruechaiyasak, Choochart [3]
Affiliations
[1] King Mongkuts Univ Technol North Bangkok, Fac Informat Technol, Bangkok 10800, Thailand
[2] Univ Technol North Bangkok, Bangkok, Thailand
[3] Natl Elect & Comp Technol Ctr, Human Language Technol Lab, Pathum Thani 12120, Thailand
Source
2009 EIGHTH INTERNATIONAL SYMPOSIUM ON NATURAL LANGUAGE PROCESSING, PROCEEDINGS | 2009
DOI
10.1109/SNLP.2009.5340924
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
The aim of this paper is to study and compare several machine learning methods for implementing a Thai terrorism event extraction system. The main function of the system is to extract information related to terrorism events found in Thai news articles. The extracted events can then be classified and presented to intelligence officers, who can further analyze and predict terrorism events. This paper compares three named-entity feature selection techniques for entity recognition, based on a terrorism gazetteer, a terrorism ontology, and terrorism grammar rules. The machine learning algorithms used for event extraction include Naive Bayes (NB), K-Nearest Neighbor (KNN), Decision Tree (DTREE), and Support Vector Machines (SVM). Each term feature is weighted using Term Frequency-Inverse Document Frequency (TF-IDF). Finite State Transduction is applied for learning feature weights. Experimental results show that the SVM algorithm with terrorism ontology feature selection yields the best performance, with 69.90% for both precision and recall.
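The abstract names TF-IDF as the term weighting scheme but does not state which variant was used. As a minimal sketch only, the following assumes length-normalized term frequency and an unsmoothed `idf = log(N / df)`; the function name `tf_idf` and the toy English tokens are hypothetical and are not data or code from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not confirmed by the paper):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, pre-tokenized toy documents for illustration.
docs = [
    ["bomb", "attack", "market"],
    ["attack", "police", "station"],
    ["market", "price", "rice"],
]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in only one document ("bomb") receives a higher weight than a term of equal frequency shared across documents ("attack"), which is the property that makes the weights useful as classifier input features.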
Pages: 175 / +
Page count: 2