Multi-level boundary classification for information extraction

Cited by: 0
Authors
Finn, A [1]
Kushmerick, N [1]
Affiliation
[1] Univ Coll Dublin, Comp Sci Dept, Smart Media Inst, Dublin, Ireland
Source
MACHINE LEARNING: ECML 2004, PROCEEDINGS | 2004 / Vol. 3201
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We investigate the application of classification techniques to the problem of information extraction (IE). In particular we use support vector machines and several different feature-sets to build a set of classifiers for IE. We show that this approach is competitive with current state-of-the-art IE algorithms based on specialized learning algorithms. We also introduce a new technique for improving the recall of our IE algorithm. This approach uses a two-level ensemble of classifiers to improve the recall of the extracted fragments while maintaining high precision. We show that this approach outperforms current state-of-the-art IE algorithms on several benchmark IE tasks.
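The abstract does not spell out how the two levels interact, so the following is only a minimal sketch of one plausible reading: level-one classifiers label each token as a fragment start or end with high precision, and level-two classifiers, tuned for recall, are consulted only in a small window around a level-one boundary. Everything here is an assumption for illustration: the function name `extract`, the `window` parameter, the pairing heuristic, and the hand-written rules standing in for trained support vector machines are hypothetical and not taken from the paper.

```python
from typing import Callable, List, Tuple

# A boundary classifier answers: "is token i a start (or end) boundary?"
# Hand-written rules below are stand-ins for trained SVMs (assumption).
Classifier = Callable[[List[str], int], bool]


def extract(tokens: List[str],
            l1_start: Classifier, l1_end: Classifier,
            l2_start: Classifier, l2_end: Classifier,
            window: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) token index pairs for extracted fragments."""
    n = len(tokens)
    # Level 1: classify every token as a start and/or end boundary.
    l1_starts = {i for i in range(n) if l1_start(tokens, i)}
    l1_ends = {i for i in range(n) if l1_end(tokens, i)}

    starts, ends = set(l1_starts), set(l1_ends)
    # Level 2: look for an end only in the window after a level-1 start...
    for s in l1_starts:
        for j in range(s, min(n, s + window)):
            if l2_end(tokens, j):
                ends.add(j)
    # ...and for a start only in the window before a level-1 end.
    for e in l1_ends:
        for j in range(max(0, e - window + 1), e + 1):
            if l2_start(tokens, j):
                starts.add(j)

    # Pair each start with the nearest end inside the window (hypothetical rule).
    spans = []
    for s in sorted(starts):
        candidates = [e for e in sorted(ends) if s <= e < s + window]
        if candidates:
            spans.append((s, candidates[0]))
    return spans


def never(tokens: List[str], i: int) -> bool:
    """A classifier that never fires."""
    return False


def title_start(tokens: List[str], i: int) -> bool:
    """Precise level-1 stand-in: a title token opens a speaker name."""
    return tokens[i] == "Dr."


def loose_end(tokens: List[str], i: int) -> bool:
    """High-recall level-2 stand-in: capitalised token before a lowercase one."""
    return (tokens[i][0].isupper()
            and i + 1 < len(tokens)
            and tokens[i + 1].islower())


tokens = "Speaker : Dr. Jane Smith will talk at 3 pm".split()
l1_only = extract(tokens, title_start, never, never, never)
with_l2 = extract(tokens, title_start, never, never, loose_end)
```

In this toy run the level-one pass finds the start of the name but no end, so it extracts nothing; adding the level-two end rule recovers the fragment, which is the kind of recall gain the abstract describes. Restricting level two to a window around level-one boundaries is what would keep the looser rule from firing elsewhere in the text.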
Pages: 111-122
Page count: 12