Support vector learning for semantic argument classification

Cited by: 84
Authors
Pradhan, S [1]
Hacioglu, K [1]
Krugler, V [1]
Ward, W [1]
Martin, J [1]
Jurafsky, D [1]
Affiliation
[1] Univ Colorado, Ctr Spoken Language Res, Boulder, CO 80303 USA
Funding
National Science Foundation (USA);
Keywords
shallow semantic parsing; support vector machines;
DOI
10.1007/s10994-005-0912-2
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The natural language processing community has recently experienced a growth of interest in domain-independent shallow semantic parsing, the process of assigning a WHO did WHAT to WHOM, WHEN, WHERE, WHY, HOW, etc. structure to plain text. This process entails identifying groups of words in a sentence that represent these semantic arguments and assigning specific labels to them. It could play a key role in NLP tasks like Information Extraction, Question Answering and Summarization. We propose a machine learning algorithm for semantic role parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines, which we show give a large improvement in performance over earlier classifiers. We show performance improvements through a number of new features designed to improve generalization to unseen data, such as automatic clustering of verbs. We also report on various analytic studies examining which features are most important, comparing our classifier to other machine learning algorithms in the literature, and testing its generalization to a new test set from a different genre. On the task of assigning semantic labels to the PropBank (Kingsbury, Palmer, & Marcus, 2002) corpus, our final system has a precision of 84% and a recall of 75%, which are the best results currently reported for this task. Finally, we explore a completely different architecture which does not require a deep syntactic parse. We reformulate the task as a combined chunking and classification problem, thus allowing our algorithm to be applied to new languages or genres of text for which statistical syntactic parsers may not be available.
Pages: 11-39
Page count: 29