Large margin cost-sensitive learning of conditional random fields

Cited by: 15
Author
Kim, Minyoung [1]
Affiliation
[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
Keywords
Conditional random fields; Cost-sensitive learning; Classification
DOI
10.1016/j.patcog.2010.05.013
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We tackle the structured output classification problem using Conditional Random Fields (CRFs). Unlike the standard 0/1 loss case, we consider a cost-sensitive learning setting where we are given a non-0/1 misclassification cost matrix at the individual output level. Although cost-sensitive classification has many interesting practical applications that retain domain-specific scales in the output space (e.g., a hierarchical or ordinal scale), most CRF learning algorithms cannot deal effectively with cost-sensitive scenarios because they assume only a nominal scale (hence 0/1 loss) in the output space. In this paper, we incorporate the cost-sensitive loss into the large margin learning framework. Through large margin learning, the proposed algorithm inherits most of the benefits of SVM-like margin-based classifiers, such as provable generalization error bounds. Moreover, the soft-max approximation employed in our approach yields a convex optimization problem similar to standard CRF learning, with only a slight modification of the potential functions. We also provide a theoretical cost-sensitive generalization error bound. We demonstrate the improved prediction performance of the proposed method over existing approaches on a diverse set of sequence and image structured prediction problems that often arise in pattern recognition and computer vision. © 2010 Elsevier Ltd. All rights reserved.
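The idea of folding a per-output cost matrix into the potentials and replacing a hard max with a soft-max can be sketched for a linear-chain CRF. This is a minimal illustration under stated assumptions, not the paper's formulation: the function names, the ordinal cost matrix, the chain structure, and the specific loss form (log-sum-exp of cost-augmented scores minus the score of the true sequence) are all hypothetical choices made for this example.

```python
import numpy as np

def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along the given axis."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def log_partition(node, edge):
    """Forward algorithm for a linear chain, in the log domain.

    node: (T, K) node potentials; edge: (K, K) transition potentials.
    """
    alpha = node[0]
    for t in range(1, node.shape[0]):
        # alpha_new[j] = node[t, j] + logsumexp_i(alpha[i] + edge[i, j])
        alpha = node[t] + logsumexp(alpha[:, None] + edge, axis=0)
    return logsumexp(alpha)

def sequence_score(node, edge, y):
    """Unnormalized log-score of one label sequence y (int array, length T)."""
    return node[np.arange(len(y)), y].sum() + edge[y[:-1], y[1:]].sum()

def cost_augmented_softmax_loss(node, edge, y_true, cost):
    """Assumed loss: log sum_y exp(score(y) + sum_t cost[y_true[t], y[t]])
    minus score(y_true).

    cost[i, j] is the (assumed) cost of predicting j when the truth is i.
    The cost only shifts the node potentials, so the usual forward
    recursion is reused unchanged.
    """
    augmented = node + cost[y_true]  # (T, K): one cost row per position
    return log_partition(augmented, edge) - sequence_score(node, edge, y_true)

# Hypothetical example: 3 ordinal labels, so the cost grows with distance.
K = 3
ordinal_cost = np.abs(np.subtract.outer(np.arange(K), np.arange(K))).astype(float)
rng = np.random.default_rng(0)
node = rng.normal(size=(4, K))
edge = rng.normal(size=(K, K))
y_true = np.array([0, 2, 1, 1])
print(cost_augmented_softmax_loss(node, edge, y_true, ordinal_cost))
```

With an all-zero cost matrix this sketch reduces to the ordinary CRF negative log-likelihood. Log-sum-exp is convex, so the loss stays convex when the potentials are linear in the parameters.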
Pages: 3683-3692
Number of pages: 10