Large margin cost-sensitive learning of conditional random fields

Cited by: 15

Authors:

Kim, Minyoung [1]

Affiliations:

[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA

Source:

PATTERN RECOGNITION | 2010 / Vol. 43 / Issue 10

Keywords:

Conditional random fields; Cost-sensitive learning; Classification

DOI:

10.1016/j.patcog.2010.05.013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We tackle the structured output classification problem using the Conditional Random Fields (CRFs). Unlike the standard 0/1 loss case, we consider a cost-sensitive learning setting where we are given a non-0/1 misclassification cost matrix at the individual output level. Although the task of cost-sensitive classification has many interesting practical applications that retain domain-specific scales in the output space (e.g., hierarchical or ordinal scale), most CRF learning algorithms are unable to effectively deal with the cost-sensitive scenarios as they merely assume a nominal scale (hence 0/1 loss) in the output space. In this paper, we incorporate the cost-sensitive loss into the large margin learning framework. By large margin learning, the proposed algorithm inherits most benefits from the SVM-like margin-based classifiers, such as the provable generalization error bounds. Moreover, the soft-max approximation employed in our approach yields a convex optimization similar to the standard CRF learning with only slight modification in the potential functions. We also provide the theoretical cost-sensitive generalization error bound. We demonstrate the improved prediction performance of the proposed method over the existing approaches in a diverse set of sequence/image structured prediction problems that often arise in pattern recognition and computer vision domains. (C) 2010 Elsevier Ltd. All rights reserved.

References

Pages: 3683-3692

Page count: 10

30 entries in total

[1]

ALTUN Y, 2003, INT C MACH LEARN WAS

[2]

[Anonymous], 2001, Journal of Machine Learning Research

[3]

[Anonymous], 2003, Exploring artificial intelligence in the new millennium, DOI 10.5555/779343.779352

[4] The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network [J].

Bartlett, PL .

IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, 44 (02) :525-536

[5] Semantic object classes in video: A high-definition ground truth database [J].

Brostow, Gabriel J. ;

Fauqueur, Julien ;

Cipolla, Roberto .

PATTERN RECOGNITION LETTERS, 2009, 30 (02) :88-97

[6]

Brostow Gabriel J, 2008, ECCV, P44, DOI 10.1007/978-3-540-88682-2_5

[7]

Cherkassky V, 1997, IEEE Trans Neural Netw, V8, P1564, DOI 10.1109/TNN.1997.641482

[8]

CHU W, 2005, INT C MACH LEARN BON

[9]

Domingos P., 1999, P ACM SIGKDD INT C K, P155, DOI 10.1145/312129.312220

[10]

Elkan C., 2001, Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, P973, DOI 10.5555/1642194.1642224
