Latent Log-Linear Models for Handwritten Digit Classification

Cited by: 13
Authors
Deselaers, Thomas [1, 5]
Gass, Tobias [2, 5]
Heigold, Georg [3, 5]
Ney, Hermann [4, 5]
Affiliations
[1] Google Switzerland, CH-8002 Zurich, Switzerland
[2] ETH, Comp Vis Lab, CH-8092 Zurich, Switzerland
[3] Google Inc, Mountain View, CA 94043 USA
[4] Rhein Westfal TH Aachen, Lehrstuhl Informat 6, D-52056 Aachen, Germany
[5] Univ Aachen, RWTH, Dept Comp Sci, Human Language Technol & Pattern Recognit Grp, Aachen, Germany
Keywords
Log-linear models; latent variables; conditional random fields; OCR; image classification; RECOGNITION;
DOI
10.1109/TPAMI.2011.218
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present latent log-linear models, an extension of log-linear models that incorporates latent variables, and we propose two applications of them: log-linear mixture models and image deformation-aware log-linear models. The resulting models are fully discriminative, can be trained efficiently, and allow the model complexity to be controlled. Log-linear mixture models offer additional flexibility within the log-linear modeling framework. Unlike previous approaches, the image deformation-aware model directly considers image deformations and allows for discriminative training of the deformation parameters. Both models are trained using alternating optimization. For certain variants, convergence to a stationary point is guaranteed; in practice, even variants without this guarantee converge and find models that perform well. We tune the methods on the USPS data set and evaluate on the MNIST data set, demonstrating the generalization capabilities of our proposed models. Although our models use significantly fewer parameters, they obtain results competitive with models proposed in the literature.
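The log-linear mixture model named in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which each class c has several components m, each with its own weight vector and bias, and the latent component index is summed out in both the numerator and the normalizer of the class posterior. The names `mixture_posterior`, `weights`, and `biases` are hypothetical and do not come from the paper; training by alternating optimization and the deformation-aware variant are not shown.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def mixture_posterior(x, weights, biases):
    """Class posteriors p(c|x) of a log-linear mixture model (assumed form).

    weights[c][m] is the weight vector of component m of class c and
    biases[c][m] is the matching bias. The latent component index m is
    summed out per class, then the class scores are normalized.
    """
    class_scores = []
    for class_weights, class_biases in zip(weights, biases):
        component_scores = [
            b + sum(w_i * x_i for w_i, x_i in zip(w, x))
            for w, b in zip(class_weights, class_biases)
        ]
        # log of the sum over components of exp(score)
        class_scores.append(log_sum_exp(component_scores))
    log_norm = log_sum_exp(class_scores)
    return [math.exp(s - log_norm) for s in class_scores]


# Two classes, two components each, two-dimensional features.
example_weights = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
example_biases = [[0.0, 0.0], [0.0, 0.0]]
posterior = mixture_posterior([1.0, 0.0], example_weights, example_biases)
```

With a single component per class, this sketch reduces to an ordinary log-linear (softmax) classifier, which is one way to see the mixture as an extension of the basic model.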
Pages: 1105-1117
Number of pages: 13