Fast PRISM: Branch and Bound Hough Transform for Object Class Detection

Cited by: 35
Authors
Lehmann, Alain [1]
Leibe, Bastian [2]
Van Gool, Luc [1, 3]
Affiliations
[1] Swiss Fed Inst Technol, Comp Vis Lab, Zurich, Switzerland
[2] Rhein Westfal TH Aachen, UMIC Res Ctr, Aachen, Germany
[3] Katholieke Univ Leuven, ESAT PSI IBBT, Louvain, Belgium
Funding
Swiss National Science Foundation
Keywords
Object detection; Hough transform; Sliding-window; Branch and bound; Soft-matching; Spatial pyramid histograms; SCALE;
DOI
10.1007/s11263-010-0342-x
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the task of efficient object class detection by means of the Hough transform. This approach has been made popular by the Implicit Shape Model (ISM) and has been adopted many times. Although ISM exhibits robust detection performance, its probabilistic formulation is unsatisfactory. The PRincipled Implicit Shape Model (PRISM) overcomes these problems by interpreting Hough voting as a dual implementation of linear sliding-window detection. It thereby gives a sound justification to the voting procedure and imposes minimal constraints. We demonstrate PRISM's flexibility by two complementary implementations: a generatively trained Gaussian Mixture Model as well as a discriminatively trained histogram approach. Both systems achieve state-of-the-art performance. Detections are found by gradient-based or branch and bound search, respectively. The latter greatly benefits from PRISM's feature-centric view. It thereby avoids the unfavourable memory trade-off and any on-line pre-processing of the original Efficient Subwindow Search (ESS). Moreover, our approach takes account of the features' scale value while ESS does not. Finally, we show how to avoid soft-matching and spatial pyramid descriptors during detection without losing their positive effect. This makes algorithms simpler and faster. Both are possible if the object model is properly regularised and we discuss a modification of SVMs which allows for doing so.
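The duality the abstract describes, Hough voting as another way of evaluating a linear sliding-window score, can be illustrated with a toy 1-D sketch. Everything below is a hypothetical illustration rather than the paper's implementation: the function names, the codeword weight table `weights`, and the feature list are invented for the example, and scale is ignored. The object-centric loop scores each candidate location by summing the weights of the features it contains; the feature-centric loop lets each feature cast weighted votes for the locations it supports. Both fill the same score map.

```python
# Toy 1-D illustration: linear sliding-window scoring vs. Hough-style voting.
# All names and numbers are hypothetical; they are not taken from the paper.

def sliding_window_scores(features, weights, num_locations):
    """Object-centric view: score each candidate location by summing the
    weights of the features that fall inside its window."""
    window = len(weights[0])
    scores = [0.0] * num_locations
    for loc in range(num_locations):
        for codeword, pos in features:
            offset = pos - loc  # feature position relative to the window origin
            if 0 <= offset < window:
                scores[loc] += weights[codeword][offset]
    return scores


def hough_vote_scores(features, weights, num_locations):
    """Feature-centric view: each feature casts one weighted vote per
    relative offset, landing on the location that offset implies."""
    scores = [0.0] * num_locations
    for codeword, pos in features:
        for offset, weight in enumerate(weights[codeword]):
            loc = pos - offset  # location this vote supports
            if 0 <= loc < num_locations:
                scores[loc] += weight
    return scores


# weights[codeword][offset]: assumed linear model over (codeword, offset) bins.
weights = [
    [1.0, 0.5, -0.25],
    [0.0, 2.0, 0.75],
]
# Each feature is (codeword index, position along the 1-D "image").
features = [(0, 2), (1, 3), (0, 5), (1, 6)]

window_scores = sliding_window_scores(features, weights, 8)
vote_scores = hough_vote_scores(features, weights, 8)
print(window_scores == vote_scores)
```

In this sketch the voting loop visits only the features that are present, whereas the sliding-window loop visits every location; that difference in iteration order is the kind of feature-centric saving the abstract alludes to.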
Pages: 175-197
Number of pages: 23