Learning Active Basis Model for Object Detection and Recognition

Cited by: 0
Authors
Ying Nian Wu
Zhangzhang Si
Haifeng Gong
Song-Chun Zhu
Affiliations
[1] University of California, Department of Statistics
[2] Lotus Hill Research Institute
Source
International Journal of Computer Vision | 2010 / Vol. 90
Keywords
Deformable template; Generative model; Shared sketch algorithm; Sum maps and max maps; Wavelet sparse coding
DOI
Not available
Abstract
This article proposes an active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates. In our generative model, a deformable template is in the form of an active basis, which consists of a small number of Gabor wavelet elements at selected locations and orientations. These elements are allowed to slightly perturb their locations and orientations before they are linearly combined to generate the observed image. The active basis model, in particular, the locations and the orientations of the basis elements, can be learned from training images by the shared sketch algorithm. The algorithm selects the elements of the active basis sequentially from a dictionary of Gabor wavelets. When an element is selected at each step, the element is shared by all the training images, and the element is perturbed to encode or sketch a nearby edge segment in each training image. The recognition of the deformable template from an image can be accomplished by a computational architecture that alternates the sum maps and the max maps. The computation of the max maps deforms the active basis to match the image data, and the computation of the sum maps scores the template matching by the log-likelihood of the deformed active basis.
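The alternation of max maps and sum maps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it rests on several assumptions: the input `sum1` is taken to be a precomputed array of Gabor filter responses of shape (orientations, height, width), the perturbation range is simplified to a square neighborhood of pixel shifts plus a wrap-around orientation shift, and the per-element weight `lam` and normalizing term `log_z` stand in for the learned log-likelihood parameters. The function names `max1_maps` and `sum2_map`, and the element tuple layout, are hypothetical.

```python
import numpy as np

def max1_maps(sum1, shift=1, dori=0):
    """Local maximum of filter responses over small position and orientation
    perturbations (simplified: square pixel neighborhood, circular orientation)."""
    n_ori, h, w = sum1.shape
    # Pad with -inf so that out-of-image positions never win the maximum.
    padded = np.pad(sum1, ((0, 0), (shift, shift), (shift, shift)),
                    mode="constant", constant_values=-np.inf)
    out = np.full(sum1.shape, -np.inf)
    for do in range(-dori, dori + 1):
        rolled = np.roll(padded, do, axis=0)  # neighboring orientations
        for dy in range(2 * shift + 1):
            for dx in range(2 * shift + 1):
                out = np.maximum(out, rolled[:, dy:dy + h, dx:dx + w])
    return out

def sum2_map(resp_maps, elements):
    """Template score at every pixel: sum over elements of lam * response - log_z.
    Each element is (dy, dx, ori, lam, log_z), with offsets relative to the
    template center. Responses outside the image are treated as 0 (assumption)."""
    n_ori, h, w = resp_maps.shape
    score = np.zeros((h, w))
    for dy, dx, ori, lam, log_z in elements:
        y0, y1 = max(0, -dy), min(h, h - dy)
        x0, x1 = max(0, -dx), min(w, w - dx)
        resp = np.zeros((h, w))
        resp[y0:y1, x0:x1] = resp_maps[ori, y0 + dy:y1 + dy, x0 + dx:x1 + dx]
        score += lam * resp - log_z
    return score

# Toy data: two "edges", the second displaced by one pixel from where the
# template expects it.
sum1 = np.zeros((3, 9, 9))
sum1[0, 4, 4] = 1.0
sum1[1, 4, 7] = 1.0
elements = [(0, 0, 0, 1.0, 0.0),   # expects orientation 0 at the center
            (0, 2, 1, 1.0, 0.0)]   # expects orientation 1 two pixels to the right

deformed = sum2_map(max1_maps(sum1, shift=1, dori=0), elements)
rigid = sum2_map(sum1, elements)
```

In this toy setup, scoring on the max maps lets the second element reach the displaced edge, whereas scoring directly on the raw responses matches only the first element. Comparing `deformed[4, 4]` with `rigid[4, 4]` shows the effect of allowing local perturbation.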
Pages: 198-235
Page count: 37