Seeing the Objects Behind the Dots: Recognition in Videos from a Moving Camera

Cited by: 0
Authors
Björn Ommer
Theodor Mader
Joachim M. Buhmann
Affiliations
[1] University of California, Department of EECS
[2] ETH Zurich, Department of Computer Science
Source
International Journal of Computer Vision | 2009 / Vol. 83
Keywords
Object recognition; Segmentation; Tracking; Video analysis; Compositionality; Visual learning;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Category-level object recognition, segmentation, and tracking in videos become highly challenging when applied to sequences from a hand-held camera that features extensive motion and zooming. An additional challenge is then to develop a fully automatic video analysis system that works without manual initialization of a tracker or other human intervention, both during training and during recognition, despite background clutter and other distracting objects. Moreover, our working hypothesis states that category-level recognition is possible based only on an erratic, flickering pattern of interest point locations without extracting additional features. Compositions of these points are then tracked individually by estimating a parametric motion model. Groups of compositions segment a video frame into the various objects that are present and into background clutter. Objects can then be recognized and tracked based on the motion of their compositions and on the shape they form. Finally, the combination of this flow-based representation with an appearance-based one is investigated. Besides evaluating the approach on a challenging video categorization database with significant camera motion and clutter, we also demonstrate that it generalizes to action recognition in a natural way.
Pages: 57-71
Page count: 14