Interpretation of complex scenes using dynamic tree-structure Bayesian networks

Cited by: 7

Authors:

Todorovic, Sinisa [1]

Nechyba, Michael C.

Affiliations:

[1] Univ Illinois, Comp Vis & Robot Lab, Beckman Inst Adv Sci & Technol, Urbana, IL 61801 USA

[2] Pittsburgh Pattern Recognit Inc, Pittsburgh, PA 15222 USA

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2007 / Vol. 106 / Issue 01

Keywords:

generative models; Bayesian networks; dynamic trees; variational inference; image segmentation; object recognition;

DOI:

10.1016/j.cviu.2005.09.005

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper addresses the problem of object detection and recognition in complex scenes, where objects are partially occluded. The approach presented herein is based on the hypothesis that a careful analysis of visible object details at various scales is critical for recognition in such settings. In general, however, computational complexity becomes prohibitive when trying to analyze multiple sub-parts of multiple objects in an image. To alleviate this problem, we propose a generative-model framework, namely dynamic tree-structure belief networks (DTSBNs). This framework formulates object detection and recognition as inference of DTSBN structure and image-class conditional distributions, given an image. The causal (Markovian) dependencies in DTSBNs allow for the design of computationally efficient inference, as well as for interpretation of the estimated structure as follows: each root represents a whole distinct object, while child nodes down the sub-tree represent parts of that object at various scales. Therefore, within the DTSBN framework, the treatment and recognition of object parts requires no additional training, but merely a particular interpretation of the tree/subtree structure. This property leads to a strategy for recognition of objects as a whole through recognition of their visible parts. Our experimental results demonstrate that this approach remarkably outperforms strategies without explicit analysis of object parts. (c) 2006 Elsevier Inc. All rights reserved.
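The structural interpretation described in the abstract (each root stands for a whole object, its descendants for parts of that object) can be illustrated with a minimal Python sketch. This is not the authors' inference algorithm: it assumes the tree structure has already been estimated and is given as a hypothetical `parent` mapping, and the node names and the helpers `find_root` and `group_by_root` are invented here purely for illustration.

```python
from collections import defaultdict


def find_root(parent, node):
    """Follow parent links until reaching a node with no parent (a root)."""
    while parent[node] is not None:
        node = parent[node]
    return node


def group_by_root(parent):
    """Map each root (object) to the sorted list of its descendants (parts)."""
    groups = defaultdict(list)
    for node in parent:
        root = find_root(parent, node)
        if node == root:
            groups[root]  # touch the key so childless roots still appear
        else:
            groups[root].append(node)
    return {root: sorted(nodes) for root, nodes in groups.items()}


# Hypothetical estimated forest: two roots, "A" and "B", i.e. two objects.
# This structure is an assumption for illustration, not data from the paper.
parent = {
    "A": None, "B": None,   # roots: whole objects
    "a1": "A", "a2": "A",   # parts of A at a finer scale
    "a11": "a1",            # sub-part of a1 at an even finer scale
    "b1": "B",              # part of B
}

objects = group_by_root(parent)
print(objects)
```

Under this reading, recognizing an object part needs no separate model: a part is simply a sub-tree, and its object identity is found by walking up to the root.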

Citation

Pages: 71-84

Page count: 14
