Modeling inter-camera space-time and appearance relationships for tracking across non-overlapping views

Cited by: 189
Authors
Javed, Omar [1 ]
Shafique, Khurram [2 ]
Rasheed, Zeeshan [1 ]
Shah, Mubarak [2 ]
Affiliations
[1] ObjectVideo, Reston, VA 20171 USA
[2] Univ Cent Florida, Orlando, FL 32816 USA
Keywords
multi-camera appearance models; non-overlapping cameras; scene analysis; multi-camera tracking; surveillance;
DOI
10.1016/j.cviu.2007.01.003
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tracking across cameras with non-overlapping views is a challenging problem. First, the observations of an object are often widely separated in time and space when viewed from non-overlapping cameras. Second, the appearance of an object in one camera view might be very different from its appearance in another camera view due to differences in illumination, pose, and camera properties. To deal with the first problem, we observe that people and vehicles tend to follow the same paths in most cases, e.g., roads, walkways, and corridors. The proposed algorithm uses this conformity in the traversed paths to establish correspondence. The algorithm learns this conformity, and hence the inter-camera relationships, in the form of a multivariate probability density of space-time variables (entry and exit locations, velocities, and transition times) using kernel density estimation. To handle the appearance change of an object as it moves from one camera to another, we show that all brightness transfer functions from a given camera to another camera lie in a low-dimensional subspace. This subspace is learned using probabilistic principal component analysis and used for appearance matching. The proposed approach does not require explicit inter-camera calibration; rather, the system learns the camera topology and the subspace of inter-camera brightness transfer functions during a training phase. Once training is complete, correspondences are assigned in a maximum likelihood (ML) estimation framework using both location and appearance cues. Experiments with real-world videos are reported that validate the proposed approach. (C) 2007 Elsevier Inc. All rights reserved.
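
To make the two cues described in the abstract concrete, the sketch below (not the authors' implementation) shows how a space-time kernel density estimate and a low-dimensional brightness transfer function (BTF) subspace could be learned and combined into a correspondence log-likelihood. The feature layout, the SciPy/scikit-learn calls, and the use of plain-PCA reconstruction error in place of the paper's probabilistic PCA likelihood are assumptions made for illustration.

import numpy as np
from scipy.stats import gaussian_kde
from sklearn.decomposition import PCA


def learn_space_time_kde(training_features):
    """KDE over space-time variables for one camera pair.

    training_features: (n_samples, d) array with assumed columns
    [exit_x, exit_y, entry_x, entry_y, speed, transition_time].
    """
    # gaussian_kde expects shape (d, n_samples); the bandwidth here is chosen
    # automatically (Scott's rule), which need not match the paper's choice.
    return gaussian_kde(training_features.T)


def brightness_transfer_function(hist_a, hist_b, n_bins=256):
    """Estimate a BTF mapping brightness in camera A to camera B from
    normalized cumulative object histograms (histogram specification)."""
    cdf_a = np.cumsum(hist_a) / np.sum(hist_a)
    cdf_b = np.cumsum(hist_b) / np.sum(hist_b)
    bins = np.arange(n_bins)
    # f(b) = H_b^{-1}(H_a(b)), with H_b inverted by interpolation.
    return np.interp(cdf_a, cdf_b, bins)


def learn_btf_subspace(training_btfs, n_components=5):
    """Fit a low-dimensional subspace to a stack of training BTFs
    (shape: n_samples x n_bins). The paper uses probabilistic PCA;
    plain PCA is used here as a simpler stand-in."""
    pca = PCA(n_components=n_components)
    pca.fit(training_btfs)
    return pca


def correspondence_log_likelihood(kde, pca, space_time_feature, observed_btf,
                                  appearance_weight=1.0):
    """Combine the space-time and appearance cues for one candidate match."""
    st_loglik = np.log(kde(space_time_feature.reshape(-1, 1))[0] + 1e-12)
    # Appearance term: squared distance of the observed BTF from the learned
    # subspace, used as a proxy for the probabilistic-PCA likelihood.
    reconstruction = pca.inverse_transform(
        pca.transform(observed_btf.reshape(1, -1)))[0]
    app_loglik = -appearance_weight * np.sum((observed_btf - reconstruction) ** 2)
    return st_loglik + app_loglik

At test time, one would evaluate this log-likelihood for every candidate pair of exit and entry observations and choose the maximum-likelihood assignment, for example via the Hungarian algorithm on the negative log-likelihoods.
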
Pages: 146-162
Number of pages: 17