Graph Neural Networks for Cross-Camera Data Association

Cited by: 5

Authors:

Luna, Elena [1]

SanMiguel, Juan C. [1]

Martinez, Jose M. [1]

Carballeira, Pablo [1]

Affiliations:

[1] Univ Autonoma Madrid, Video Proc & Understanding Lab, Madrid 28049, Spain

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2023, Vol. 33, No. 2

Keywords:

Cameras; Task analysis; Image edge detection; Three-dimensional displays; Message passing; Graph neural networks; Feature extraction; Data association; cross-camera; graph neural network; message passing network; MULTITARGET; TRACKING;

DOI:

10.1109/TCSVT.2022.3207223

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Cross-camera image data association is essential for many multi-camera computer vision tasks, such as multi-camera pedestrian detection, multi-camera multi-target tracking, and 3D pose estimation. This association task is typically modeled as a bipartite graph matching problem and often solved by applying minimum-cost flow techniques, which may be computationally demanding for large data. Furthermore, cameras are usually processed in pairs, yielding local solutions rather than a single global solution for all cameras at once. Another key issue is the affinity function: non-learnable, pre-defined distances, such as the Euclidean and cosine distances, are in widespread use. This paper proposes an effective approach for cross-camera data association focused on a global solution, instead of processing cameras in pairs. To avoid the use of fixed distances and thresholds, we leverage the connectivity of Graph Neural Networks, previously unused in this scope, using a Message Passing Network to jointly learn features and similarity functions. We validate the proposal for pedestrian cross-camera association, showing results over the EPFL multi-camera pedestrian dataset. Our approach considerably outperforms the data association techniques in the literature, without needing to be trained in the same scenario in which it is tested.
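
For orientation, the pairwise baseline that the abstract contrasts against (bipartite matching between two cameras with a fixed cosine distance and a fixed threshold) might be sketched as below. This is an illustrative sketch only, not the paper's method: the function names, the `max_distance` threshold, and the toy feature vectors are all assumptions, and an exhaustive search stands in for a minimum-cost flow or Hungarian solver, so it is only practical for a handful of detections.

```python
from itertools import permutations
from math import sqrt


def cosine_distance(a, b):
    """Fixed (non-learned) cosine distance; assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def associate_pair(feats_a, feats_b, max_distance=0.5):
    """Minimum-cost bipartite matching between the detections of two cameras.

    Returns sorted (index_in_a, index_in_b) pairs. `max_distance` is a
    hypothetical hand-set threshold of the kind the abstract argues against.
    """
    # Always enumerate assignments of the smaller set into the larger one.
    swap = len(feats_a) > len(feats_b)
    small, large = (feats_b, feats_a) if swap else (feats_a, feats_b)
    cost = [[cosine_distance(s, l) for l in large] for s in small]

    # Exhaustive search over one-to-one assignments (toy sizes only).
    best_perm, best_cost = (), float("inf")
    for perm in permutations(range(len(large)), len(small)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_cost:
            best_perm, best_cost = perm, total

    # Discard matched pairs whose distance exceeds the fixed threshold.
    matches = []
    for i, j in enumerate(best_perm):
        if cost[i][j] <= max_distance:
            matches.append((j, i) if swap else (i, j))
    return sorted(matches)


# Toy appearance features for two cameras (made-up values).
cam_a = [[1.0, 0.0], [0.0, 1.0]]
cam_b = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
pairs = associate_pair(cam_a, cam_b)
```

Running such a routine independently for every camera pair gives only local solutions, which is the limitation the abstract raises; the proposed approach instead learns features and similarities jointly over a graph spanning all cameras.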


Pages: 589-601

Page count: 13
