Multiple object tracking based on appearance and motion graph convolutional neural networks with an explainer

Cited by: 0

Authors
Zhang Y. [1 ]
Huang Q. [1 ,2 ]
Zheng L. [1 ]
Affiliations
[1] School of Computer Science and Technology, Harbin Engineering University, 145 Nantong Street, Harbin, Heilongjiang
[2] School of Computer Science and Technology, University of Chinese Academy of Sciences, Huairou District, Beijing
Funding
National Natural Science Foundation of China
Keywords
Explainer; Feature fusion; Graph neural networks; Multi-object tracking
DOI
10.1007/s00521-024-09773-0
Abstract
The tracking performance of Multi-Object Tracking (MOT) has recently been improved by using discriminative appearance and motion features. However, dense crowds and occlusions significantly reduce the reliability of these features, resulting in unsatisfactory tracking performance. We therefore design an end-to-end MOT model based on Graph Convolutional Neural Networks (GCNNs) that fuses four classes of features characterizing objects by their appearance, motion, appearance interactions, and motion interactions. Specifically, a Re-Identification (Re-ID) module extracts discriminative appearance features, and the appearance features within each object tracklet are averaged to simplify the proposed tracker. Two GCNNs are then designed to better distinguish objects: one extracts interactive appearance features, the other interactive motion features. A fusion module combines these features into a global feature similarity, from which an association component computes the MOT matching results. Finally, we semantically visualize the relevant structures with GNNExplainer to gain insight into the proposed tracker. Evaluation on the MOT16 and MOT17 benchmarks shows that our model outperforms state-of-the-art online tracking methods in Multi-Object Tracking Accuracy and Identification F1 score, consistent with the results from GNNExplainer. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
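The fusion described in the abstract — per-object appearance and motion features passed through two graph-convolution branches to obtain interactive features, then concatenated into a global similarity for association — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the learned weights, the adjacency construction, and the use of plain cosine similarity are all placeholder assumptions.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One symmetric-normalized graph convolution: ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))   # degree normalization
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)  # ReLU activation

def fused_similarity(app_feats, mot_feats, adj, w_app, w_mot):
    """Fuse appearance, motion, and their interactive (graph-convolved) features,
    then return a pairwise cosine-similarity matrix for association."""
    ia = gcn_layer(adj, app_feats, w_app)   # interactive appearance features
    im = gcn_layer(adj, mot_feats, w_mot)   # interactive motion features
    fused = np.concatenate([app_feats, mot_feats, ia, im], axis=1)
    fused /= np.linalg.norm(fused, axis=1, keepdims=True) + 1e-8
    return fused @ fused.T                  # global feature similarity
```

In a real tracker the similarity matrix would feed a bipartite assignment step (e.g. the Hungarian algorithm) between tracklets and detections; here the fully connected adjacency and random weights stand in for the learned components.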
Pages: 13799–13814 (15 pages)