Deep Learning-Based 3D Multi-Object Tracking Using Multimodal Fusion in Smart Cities

Cited by: 1
Authors
Li, Hui [1, 2]
Liu, Xiang [1]
Jia, Hong [3]
Ahanger, Tariq Ahamed [4]
Xu, Lingwei [1, 2, 3]
Alzamil, Zamil [5]
Li, Xingwang [6]
Affiliations
[1] Qingdao Univ Sci & Technol, Sch Informat Sci & Technol, Qingdao, Peoples R China
[2] Minist Educ, Engn Res Ctr Integrat & Applicat Digital Learning, Beijing, Peoples R China
[3] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart Cities, Xiamen, Peoples R China
[4] Prince Sattam bin Abdulaziz Univ, Coll Comp Engn & Sci, Alkharj, Saudi Arabia
[5] Majmaah Univ, Coll Comp & Informat Sci, Dept Comp Sci, Al Majmaah, Saudi Arabia
[6] Henan Polytech Univ, Sch Phys & Elect Informat Engn, Jiaozuo, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Smart Cities; Visual Perception; 3D Multi-Object Tracking; Multimodal Feature Fusion; Position Affinity Matrix; Data Association; Object Detection; LiDAR
DOI
10.22967/HCIS.2024.14.047
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The intelligent processing of visual perception information is one of the core technologies of smart cities. Deep learning-based 3D multi-object tracking is important for improving the intelligence and safety of robots in smart cities. However, 3D multi-object tracking still faces many challenges due to the complexity of the environment and the uncertainty of objects. In this paper, we exploit the multimodal information of images and point clouds and propose a multimodal adaptive feature gating fusion module to improve feature fusion. In the object association stage, we design an orientation-position-aware affinity matrix (EO-IoU) that combines Euclidean distance, orientation similarity, and intersection over union; it addresses the association failures that occur when the detection box and the prediction box overlap little or not at all. We also adopt a more robust two-stage data association method to reduce the trajectory fragmentation and identity switching caused by discarding low-scoring detection boxes. Extensive experiments on the KITTI and nuScenes benchmark datasets demonstrate that our method outperforms existing state-of-the-art methods, with better robustness and accuracy.
Pages: 19