FFTransMOT: Feature-Fused Transformer for Enhanced Multi-Object Tracking

Cited by: 2

Authors:

Hu, Xufeng [1]

Jeon, Younghoon [2]

Gwak, Jeonghwan [1,2,3,4]

Affiliations:

[1] Korea Natl Univ Transportat, Dept IT Energy Convergence, Chungju 27469, South Korea

[2] Korea Natl Univ Transportat, Dept Software, Chungju 27469, South Korea

[3] Korea Natl Univ Transportat, Dept Biomed Engn, Chungju 27469, South Korea

[4] Korea Natl Univ Transportat, Dept AI Robot Engn, Chungju 27469, South Korea

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Feature extraction; Transformers; Trajectory; Videos; Tracking; Decoding; Data models; Computer vision; Object tracking; feature fusion; multi-object tracking; object identification

DOI:

10.1109/ACCESS.2023.3327262

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In the field of computer vision, multi-object tracking (MOT) is a crucial task. It involves identifying, tracking, and classifying multiple objects in videos and connecting their trajectories to form complete motion sequences. MOT comprises two core components: object detection and data association. This entails detecting objects in each frame, determining which objects to track, associating them with the next frame, and predicting their future trajectories. In this paper, we propose a model named Feature-Fused Transformer for Enhanced Multi-Object Tracking (FFTransMOT). In the FFTransMOT framework, a feature fusion module synthesizes a robust representation of object features by combining information from the current and previous frames. This fusion strengthens the feature set and makes it more reliable for the decoder's subsequent data association: the decoder matches $\text{frame}_{t}$ against the newly fused features, which improves how accurately objects are matched across frames over time. Additionally, we employ a self-attention mechanism to capture dependencies between input features, thereby enhancing the accuracy and stability of object detection. To validate the performance of the proposed FFTransMOT model, we evaluated it on four datasets (MOT16, MOT17, DanceTrack, and BDD100K). The experimental results demonstrate that FFTransMOT outperforms other trackers in tracking accuracy and robustness in MOT tasks.


Pages: 130060-130071

Page count: 12

50 items in total

[31] Joint detection and embedding of multi-object tracking with feature decoupling
Laiwei Jiang
Ce Wang
Hongyu Yang
Signal, Image and Video Processing, 2025, 19 (8)
[32] STMT: Spatio-temporal memory transformer for multi-object tracking
Gu, Songbo
Ma, Jianxin
Hui, Guancheng
Xiao, Qiyang
Shi, Wentao
APPLIED INTELLIGENCE, 2023, 53 (20) : 23426 - 23441
[33] Transformer-Based Multi-object Tracking in Unmanned Aerial Vehicles
Li, Jiaxin
Li, Hongjun
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VI, 2024, 14430 : 347 - 358
[35] Temporal-Spatial Feature Interaction Network for Multi-Drone Multi-Object Tracking
Wu, Han
Sun, Hao
Ji, Kefeng
Kuang, Gangyao
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (02) : 1165 - 1179
[36] Multi-object tracking with inter-feedback between detection and tracking
Tian, Shu
Yuan, Fei
Xia, Gui-Song
NEUROCOMPUTING, 2016, 171 : 768 - 780
[37] 3D LiDAR Multi-Object Tracking Using Multi Positive Contrastive Learning and Deep Reinforcement Learning
Cho, Minho
Kim, Euntai
IEEE ACCESS, 2025, 13 : 12447 - 12457
[38] Transformer-Based Band Regrouping With Feature Refinement for Hyperspectral Object Tracking
Wang, Hanzheng
Li, Wei
Xia, Xiang-Gen
Du, Qian
Tian, Jing
Shen, Qing
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
[39] ETTrack: enhanced temporal motion predictor for multi-object tracking
Han, Xudong
Oishi, Nobuyuki
Tian, Yueying
Ucurum, Elif
Young, Rupert
Chatwin, Chris
Birch, Philip
APPLIED INTELLIGENCE, 2025, 55 (01)
[40] Dual-Stream Feature Fusion Network for Detection and ReID in Multi-object Tracking
He, Qingyou
Li, Liangqun
PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2022, 13629 : 247 - 260
