FFTransMOT: Feature-Fused Transformer for Enhanced Multi-Object Tracking

Cited by: 2
Authors
Hu, Xufeng [1]
Jeon, Younghoon [2]
Gwak, Jeonghwan [1, 2, 3, 4]
Affiliations
[1] Korea Natl Univ Transportat, Dept IT Energy Convergence, Chungju 27469, South Korea
[2] Korea Natl Univ Transportat, Dept Software, Chungju 27469, South Korea
[3] Korea Natl Univ Transportat, Dept Biomed Engn, Chungju 27469, South Korea
[4] Korea Natl Univ Transportat, Dept AI Robot Engn, Chungju 27469, South Korea
Keywords
Feature extraction; Transformers; Trajectory; Videos; Tracking; Decoding; Data models; Computer vision; Object tracking; Feature fusion; Multi-object tracking; Object identification
DOI
10.1109/ACCESS.2023.3327262
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multi-object tracking (MOT) is a crucial task in computer vision. It involves identifying, tracking, and classifying multiple objects in videos and connecting their trajectories to form complete motion sequences. MOT comprises two core components: object detection and data association. This entails detecting objects in each frame, determining which objects to track, associating them with the objects in the next frame, and predicting their future trajectories. In this paper, we propose a model named Feature-Fused Transformer for Enhanced Multi-Object Tracking (FFTransMOT). In FFTransMOT, a feature fusion module builds a robust representation of object features by combining information from the current and previous frames. The fused features give the decoder a more reliable input for data association: the decoder matches $\text{frame}_{t}$ against the newly fused features to link objects across frames. Additionally, we employ a self-attention mechanism to capture dependencies between input features, thereby enhancing the accuracy and stability of object detection. To validate FFTransMOT, we evaluated it on four datasets (MOT16, MOT17, DanceTrack, and BDD100K). The experimental results show that FFTransMOT outperforms other trackers in tracking accuracy and robustness on MOT tasks.
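The fuse-then-associate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: FFTransMOT uses a learned fusion module and a transformer decoder, whereas this sketch assumes a fixed-weight blend of two feature vectors and greedy cosine-similarity matching. The names `fuse`, `cosine`, `associate`, `alpha`, and `min_sim` are hypothetical and chosen only for illustration.

```python
import math


def fuse(prev_feat, curr_feat, alpha=0.5):
    """Blend two feature vectors of one object (assumed fixed-weight fusion)."""
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(prev_feat, curr_feat)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def associate(track_feats, det_feats, min_sim=0.5):
    """Greedy one-to-one matching, most similar pairs first.

    Returns a dict mapping track index -> detection index. Pairs whose
    similarity falls below min_sim (an assumed threshold) stay unmatched.
    """
    pairs = sorted(
        ((cosine(t, d), ti, di)
         for ti, t in enumerate(track_feats)
         for di, d in enumerate(det_feats)),
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for sim, ti, di in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


# Toy data: two tracked objects, each observed in two earlier frames.
older = [[1.0, 0.0], [0.0, 1.0]]
newer = [[0.8, 0.2], [0.2, 0.8]]
fused = [fuse(p, c) for p, c in zip(older, newer)]

# Detections in frame t, listed in the opposite order to the tracks.
detections_t = [[0.1, 1.0], [1.0, 0.1]]
matches = associate(fused, detections_t)
print(sorted(matches.items()))  # [(0, 1), (1, 0)]
```

Greedy matching keeps the sketch dependency-free; an optimal assignment solver such as `scipy.optimize.linear_sum_assignment` is a common alternative when SciPy is available.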
Pages: 130060-130071
Page count: 12