CM-DASN: visible-infrared cross-modality person re-identification via dynamic attention selection network

Citations: 0
Authors
Li, Yuxin [1]
Lu, Hu [1]
Qin, Tingting [1]
Tu, Juanjuan [2]
Wu, Shengli [3]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang 212013, Jiangsu, Peoples R China
[2] Jiangsu Univ Sci & Technol, Sch Comp, Zhenjiang 212100, Jiangsu, Peoples R China
[3] Ulster Univ, Sch Comp, Belfast BT15 1ED, Northern Ireland
Keywords
Person re-identification; Visible-infrared; Cross-modality; Vision transformer
DOI
10.1007/s00530-025-01724-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cross-modality person re-identification between RGB (visible) and infrared (IR) images is challenging because of the large discrepancy between the two modalities. Existing approaches typically learn either modality-specific or modality-shared features; overemphasizing modality-specific features can hinder cross-modality matching, whereas modality-shared features are more beneficial for the task. To address this, we propose CM-DASN (Cross-Modality Dynamic Attention Selection Network), an approach based on dynamic attention optimization. Its core is the Dynamic Attention Selection Module (DASM), which adaptively selects the most effective combination of attention heads in the later stages of training, balancing the learning of modality-shared and modality-specific features. A softmax score-based feature selection mechanism extracts and enhances the most discriminative cross-modality feature representations. By alternating supervised learning of high-scoring modality-shared and modality-specific features in the later training stages, the model concentrates on highly discriminative modality-shared features while retaining beneficial modality-specific information. We further design a multi-stage, multi-scale cross-modality feature alignment strategy that aligns features of different scales progressively, in phases, so that both global structure and local details are taken into account. The method achieves higher cross-modality matching accuracy with minimal increases in model parameters and computation time. Extensive experiments on the SYSU-MM01 and RegDB datasets validate the framework and show that it outperforms most existing state-of-the-art approaches. The source code is available at https://github.com/hulu88/CM_DASN.
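The abstract describes selecting attention heads by softmax score but gives no formulas, so the following is only an illustrative sketch of that general idea, not the authors' DASM. It assumes each head has a raw scalar score (`head_logits`, a hypothetical input), keeps the `k` highest-scoring heads, and fuses their features with renormalized softmax weights. The names `softmax`, `select_heads`, `head_features`, `head_logits`, and `k` are all assumptions made for this example; the actual implementation is in the repository linked above.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_heads(head_features, head_logits, k):
    """Keep the k heads with the highest softmax scores (illustrative only).

    head_features: list of per-head feature vectors (lists of equal length).
    head_logits:   list of raw per-head scores (hypothetical; the abstract
                   does not say how the paper computes its scores).
    Returns (selected head indices in ascending order, fused feature vector).
    """
    scores = softmax(head_logits)
    # Rank head indices by score, highest first; ties keep original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = sorted(ranked[:k])
    # Renormalize the selected scores so the fusion weights sum to 1.
    norm = sum(scores[i] for i in top)
    dim = len(head_features[0])
    fused = [
        sum(scores[i] / norm * head_features[i][d] for i in top)
        for d in range(dim)
    ]
    return top, fused

# Toy input: four heads with 2-dimensional features.
head_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
head_logits = [0.1, 2.0, -1.0, 2.0]
top, fused = select_heads(head_features, head_logits, k=2)
```

In this toy input, heads 1 and 3 share the highest score, so they are selected and their features are averaged with equal weight. A hard top-k selection like this is one possible reading of "selects the most effective combination of attention heads"; the paper may use a different selection or weighting rule.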
Pages: 14