Transformer-based neural architecture search for effective visible-infrared person re-identification

Citations: 0
Authors
Sarker, Prodip Kumar [1 ]
Affiliations
[1] Begum Rokeya Univ, Dept Comp Sci & Engn, Rangpur 5400, Bangladesh
Keywords
Transformer; Neural architecture search; Attention mechanism; Feature extraction; Cross-modality;
DOI
10.1016/j.neucom.2024.129257
CLC classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Visible-infrared person re-identification (VI-reID) is a complex task in security and video surveillance that aims to identify and match a person captured by multiple non-overlapping cameras. In recent years, reID has advanced notably owing to the development of transformer-based architectures. Although many existing methods emphasize learning both modality-specific and shared features, challenges remain in fully exploiting the complementary information between the infrared and visible modalities. Consequently, there is still room to improve retrieval performance by effectively comprehending and integrating cross-modality semantic information. Moreover, existing designs often suffer from high model complexity and time-consuming processes. To tackle these issues, we propose a novel transformer-based neural architecture search (TNAS) deep learning approach for effective VI-reID. To alleviate modality gaps, we first introduce a global-local transformer (GLT) module that captures features at both global and local levels across different modalities, contributing to better feature representation and matching. Then, an efficient neural architecture search (NAS) module is developed to search for the optimal transformer-based architecture, which further enhances VI-reID performance. Additionally, we introduce a distillation loss and a modality discriminative (MD) loss to exploit the potential consistency between modalities, promoting intermodality separation between classes and intramodality compactness within classes. Experimental results on two challenging benchmark datasets show that our model achieves state-of-the-art results, outperforming existing VI-reID methods.
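The abstract's modality discriminative (MD) loss is described only at a high level: it should pull same-class features together within each modality and push different-class features apart across modalities. The sketch below is an illustrative NumPy interpretation of that idea, not the paper's exact formulation; the function name, the centroid-based formulation, and the `margin` hyperparameter are assumptions made for the example.

```python
import numpy as np

def modality_discriminative_loss(feats, labels, modalities, margin=0.3):
    """Illustrative sketch (not the paper's exact loss): intramodality
    compactness + hinge-based interclass centroid separation."""
    feats = np.asarray(feats, dtype=float)
    labels = np.asarray(labels)
    modalities = np.asarray(modalities)

    # Intramodality compactness: mean distance of each sample to its
    # per-modality, per-class centroid (skip groups with < 2 samples).
    compact, n_compact = 0.0, 0
    for m in np.unique(modalities):
        for c in np.unique(labels):
            mask = (modalities == m) & (labels == c)
            if mask.sum() < 2:
                continue
            centroid = feats[mask].mean(axis=0)
            compact += np.linalg.norm(feats[mask] - centroid, axis=1).mean()
            n_compact += 1

    # Interclass separation: hinge penalty when class centroids (pooled
    # over both modalities) fall closer than the margin.
    cents = {c: feats[labels == c].mean(axis=0) for c in np.unique(labels)}
    classes = sorted(cents)
    sep, n_sep = 0.0, 0
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            d = np.linalg.norm(cents[classes[i]] - cents[classes[j]])
            sep += max(0.0, margin - d)
            n_sep += 1

    return compact / max(n_compact, 1) + sep / max(n_sep, 1)
```

Well-separated, compact identity clusters drive the loss toward zero, while overlapping class centroids incur the hinge penalty; in the paper this term would be combined with the distillation loss during training.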
Pages: 10
Related papers
50 items in total
  • [41] A cross-modality person re-identification method for visible-infrared images
    Sun Y.
    Wang R.
    Zhang Q.
    Lin R.
    Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2024, 50 (06): : 2018 - 2025
  • [42] Learning Modality-Specific Representations for Visible-Infrared Person Re-Identification
    Feng, Zhanxiang
    Lai, Jianhuang
    Xie, Xiaohua
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 579 - 590
  • [43] Unbiased Feature Learning with Causal Intervention for Visible-Infrared Person Re-Identification
    Yuan, Bowen
    Lu, Jiahao
    You, Sisi
    Bao, Bing-kun
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (10)
  • [44] Adaptive Middle Modality Alignment Learning for Visible-Infrared Person Re-identification
    Zhang, Yukang
    Yan, Yan
    Lu, Yang
    Wang, Hanzi
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, : 2176 - 2196
  • [45] Learning dual attention enhancement feature for visible-infrared person re-identification
    Zhang, Guoqing
    Zhang, Yinyin
    Zhang, Hongwei
    Chen, Yuhao
    Zheng, Yuhui
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 99
  • [46] Multi-Stage Auxiliary Learning for Visible-Infrared Person Re-Identification
    Zhang, Huadong
    Cheng, Shuli
    Du, Anyu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (11) : 12032 - 12047
  • [47] An Effective Visible-Infrared Person Re-identification Network Based on Second-Order Attention and Mixed Intermediate Modality
    Tao, Haiyun
    Zhang, Yukang
    Lu, Yang
    Wang, Hanzi
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX, 2024, 14433 : 120 - 132
  • [48] A Three-Stage Framework for Video-Based Visible-Infrared Person Re-Identification
    Hou, Wei
    Wang, Wenxuan
    Yan, Yiming
    Wu, Di
    Xia, Qingyu
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 1254 - 1258
  • [49] Hierarchical disturbance and Group Inference for video-based visible-infrared person re-identification
    Zhou, Chuhao
    Zhou, Yuzhe
    Ren, Tingting
    Li, Huafeng
    Li, Jinxing
    Lu, Guangming
    INFORMATION FUSION, 2025, 117
  • [50] MDANet: Modality-Aware Domain Alignment Network for Visible-Infrared Person Re-Identification
    Cheng, Xu
    Yu, Hao
    Cheng, Kevin Ho Man
    Yu, Zitong
    Zhao, Guoying
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 2015 - 2027