Dual-Space Video Person Re-identification

Cited by: 0
Authors
Leng, Jiaxu [1, 2]
Kuang, Changjiang [1, 2]
Li, Shuang [1, 2]
Gan, Ji [1, 2]
Chen, Haosheng [1, 2]
Gao, Xinbo [1, 2]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Key Lab Image Cognit, Chongqing 400065, Peoples R China
[2] Chongqing Inst Brain & Intelligence, Guangyang Bay Lab, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video person re-identification; Hyperbolic space; Graph construction; Hierarchical relationships; Dual-space representation;
DOI
10.1007/s11263-025-02350-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Video person re-identification (VReID) aims to recognize individuals across video sequences. Existing methods primarily learn representations in Euclidean space, which struggles to capture complex hierarchical structures, especially in scenarios with occlusions and background clutter. In contrast, hyperbolic space, with its negatively curved geometry, excels at preserving hierarchical relationships and at discriminating between similar appearances. Motivated by these properties, we propose Dual-Space Video Person Re-Identification (DS-VReID), which exploits the strengths of both Euclidean and hyperbolic geometries: it captures visual features while also exploring the intrinsic hierarchical relations, thereby enhancing the discriminative capacity of the features. Specifically, we design the Dynamic Prompt Graph Construction (DPGC) module, which uses a pre-trained CLIP model with learnable dynamic prompts to construct 3D graphs that capture subtle changes and dynamic information in video sequences. Building upon this, we introduce the Hyperbolic Disentangled Aggregation (HDA) module, which addresses long-range dependency modeling by decoupling node distances and integrating adjacency matrices, capturing detailed spatial-temporal hierarchical relationships. Extensive experiments on benchmark datasets demonstrate the superiority of DS-VReID over state-of-the-art methods, showcasing its potential in complex VReID scenarios.
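To illustrate why negatively curved geometry can separate points that look close in Euclidean terms, the following is a minimal sketch of distances in the Poincaré ball model. It is not the authors' implementation: the abstract does not say which model of hyperbolic space DS-VReID uses, so the Poincaré ball with curvature -1 is an assumption, and the helper names `expmap0` and `poincare_distance` are hypothetical.

```python
import math


def sq_norm(x):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(c * c for c in x)


def expmap0(v):
    """Map a Euclidean (tangent) vector at the origin into the unit
    Poincare ball (curvature -1 assumed)."""
    n = math.sqrt(sq_norm(v))
    if n == 0.0:
        return list(v)
    scale = math.tanh(n) / n
    return [scale * c for c in v]


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


# Two pairs with the same Euclidean gap (0.1) but different positions.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.85, 0.0], [0.95, 0.0])
```

In this sketch `near_boundary` is larger than `near_origin` although both pairs are 0.1 apart in Euclidean terms. Distances grow rapidly toward the boundary of the ball, which is the property usually credited for embedding tree-like hierarchies with low distortion.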
Pages: 3667-3688
Page count: 22