A cross-modal pedestrian Re-ID algorithm based on dual attribute information

Cited: 0
Authors
Chen L. [1 ]
Gao Z. [2 ]
Song X. [1 ]
Wang Y. [2 ]
Nie L. [1 ]
Institutions
[1] School of Computer Science and Technology, Shandong University, Qingdao
[2] Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan
Source
Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics | 2022 / Vol. 48 / No. 04
Fund
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; Feature fusion; Feature representation; Matching algorithm; Pedestrian attribute information;
DOI
10.13700/j.bh.1001-5965.2020.0614
CLC number
G252.7 [Document Retrieval]; G354 [Information Retrieval];
Subject classification number
Abstract
Investigation of cross-modal retrieval shows that attribute information can enhance the semantic representation of extracted features, yet existing natural-language-based cross-modal pedestrian Re-ID algorithms do not adequately exploit the attributes of pedestrian images and texts. To address this, a novel cross-modal pedestrian Re-ID algorithm based on dual attribute information is proposed. Specifically, the attribute information of pedestrian images and of pedestrian text descriptions is explored fully and simultaneously, and a dual attribute space is built to improve the distinguishability and semantic expressiveness of the extracted image and text features. Extensive experiments on the public cross-modal pedestrian Re-ID dataset CUHK-PEDES demonstrate that the proposed algorithm is comparable with the state-of-the-art algorithm CMAAM (Top-1 56.68%): its Top-1 retrieval accuracy reaches 56.42%, while Top-5 and Top-10 improve by 0.45% and 0.29%, respectively. Moreover, when class information is available in the gallery image pool and is used to extract attribute features, the retrieval accuracy of cross-modal pedestrian images improves significantly, with Top-1 reaching 64.88%. An ablation study further confirms the importance of the text and image attributes used by the proposed algorithm and the effectiveness of the dual attribute space. © 2022, Editorial Board of JBUAA. All rights reserved.
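The retrieval setting the abstract describes, fusing a global feature with an attribute feature for each modality and ranking gallery images by similarity to a text query in a shared space, can be illustrated with a minimal sketch. This is not the paper's actual model: the encoders, the weighted-sum fusion, the 8-dimensional toy features, and the `alpha` weight are all hypothetical stand-ins, and cosine similarity over L2-normalized embeddings is assumed as the matching score.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1):
    """Scale feature vectors to unit L2 norm so dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def fuse(global_feat, attr_feat, alpha=0.5):
    """Hypothetical fusion: weighted sum of global and attribute features, then normalize."""
    return l2_normalize(alpha * global_feat + (1 - alpha) * attr_feat)

# Toy gallery of 5 pedestrian images, each with a global feature and an
# attribute feature (stand-ins for the outputs of learned encoders).
img_global = rng.normal(size=(5, 8))
img_attr = rng.normal(size=(5, 8))
gallery = fuse(img_global, img_attr)

# Toy text query: a noisy copy of gallery item 2's features, as if the text
# description and its predicted attributes match that pedestrian.
txt_global = img_global[2] + 0.1 * rng.normal(size=8)
txt_attr = img_attr[2] + 0.1 * rng.normal(size=8)
query = fuse(txt_global, txt_attr)

# Rank gallery images by cosine similarity to the query (Top-k retrieval).
scores = gallery @ query
ranking = np.argsort(-scores)
print(ranking[0])
```

Because the query is constructed as a perturbed copy of gallery item 2, the Top-1 result recovers that item; in the real algorithm the image and text embeddings are instead aligned by training in the dual attribute space.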
Pages: 647-656
Page count: 9
Related papers
33 records
  • [1] ZHENG L, YANG Y, HAUPTMANN A G., Person re-identification: Past, present and future
  • [2] LUO H, JIANG W, FAN X, et al., A survey on deep learning based person re-identification, Acta Automatica Sinica, 45, 11, pp. 2032-2049, (2019)
  • [3] YE M, SHEN J, LIN G, et al., Deep learning for person re-identification: A survey and outlook
  • [4] LI S, XIAO T, LI H, et al., Person search with natural language description, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1970-1979, (2017)
  • [5] JI G, LI S J, PANG Y., Fusion-attention network for person search with free-form natural language, Pattern Recognition Letters, 116, pp. 205-211, (2018)
  • [6] CHEN T, XU C, LUO J., Improving text-based person search by spatial matching and adaptive threshold, Proceedings of the IEEE Winter Conference on Applications of Computer Vision, pp. 1879-1887, (2018)
  • [7] CHEN D, LI H, LIU X, et al., Improving deep visual representation for person re-identification by global and local image-language association, Proceedings of the European Conference on Computer Vision, pp. 56-73, (2018)
  • [8] WANG Y, BO C, WANG D, et al., Language person search with mutually connected classification loss, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2057-2061, (2019)
  • [9] ZHANG Y, LU H., Deep cross-modal projection learning for image-text matching, Proceedings of the European Conference on Computer Vision, pp. 707-723, (2018)
  • [10] AGGARWAL S, BABU R V, CHAKRABORTY A., Text-based person search via attribute-aided matching, Proceedings of the IEEE Winter Conference on Applications of Computer Vision, pp. 2617-2625, (2020)