Attention in Attention Networks for Person Retrieval

Cited by: 30

Authors:

Fang, Pengfei [1,2]

Zhou, Jieming [1,2]

Roy, Soumava Kumar [1,2]

Ji, Pan [3]

Petersson, Lars [1,2]

Harandi, Mehrtash [4,5]

Affiliations:

[1] Australian Natl Univ, Res Sch Elect Energy & Mat Engn, Canberra, ACT 2601, Australia

[2] Black Mt Labs, Data61 CSIRO, Canberra, ACT 2601, Australia

[3] OPPO US Res Ctr, Palo Alto, CA 94303 USA

[4] Monash Univ, Dept Elect & Comp Syst Engn, Clayton, Vic 3800, Australia

[5] Data61 CSIRO, Melbourne, Vic 3800, Australia

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022, Vol. 44, No. 09

Keywords:

Kernel; Task analysis; Visualization; Estimation; Benchmark testing; Training; Feature extraction; Attention in attention mechanism; person retrieval; pedestrian representation; convolutional neural network; second-order polynomial kernel; Gaussian kernel; REIDENTIFICATION

DOI:

10.1109/TPAMI.2021.3073512

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper generalizes the Attention in Attention (AiA) mechanism of P. Fang et al. (2019) by employing explicit mapping in reproducing kernel Hilbert spaces to generate attention values of the input feature map. The AiA mechanism builds inter-dependencies among local and global features through the interaction of inner and outer attention modules. Besides a vanilla AiA module, termed linear attention with AiA, two non-linear counterparts, namely second-order polynomial attention and Gaussian attention, are also proposed to utilize the non-linear properties of the input features explicitly, via the second-order polynomial kernel and Gaussian kernel approximation. The deep convolutional neural network equipped with the proposed AiA blocks is referred to as the Attention in Attention Network (AiA-Net). The AiA-Net learns to extract a discriminative pedestrian representation, which combines complementary person appearance and corresponding part features. Extensive ablation studies verify the effectiveness of the AiA mechanism and the use of non-linear features hidden in the feature map for attention design. Furthermore, the approach outperforms the current state of the art by a considerable margin across a number of benchmarks. In addition, state-of-the-art performance is also achieved in the video person retrieval task with the assistance of the proposed AiA blocks.
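The abstract refers to explicit mappings for a second-order polynomial kernel and to a Gaussian kernel approximation. As a rough illustration only, the sketch below shows one standard way to build such maps: the exact feature map made of all pairwise products for k(x, y) = (x . y)^2, and random Fourier features for the Gaussian kernel. This is an assumption about the general technique, not the construction used in the paper; the function names (`poly2_features`, `make_rff`), the bandwidth `sigma`, and the feature count are all hypothetical.

```python
import math
import random


def dot(x, y):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def poly2_features(x):
    """Explicit map for k(x, y) = (x . y)^2: all pairwise products x_i * x_j."""
    return [a * b for a in x for b in x]


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def make_rff(dim, num_features, sigma, seed=0):
    """Return a random Fourier feature map approximating the Gaussian kernel.

    Weights are drawn from N(0, 1 / sigma^2) and phases from U(0, 2 * pi),
    so that dot(features(x), features(y)) is close to gaussian_kernel(x, y, sigma).
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(dot(w, x) + b)
                for w, b in zip(weights, offsets)]

    return features


if __name__ == "__main__":
    x, y = [0.5, -1.0, 2.0], [1.0, 0.5, -0.5]
    # The polynomial map reproduces the kernel exactly (up to rounding).
    print(dot(poly2_features(x), poly2_features(y)), dot(x, y) ** 2)
    # The random Fourier map only approximates the Gaussian kernel.
    phi = make_rff(dim=3, num_features=5000, sigma=2.0)
    print(dot(phi(x), phi(y)), gaussian_kernel(x, y, 2.0))
```

The polynomial map is exact but grows quadratically with the input dimension, whereas the random Fourier map has a fixed size and an error that shrinks as the number of random features grows.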


Pages: 4626-4641

Page count: 16

References: 96 in total

[1] [Anonymous], PROC 20 ACM SIGKDD I

[2] Scalable Person Re-identification on Supervised Smoothed Manifold [J]. Bai, Song; Bai, Xiang; Tian, Qi. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 3356-3365

[3] Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields [J]. Cao, Zhe; Simon, Tomas; Wei, Shih-En; Sheikh, Yaser. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 1302-1310

[4] Multi-Level Factorisation Net for Person Re-Identification [J]. Chang, Xiaobin; Hospedales, Timothy M.; Xiang, Tao. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 2109-2118

[5] ABD-Net: Attentive but Diverse Person Re-Identification [J]. Chen, Tianlong; Ding, Shaojin; Xie, Jingyi; Yuan, Ye; Chen, Wuyang; Yang, Yang; Ren, Zhou; Wang, Zhangyang. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 8350-8360

[6] Beyond triplet loss: a deep quadruplet network for person re-identification [J]. Chen, Weihua; Chen, Xiaotang; Zhang, Jianguo; Huang, Kaiqi. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 1320-1329

[7] Person Re-Identification by Camera Correlation Aware Feature Augmentation [J]. Chen, Ying-Cong; Zhu, Xiatian; Zheng, Wei-Shi; Lai, Jian-Huang. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40(02): 392-408

[8] Kernel Pooling for Convolutional Neural Networks [J]. Cui, Yin; Zhou, Feng; Wang, Jiang; Liu, Xiao; Lin, Yuanqing; Belongie, Serge. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 3049-3058

[9] Bilinear Attention Networks for Person Retrieval [J]. Fang, Pengfei; Zhou, Jieming; Roy, Soumava Kumar; Petersson, Lars; Harandi, Mehrtash. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 8029-8038

[10] Object Detection with Discriminatively Trained Part-Based Models [J]. Felzenszwalb, Pedro F.; Girshick, Ross B.; McAllester, David; Ramanan, Deva. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32(09): 1627-1645
