Dual-Awareness Attention for Few-Shot Object Detection

Cited by: 66
Authors
Chen, Tung-I [1]
Liu, Yueh-Cheng [1]
Su, Hung-Ting [1]
Chang, Yu-Cheng [1]
Lin, Yu-Hsiang [1]
Yeh, Jia-Fong [1]
Chen, Wen-Chin [1]
Hsu, Winston H. [1,2]
Affiliations
[1] National Taiwan University, Taipei 106, Taiwan
[2] Mobile Drive Technology, Taipei 236, Taiwan
Keywords
Feature extraction; Object detection; Detectors; Correlation; Task analysis; Power capacitors; Adaptation models; Deep learning; object detection; visual attention; few-shot object detection; NETWORKS
DOI
10.1109/TMM.2021.3125195
CLC number
TP [automation technology, computer technology]
Subject classification code
0812
Abstract
While recent progress has significantly boosted few-shot classification (FSC) performance, few-shot object detection (FSOD) remains challenging for modern learning systems. Existing FSOD systems follow FSC approaches, ignoring critical issues such as spatial variability and uncertain representations, and consequently suffer from low performance. Observing this, we propose a novel Dual-Awareness Attention (DAnA) mechanism that enables networks to adaptively interpret the given support images. DAnA transforms support images into query-position-aware (QPA) features, guiding the detection network precisely by assigning customized support information to each local region of the query. In addition, the proposed DAnA component is flexible and can be adapted to multiple existing object detection frameworks. By adopting DAnA, conventional object detection networks such as Faster R-CNN and RetinaNet, which are not explicitly designed for few-shot learning, reach state-of-the-art performance on FSOD tasks. Compared with previous methods, our model significantly improves performance by 47% (+6.9 AP), demonstrating remarkable robustness under various evaluation settings.
Pages: 291-301
Number of pages: 11
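As a rough illustration of the query-position-aware (QPA) idea summarized in the abstract, the sketch below implements a generic scaled dot-product cross-attention from every spatial location of a query feature map to the support feature map, so that each query position receives its own aggregated support vector. This is a minimal PyTorch sketch under assumed tensor shapes; the class name QPAAttention, the residual fusion step, and all dimensions are hypothetical and do not reproduce the authors' released DAnA implementation.

# Minimal, illustrative PyTorch sketch of query-position-aware (QPA)
# cross-attention, assuming standard scaled dot-product attention.
# The class name, the residual fusion, and all shapes are hypothetical;
# this is not the authors' DAnA implementation.
import torch
import torch.nn as nn


class QPAAttention(nn.Module):
    """Attend from every query-feature location to the support features,
    producing a customized support vector for each query position."""

    def __init__(self, channels: int):
        super().__init__()
        self.q_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.k_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.v_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, query_feat: torch.Tensor, support_feat: torch.Tensor) -> torch.Tensor:
        # query_feat:   (B, C, Hq, Wq) backbone features of the query image
        # support_feat: (B, C, Hs, Ws) backbone features of a support image
        b, c, hq, wq = query_feat.shape
        q = self.q_proj(query_feat).flatten(2).transpose(1, 2)      # (B, Hq*Wq, C)
        k = self.k_proj(support_feat).flatten(2)                    # (B, C, Hs*Ws)
        v = self.v_proj(support_feat).flatten(2).transpose(1, 2)    # (B, Hs*Ws, C)
        attn = torch.softmax(torch.bmm(q, k) * self.scale, dim=-1)  # (B, Hq*Wq, Hs*Ws)
        qpa = torch.bmm(attn, v)                                    # support info per query position
        qpa = qpa.transpose(1, 2).reshape(b, c, hq, wq)             # back to a feature map
        # Fuse the position-specific support information with the query features
        # before handing them to a detection head (e.g. Faster R-CNN or RetinaNet).
        return query_feat + qpa


if __name__ == "__main__":
    query = torch.randn(2, 256, 32, 32)    # query-image feature map
    support = torch.randn(2, 256, 16, 16)  # support-image feature map
    fused = QPAAttention(256)(query, support)
    print(fused.shape)  # torch.Size([2, 256, 32, 32])

Because the output keeps the query feature map's shape, such a block could in principle be placed in front of an existing RPN or one-stage detection head, which matches the abstract's claim that the attention component is adaptable to multiple detection frameworks.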