Few-Shot Object Detection with Fully Cross-Transformer

Citations: 134
Authors
Han, Guangxing [1]
Ma, Jiawei [1]
Huang, Shiyuan [1]
Chen, Long [1]
Chang, Shih-Fu [1]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
DOI
10.1109/CVPR52688.2022.00525
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Few-shot object detection (FSOD), which aims to detect novel objects using very few training examples, has recently attracted great research interest in the community. Metric-learning based methods have been demonstrated to be effective for this task: they use a two-branch siamese network and calculate the similarity between image regions and few-shot examples for detection. However, in previous works, the interaction between the two branches is restricted to the detection head, leaving the remaining hundreds of layers for separate feature extraction. Inspired by recent work on vision transformers and vision-language transformers, we propose a novel Fully Cross-Transformer based model (FCT) for FSOD that incorporates cross-transformers into both the feature backbone and the detection head. An asymmetric-batched cross-attention is proposed to aggregate the key information from the two branches, which have different batch sizes. Our model can improve few-shot similarity learning between the two branches by introducing multi-level interactions. Comprehensive experiments on both the PASCAL VOC and MSCOCO FSOD benchmarks demonstrate the effectiveness of our model.
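The abstract names an asymmetric-batched cross-attention that aggregates key information across two branches with different batch sizes, but gives no formulas. The following is a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes a query branch with batch size 1 and a support branch with batch size B; the query branch attends over its own tokens plus the batch-averaged support tokens, and each support example attends over its own tokens plus the repeated query tokens. The function names, the averaging and repeating steps, and the omission of learned projections and multiple heads are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(q, k, v):
    # Scaled dot-product attention.
    # q: (B, Nq, d); k and v: (B, Nk, d); returns (B, Nq, d).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def asymmetric_cross_attention(query_tokens, support_tokens):
    # Hypothetical sketch (assumption, not taken from the source).
    # query_tokens:   (1, Nq, d)  one query image
    # support_tokens: (B, Ns, d)  B few-shot support examples
    b = support_tokens.shape[0]

    # Query branch: collapse the support batch by averaging so that its
    # batch size matches the query branch, then attend over both sets.
    support_mean = support_tokens.mean(axis=0, keepdims=True)    # (1, Ns, d)
    kv_query = np.concatenate([query_tokens, support_mean], axis=1)
    query_out = attend(query_tokens, kv_query, kv_query)         # (1, Nq, d)

    # Support branch: repeat the single query along the batch axis so that
    # every support example can attend to the query tokens as well.
    query_rep = np.repeat(query_tokens, b, axis=0)               # (B, Nq, d)
    kv_support = np.concatenate([support_tokens, query_rep], axis=1)
    support_out = attend(support_tokens, kv_support, kv_support) # (B, Ns, d)

    return query_out, support_out
```

In this sketch each branch keeps its own batch size and token count on output, so such a layer could in principle be stacked at several depths, which is in the spirit of the multi-level interaction the abstract describes.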
Pages: 5311-5320
Page count: 10