FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding

被引:343
作者
Sun, Bo [1 ]
Li, Banghuai [2 ]
Cai, Shengcai [2 ]
Yuan, Ye [2 ]
Zhang, Chi [2 ]
机构
[1] Univ Southern Calif, Los Angeles, CA 90089 USA
[2] MEGVII Technol, Beijing, Peoples R China
来源
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021年
基金
国家重点研发计划;
关键词
D O I
10.1109/CVPR46437.2021.00727
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Emerging interests have been brought to recognize previously unseen objects given very few training examples, known as few-shot object detection (FSOD). Recent researches demonstrate that good feature embedding is the key to reach favorable few-shot learning performance. We observe object proposals with different Intersection-of-Union (IoU) scores are analogous to the intra-image augmentation used in contrastive visual representation learning. And we exploit this analogy and incorporate supervised contrastive learning to achieve more robust objects representations in FSOD. We present Few-Shot object detection via Contrastive proposals Encoding (FSCE), a simple yet effective approach to learning contrastive-aware object proposal encodings that facilitate the classification of detected objects. We notice the degradation of average precision (AP) for rare objects mainly comes from misclassifying novel instances as confusable classes. And we ease the misclassification issues by promoting instance level intraclass compactness and inter-class variance via our contrastive proposal encoding loss (CPE loss). Our design outperforms current state-of-the-art works in any shot and all data splits, with up to +8.8% on standard benchmark PASCAL VOC and +2.7% on challenging COCO bench-mark. Code is available at: https://github.com/MegviiDetection/FSCE.
引用
收藏
页码:7348 / 7358
页数:11
相关论文
共 56 条
[1]  
[Anonymous], 2010, International journal of computer vision, DOI DOI 10.1007/s11263-009-0275-4
[2]  
[Anonymous], 2020, ADV NEURAL INFORM PR
[3]   Soft-NMS - Improving Object Detection With One Line of Code [J].
Bodla, Navaneeth ;
Singh, Bharat ;
Chellappa, Rama ;
Davis, Larry S. .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :5562-5570
[4]   Cascade R-CNN: Delving into High Quality Object Detection [J].
Cai, Zhaowei ;
Vasconcelos, Nuno .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :6154-6162
[5]  
Chen Hao, 2018, AAAI
[6]  
Chen T, 2020, PR MACH LEARN RES, V119
[7]  
Chen W.Y., 2019, The Fractional Laplacian
[8]   Knowledge-guided Deep Reinforcement Learning for Interactive Recommendation [J].
Chen, Xiaocong ;
Huang, Chaoran ;
Yao, Lina ;
Wang, Xianzhi ;
Liu, Wei ;
Zhang, Wenjie .
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
[9]   Learning a similarity metric discriminatively, with application to face verification [J].
Chopra, S ;
Hadsell, R ;
LeCun, Y .
2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005, :539-546
[10]  
Chuang C.-Y., 2020, ADV NEURAL INFORM PR