Few-Shot Object Detection via Dual-Domain Feature Fusion and Patch-Level Attention

Cited by: 0
Authors
Ren, Guangli [1, 2]
Liu, Jierui [1, 2]
Wang, Mengyao [1, 2]
Guan, Peiyu [1, 2]
Cao, Zhiqiang [1, 2]
Yu, Junzhi [3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Peking Univ, Dept Adv Mfg & Robot, Beijing 100871, Peoples R China
Source
TSINGHUA SCIENCE AND TECHNOLOGY | 2025 / Vol. 30 / Issue 03
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Training; Visualization; Adaptation models; Head; Accuracy; Diversity reception; Object detection; Feature extraction; few-shot object detection; dual-domain feature fusion; patch-level attention; CHALLENGES;
DOI
10.26599/TST.2024.9010031
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Few-shot object detection has attracted much attention for its ability to detect novel-class objects from limited annotated data. Transfer learning-based solutions have become popular because they combine simple training with good accuracy; however, enriching feature diversity during training remains challenging, and fine-grained features are also insufficient for novel-class detection. To address these problems, this paper proposes a novel few-shot object detection method based on dual-domain feature fusion and patch-level attention. On top of the original base domain, an elementary domain with more category-agnostic features is superposed to construct a two-stream backbone, which helps enrich feature diversity. To better integrate the various features, a dual-domain feature fusion is designed, in which feature pairs of the same size are complementarily fused to extract more discriminative features. Moreover, a patch-wise feature refinement, termed patch-level attention, is presented to mine internal relations among patches, which enhances adaptability to novel classes. In addition, a weighted classification loss is introduced to assist the fine-tuning of the classifier by incorporating extra features from the FPN of the base training model. In this way, the few-shot detection quality for novel-class objects is improved. Experiments on the PASCAL VOC and MS COCO datasets verify the effectiveness of the method.
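The abstract describes patch-level attention only as a patch-wise refinement that mines relations among patches. The sketch below is one generic way such a step could look, not the paper's formulation: it assumes non-overlapping square patches, mean pooling into one token per patch, scaled dot-product self-attention with no learned projections, and a residual add. The names `split_into_patches` and `patch_attention` and the toy feature map are hypothetical.

```python
import math

def split_into_patches(fmap, patch):
    """Mean-pool an H x W x C feature map (nested lists) into one C-dim token per patch.

    Assumption: non-overlapping patch x patch tiles; the paper may tile differently.
    """
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    assert h % patch == 0 and w % patch == 0, "feature map must tile evenly"
    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            cells = [fmap[top + i][left + j]
                     for i in range(patch) for j in range(patch)]
            tokens.append([sum(cell[k] for cell in cells) / len(cells)
                           for k in range(c)])
    return tokens

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def patch_attention(tokens):
    """Refine each patch token with attention over all patch tokens.

    Assumption: queries, keys, and values are the tokens themselves
    (no learned weights), and the result is added back as a residual.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    refined, all_weights = [], []
    for query in tokens:
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        mixed = [sum(w * value[d] for w, value in zip(weights, tokens))
                 for d in range(dim)]
        refined.append([x + y for x, y in zip(query, mixed)])
        all_weights.append(weights)
    return refined, all_weights

# Toy 4 x 4 feature map with 2 channels (row index, column index).
fmap = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
tokens = split_into_patches(fmap, patch=2)
refined, weights = patch_attention(tokens)
```

Each row of `weights` is a distribution over patches, so it shows how strongly one patch draws on every other patch when it is refined. A trained detector would presumably use learned query, key, and value projections in place of the identity mapping assumed here.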
Pages: 1237-1250
Number of pages: 14