DENSE CONTRASTIVE LEARNING BASED OBJECT DETECTION FOR REMOTE SENSING IMAGES

Cited by: 1
Authors
Liu, Shuo [1]
Zou, Huanxin [1]
Li, Meilin [1]
Cao, Xu [1]
He, Shitian [1]
Wei, Juan [1]
Sun, Li [1]
Zhang, Yuqing [1]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Sci & Technol, Changsha 410073, Peoples R China
Source
IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM | 2023
Funding
National Natural Science Foundation of China
Keywords
self-supervised learning; object detection; contrastive learning; patch-level feature; dense visual representation
DOI
10.1109/IGARSS52108.2023.10282252
Chinese Library Classification
P [Astronomy, Earth Sciences]
Discipline classification code
07
Abstract
Supervised object detectors suffer from the high cost and difficulty of labeling datasets. Self-supervised learning methods require no manual annotations. However, the misalignment between pretext tasks designed for image classification and the downstream detection task degrades detection performance. Therefore, this paper proposes a self-supervised dense contrastive learning method to improve object detection performance in remote sensing images. Specifically, first, a Swin Transformer replaces the commonly used CNN backbone to extract features from multiple augmented views. Second, global and local features are extracted using parallel global and dense projector heads, respectively. Third, a predictor head is added to increase the nonlinear transformations in the network. Extensive experiments on the NWPU VHR-10 dataset show that the proposed method outperforms two strong representative baselines, MoCoV2 and DenseCL.
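The abstract does not give the loss formulation, so the following is only a minimal sketch of how a global (image-level) and a dense (patch-level) contrastive term might be combined, in the general style of DenseCL. It operates on features that are assumed to be already projected; the Swin Transformer backbone, the projector heads, and the predictor head are not modeled. All function names, the temperature of 0.2, the weight of 0.5, and the nearest-patch matching rule are illustrative assumptions, not details taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive, negatives, temperature=0.2):
    """InfoNCE loss: -log softmax of the positive pair among all pairs."""
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


def global_feature(patches):
    """Average-pool patch features into one image-level vector."""
    n = len(patches)
    return [sum(p[i] for p in patches) / n for i in range(len(patches[0]))]


def dense_loss(patches_a, patches_b, negatives, temperature=0.2):
    """Patch-level term: each patch of view A is paired with its most
    similar patch in view B (assumed matching rule), then scored with InfoNCE."""
    total = 0.0
    for p in patches_a:
        q = l2_normalize(p)
        match = max(patches_b, key=lambda c: dot(q, l2_normalize(c)))
        total += info_nce(p, match, negatives, temperature)
    return total / len(patches_a)


def combined_loss(patches_a, patches_b, negatives, weight=0.5, temperature=0.2):
    """Weighted sum of the global and dense terms (weight is an assumption)."""
    g = info_nce(global_feature(patches_a), global_feature(patches_b),
                 negatives, temperature)
    d = dense_loss(patches_a, patches_b, negatives, temperature)
    return (1 - weight) * g + weight * d


if __name__ == "__main__":
    view_a = [[1.0, 0.0], [0.0, 1.0]]   # toy patch features, view 1
    view_b = [[0.9, 0.1], [0.1, 0.9]]   # toy patch features, view 2
    others = [[-1.0, 0.0], [0.0, -1.0]]  # toy negatives from other images
    print(combined_loss(view_a, view_b, others))
```

In this toy setup the two views are nearly aligned, so the combined loss is small; swapping the second view with the negatives makes it much larger, which is the behavior a contrastive objective is meant to reward.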
Pages: 6458-6461
Page count: 4