Remote Sensing Image Scene Classification with Self-Supervised Learning Based on Partially Unlabeled Datasets

Cited by: 12
Authors
Chen, Xiliang [1]
Zhu, Guobin [1]
Liu, Mingqing [1]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
self-supervised learning; vision transformer; random mask; remote sensing image scene classification; unlabeled datasets; CONVOLUTIONAL NEURAL-NETWORKS; OBJECT DETECTION; SCALE; REPRESENTATION; ATTENTION;
DOI
10.3390/rs14225838
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830;
Abstract
In recent years, supervised deep learning has performed well in remote sensing image scene classification thanks to its powerful feature learning ability. However, it requires large-scale, high-quality, manually labeled datasets, and such annotated samples are costly to obtain. Self-supervised learning can alleviate this problem by learning an image's feature representation from unlabeled data and then transferring it to a downstream task. In this study, we use an encoder-decoder structure to build a self-supervised learning architecture. In the encoding stage, an image mask randomly discards some of the image patches, and the image's feature representation is learned from the remaining patches. In the decoding stage, a lightweight decoder recovers the pixels of the original image patches from the features learned in the encoding stage. We constructed a large-scale unlabeled training set from several public scene classification datasets and Gaofen-2 satellite data to train the self-supervised learning model. In the downstream task, we use the encoder structure, with the masked image patches removed, as the backbone network for scene classification. We then fine-tune the encoder weights pre-trained by self-supervised learning on two open datasets with complex scene categories, NWPU-RESISC45 and AID. Compared with other mainstream supervised and self-supervised learning methods, the proposed method outperforms most state-of-the-art methods in remote sensing image scene classification.
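The random patch masking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `patchify` and `random_mask`, the 8 x 8 toy image, the patch size of 4, and the mask ratio of 0.75 are all assumptions chosen for illustration, and the abstract does not state the ratio the paper uses. The sketch only shows how an image might be split into patches and how a random subset might be kept as encoder input while the rest are left for a decoder to reconstruct.

```python
import random


def patchify(image, patch):
    """Split an H x W image (a list of rows) into non-overlapping
    patch x patch tiles, returned in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


def random_mask(num_patches, mask_ratio, rng):
    """Randomly choose which patches to discard.

    Returns (kept, masked): two sorted lists of patch indices.
    mask_ratio is a hypothetical parameter, not a value from the paper.
    """
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    rng.shuffle(order)
    masked = sorted(order[:num_masked])
    kept = sorted(order[num_masked:])
    return kept, masked


# Toy example: an 8 x 8 single-channel "image" whose pixel values are 0..63.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
tiles = patchify(image, patch=4)
kept, masked = random_mask(len(tiles), mask_ratio=0.75, rng=random.Random(0))

# Only the kept patches would be passed to the encoder; the masked ones
# are the reconstruction targets for the decoder.
encoder_input = [tiles[i] for i in kept]
print("kept:", kept, "masked:", masked)
```

In this sketch the encoder would see only `encoder_input`, which is what keeps pre-training cheap when most patches are discarded. For fine-tuning, no masking step is applied and all patches are passed to the encoder.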
Pages: 23