Visual-Semantic Alignment for Few-shot Remote Sensing Scene Classification

Cited by: 0
Authors
Li, Haojun [1]
Li, Linjia [1]
Luo, Wei [1]
Affiliations
[1] South China Agr Univ, Pazhou Lab, Guangzhou, Peoples R China
Source
2024 16TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, ICMLC 2024 | 2024
Keywords
Remote sensing scene classification; Few-shot learning; Self-supervised learning; CONVOLUTIONAL NEURAL-NETWORK
DOI
10.1145/3651671.3651680
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a few-shot learning approach that aligns visual and semantic features in an embedding space to alleviate the shortage of training (or reference) data in remote sensing scene classification (RSSC). Specifically, self-supervised learning is first employed to improve the expressiveness of the learned features, which enhances their generalizability. Meanwhile, we align each image feature with its corresponding class-semantic feature, obtained by feeding the class name to a language model such as BERT, to increase the image feature's discriminability. By systematically integrating self-supervised learning and visual-semantic alignment with the backbone network, our approach obtains image features with good generalizability and discriminability. Experiments on the UCMerced LandUse, NWPU-RESISC45, and AID benchmarks validate the feasibility of our approach and verify its improved few-shot classification performance in RSSC.
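The visual-semantic alignment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss, so the cosine-similarity cross-entropy, the temperature value, the loss weights, and the function names `alignment_loss` and `total_loss` are all assumptions made for illustration. Random vectors stand in for backbone image features and for BERT class-name embeddings, which are assumed to be already projected to the same dimension.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def alignment_loss(image_feats, class_embeds, labels, temperature=0.1):
    """Hypothetical alignment loss (an assumption, not the paper's formula).

    Cross-entropy over cosine similarities between each image feature (N, D)
    and every class-semantic embedding (C, D); the target is the image's class.
    """
    img = l2_normalize(image_feats)
    cls = l2_normalize(class_embeds)
    logits = img @ cls.T / temperature                      # shape (N, C)
    logits = logits - logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def total_loss(cls_loss, ssl_loss, align_loss, w_ssl=1.0, w_align=1.0):
    """Assumed weighted sum of classification, self-supervised, and alignment terms."""
    return cls_loss + w_ssl * ssl_loss + w_align * align_loss


# Stand-in data: 5 classes, 16-dimensional embeddings, one image per class.
rng = np.random.default_rng(0)
class_embeds = rng.normal(size=(5, 16))    # placeholder for class-name embeddings
labels = np.arange(5)
image_feats = class_embeds[labels] + 0.1 * rng.normal(size=(5, 16))

print("alignment loss:", alignment_loss(image_feats, class_embeds, labels))
```

Under these assumptions, the loss falls as each image feature moves closer to its own class-name embedding and away from the others, which is one way to encourage the discriminability the abstract refers to.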
Pages: 411-417
Number of pages: 7